Compression-Aware and Performance-Efficient Insertion Policies for Long-Lasting Hybrid LLCs

Escuin, Carlos; Ibáñez, Pablo; Ali Khan, Asif; Monreal, Teresa; Castrillon, Jeronimo; Viñals, Víctor
doi:10.1109/HPCA56546.2023.10070968
000166053 001__ 166053
000166053 005__ 20260119170325.0
000166053 0247_ $$2doi$$a10.1109/HPCA56546.2023.10070968
000166053 0248_ $$2sideral$$a147506
000166053 037__ $$aART-2023-147506
000166053 041__ $$aeng
000166053 100__ $$aEscuin, Carlos$$uUniversidad de Zaragoza
000166053 245__ $$aCompression-Aware and Performance-Efficient Insertion Policies for Long-Lasting Hybrid LLCs
000166053 260__ $$c2023
000166053 5203_ $$aEmerging non-volatile memory (NVM) technologies can potentially replace large SRAM memories such as the last-level cache (LLC). However, despite recent advances, NVMs suffer from higher write latency and limited write endurance. NVM-SRAM hybrid LLCs have recently been proposed to combine the best of both worlds. Several policies have been proposed to improve the performance and lifetime of hybrid LLCs by intelligently steering the incoming LLC blocks into either the SRAM or NVM part, based on the cache behavior of the LLC blocks and the SRAM/NVM device properties. However, these policies neither consider compressing the contents of the cache block nor using partially worn-out NVM cache blocks. This paper proposes new insertion policies for byte-level fault-tolerant hybrid LLCs that collaboratively optimize for lifetime and performance. Specifically, we leverage data compression to utilize partially defective NVM cache entries, thereby improving the LLC hit rate. The key to our approach is to guide the insertion policy by both the reuse properties of the block and the size resulting from its compression. A block is inserted in NVM only if it is a read-reuse block or its compressed size is lower than a threshold. It will be inserted in SRAM if the block is a write-reuse block or its compressed size is greater than the threshold. We use set-dueling to tune the compression threshold at runtime. This compression threshold provides a knob to control the NVM write rate and, together with a rule-based mechanism, allows balancing performance and lifetime. Overall, our evaluation shows that, with affordable hardware overheads, the proposed schemes can nearly reach the performance of an SRAM cache with the same associativity while improving lifetime by 17× compared to a hybrid NVM-unaware LLC. Our proposed scheme outperforms the state-of-the-art insertion policies by 9% while achieving a comparable lifetime.
The rule-based mechanism shows that by compromising, for instance, 1.1% and 1.9%...
000166053 540__ $$9info:eu-repo/semantics/closedAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000166053 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000166053 700__ $$aAli Khan, Asif
000166053 700__ $$0(orcid)0000-0002-5916-7898$$aIbáñez, Pablo$$uUniversidad de Zaragoza
000166053 700__ $$aMonreal, Teresa
000166053 700__ $$aCastrillon, Jeronimo
000166053 700__ $$0(orcid)0000-0002-5976-1352$$aViñals, Víctor$$uUniversidad de Zaragoza
000166053 7102_ $$15007$$2035$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Arquit.Tecnología Comput.
000166053 773__ $$g(2023), [14 pp.]$$tProceedings - International Symposium on High-Performance Computer Architecture$$x1530-0897
000166053 8564_ $$s641096$$uhttps://zaguan.unizar.es/record/166053/files/texto_completo.pdf$$yPublished version
000166053 8564_ $$s3684750$$uhttps://zaguan.unizar.es/record/166053/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000166053 909CO $$ooai:zaguan.unizar.es:166053$$particulos$$pdriver
000166053 951__ $$a2026-01-19-14:39:03
000166053 980__ $$aARTICLE
Universidad de Zaragoza Repository
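
Illustrative note on the abstract: the insertion rule it describes (steer a block to NVM or SRAM by its reuse class and compressed size) can be sketched roughly as below. This is only a minimal sketch and not the authors' implementation. The names `Reuse` and `choose_partition` are hypothetical. The abstract does not say how the two criteria combine when they disagree, so the sketch assumes the reuse class takes priority and the compressed size decides only for blocks with no known reuse. It also assumes a block whose size equals the threshold goes to SRAM. The runtime tuning of the threshold by set-dueling is not modelled.

```python
from enum import Enum


class Reuse(Enum):
    """Hypothetical reuse classes for an incoming LLC block."""
    NONE = 0   # no reuse information available
    READ = 1   # read-reuse block
    WRITE = 2  # write-reuse block


def choose_partition(reuse: Reuse, compressed_size: int, threshold: int) -> str:
    """Return "NVM" or "SRAM" for an incoming block.

    Assumption (not stated in the abstract): the reuse class is checked
    first, and the compressed size is only used when reuse is unknown.
    """
    if reuse is Reuse.WRITE:
        return "SRAM"  # keep write-heavy blocks away from NVM
    if reuse is Reuse.READ:
        return "NVM"   # read-reuse blocks cause few NVM writes
    # No reuse information: small compressed blocks go to NVM.
    # Assumption: a size equal to the threshold is sent to SRAM.
    return "NVM" if compressed_size < threshold else "SRAM"
```

Under these assumptions, lowering `threshold` sends fewer blocks to NVM, which is the sense in which the abstract calls the threshold a knob on the NVM write rate.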