000151511 001__ 151511
000151511 005__ 20250411150809.0
000151511 0247_ $$2doi$$a10.1109/TC.2025.3533086
000151511 0248_ $$2sideral$$a143118
000151511 037__ $$aART-2025-143118
000151511 041__ $$aeng
000151511 100__ $$aNavarro-Torres, Agustín
000151511 245__ $$aA complexity-effective local delta prefetcher
000151511 260__ $$c2025
000151511 5060_ $$aAccess copy available to the general public$$fUnrestricted
000151511 5203_ $$aData prefetching is crucial for performance in modern processors by effectively masking long-latency memory accesses. Over the past decades, numerous data prefetching mechanisms have been proposed, which have continuously reduced the access latency to the memory hierarchy. Several state-of-the-art prefetchers, namely Instruction Pointer Classifier Prefetcher (IPCP) and Berti, target the first-level data cache, and thus, they are able to completely hide the miss latency for timely prefetched cache lines. Berti exploits timely local deltas to achieve high accuracy and performance. This paper extends Berti with a larger evaluation and with extra optimizations on top of the previous conference paper. The result is a complexity-effective version of Berti that outperforms it for a large amount of workloads and simplifies its control logic. The key for those advancements is a simple mechanism for learning timely deltas without the need to track the fetch latency of each cache miss. Our experiments conducted with a wide range of workloads (CVP traces by Qualcomm, SPEC CPU2017, and GAP) show performance improvements by 4.0% over a mainstream stride prefetcher, and by a non-negligible 1.4% over the previously published version of Berti requiring similar storage.
000151511 536__ $$9info:eu-repo/grantAgreement/ES/AEI-FEDER/RTI2018-098156-B-C53$$9info:eu-repo/grantAgreement/ES/AEI/PID2022-136315OB-I00$$9info:eu-repo/grantAgreement/ES/AEI/TED2021-130233B-C33$$9info:eu-repo/grantAgreement/ES/DGA/T58_23R$$9info:eu-repo/grantAgreement/EC/H2020/101158023/EU/Energy-Efficient Highly Accurate Data Prefetching/Berti-Chip$$9This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No H2020 101158023-Berti-Chip$$9info:eu-repo/grantAgreement/EC/H2020/819134/EU/Extending Coherence for Hardware-Driven Optimizations in Multicore Architectures/ECHO$$9This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No H2020 819134-ECHO$$9info:eu-repo/grantAgreement/ES/MICINN/PID2022-136454NB-C22
000151511 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000151511 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000151511 700__ $$aPanda, Biswabandan
000151511 700__ $$0(orcid)0000-0003-4164-5078$$aAlastruey-Benedé, Jesús$$uUniversidad de Zaragoza
000151511 700__ $$0(orcid)0000-0002-5916-7898$$aIbáñez, Pablo$$uUniversidad de Zaragoza
000151511 700__ $$0(orcid)0000-0002-5976-1352$$aViñals-Yufera, Víctor$$uUniversidad de Zaragoza
000151511 700__ $$aRos, Alberto
000151511 7102_ $$15007$$2035$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Arquit.Tecnología Comput.
000151511 773__ $$g74, 5 (2025), 1482-1494$$pIEEE trans. comput.$$tIEEE TRANSACTIONS ON COMPUTERS$$x0018-9340
000151511 8564_ $$s2749827$$uhttps://zaguan.unizar.es/record/151511/files/texto_completo.pdf$$yVersión publicada
000151511 8564_ $$s3311056$$uhttps://zaguan.unizar.es/record/151511/files/texto_completo.jpg?subformat=icon$$xicon$$yVersión publicada
000151511 909CO $$ooai:zaguan.unizar.es:151511$$particulos$$pdriver
000151511 951__ $$a2025-04-11-15:05:16
000151511 980__ $$aARTICLE