Improving GPU cache hierarchy performance with a fetch and replacement cache

Candel, Francisco; Valero, Alejandro; Petit, Salvador; Sahuquillo, Julio

doi:10.1007/978-3-319-96983-1_17

Candel, Francisco ; Petit, Salvador ; Valero, Alejandro (Universidad de Zaragoza) ; Sahuquillo, Julio

Abstract: In the last few years, GPGPU computing has become one of the most popular computing paradigms in high-performance computers due to its excellent performance-to-power ratio. The memory requirements of GPGPU applications differ widely from those of their CPU counterparts. The number of memory accesses is several orders of magnitude higher in GPU applications than in CPU applications, and GPU applications present disparate access patterns. As a result, large and highly associative Last-Level Caches (LLCs) bring much lower performance gains in GPUs than in CPUs. This paper presents a novel approach to managing LLC misses that efficiently improves LLC hit ratio, memory-level parallelism, and miss latencies in GPU systems. The proposed approach leverages a small additional Fetch and Replacement Cache (FRC) that stores control and coherence information of incoming blocks until they are fetched from main memory. Fetched blocks are then swapped with the victim blocks to be replaced in the LLC, and the victim blocks are subsequently evicted from the FRC. This management approach improves performance for three main reasons: (i) the lifetime of blocks being replaced is increased, (ii) the main memory path is unclogged during long bursts of LLC misses, and (iii) the average L2 miss latency is reduced. Experimental results show that our proposal increases performance (OPC) by over 25% in most of the studied applications, with improvements of up to 150% in some applications.
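
The miss-handling flow described in the abstract can be illustrated with a toy model. This is only a minimal sketch under strong assumptions, not the authors' implementation: the LLC is modeled as a fully associative LRU cache, timing and coherence states are ignored, and every name (`ToyLLCWithFRC`, `access`, `fill`, `evicted`) is hypothetical. It shows one point from the abstract: because a miss only allocates an FRC entry, the eventual victim stays in the LLC and can still hit while the fetch is in flight.

```python
from collections import OrderedDict


class ToyLLCWithFRC:
    """Toy fully associative LRU LLC plus a small FRC for in-flight misses."""

    def __init__(self, llc_blocks, frc_entries):
        self.llc_blocks = llc_blocks
        self.frc_entries = frc_entries
        self.llc = OrderedDict()  # block address -> dirty flag, LRU entry first
        self.frc = {}             # pending block address -> dirty flag
        self.evicted = []         # (address, dirty) of victims, in eviction order

    def access(self, addr, write=False):
        """Return 'hit', 'pending', 'miss', or 'stall' for one access."""
        if addr in self.llc:
            self.llc.move_to_end(addr)  # refresh LRU position
            self.llc[addr] = self.llc[addr] or write
            return "hit"
        if addr in self.frc:
            self.frc[addr] = self.frc[addr] or write
            return "pending"            # merged with an in-flight miss
        if len(self.frc) >= self.frc_entries:
            return "stall"              # no free FRC entry (assumed policy)
        self.frc[addr] = write          # track the incoming block; no victim yet
        return "miss"

    def fill(self, addr):
        """Data for a pending `addr` arrived: swap it with an LLC victim."""
        dirty = self.frc.pop(addr)      # raises KeyError if addr is not pending
        if len(self.llc) >= self.llc_blocks:
            # The victim is chosen only now, at fill time.
            victim, victim_dirty = self.llc.popitem(last=False)
            # Simplification: the victim leaves through the FRC immediately.
            self.evicted.append((victim, victim_dirty))
        self.llc[addr] = dirty


cache = ToyLLCWithFRC(llc_blocks=2, frc_entries=1)
for block in ("A", "B"):
    cache.access(block)
    cache.fill(block)
print(cache.access("C"))  # miss: C is tracked in the FRC, LLC still holds A and B
print(cache.access("A"))  # hit: A was not evicted when C missed
cache.fill("C")
print(cache.evicted)      # [('B', False)]
```

In a conventional design the victim would typically be selected when the miss is issued, so block A or B would already be unavailable during the fetch of C. The sketch does not model the memory-path or latency effects listed as reasons (ii) and (iii).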
Language: English
DOI: 10.1007/978-3-319-96983-1_17
Year: 2018
Published in: Lecture Notes in Computer Science 11014 LNCS (2018), 235-248 [13 pp.]
ISSN: 0302-9743
SCImago impact factor: 0.283 - Theoretical Computer Science (Q2) - Computer Science (miscellaneous) (Q2)

Funding: info:eu-repo/grantAgreement/ES/MINECO/TIN2015-66972-C5-1-R
Funding: info:eu-repo/grantAgreement/ES/MINECO/TIN2016-76635-C2-1-R
Type and form: Article (PostPrint)
Area (Department): Computer Architecture and Technology Area (Dept. of Computer Science and Systems Engineering)

Record created 2019-11-05, last modified 2020-01-17

Universidad de Zaragoza Repository
