Abstract: Voltage scaling to values near the threshold voltage is a promising technique to hold off the many-core power wall. However, as voltage decreases, some SRAM cells are unable to operate reliably and show behavior consistent with a hard fault. Block disabling is a micro-architectural technique that allows low-voltage operation by deactivating faulty cache entries, at the expense of reducing the effective cache capacity. In the case of the last-level cache, this capacity reduction leads to an increase in off-chip memory accesses, diminishing the overall energy benefit of reducing the supply voltage. In this work, we exploit the reuse locality and the intrinsic redundancy of multi-level inclusive hierarchies to enhance the performance of block disabling with negligible cost. The proposed fault-aware last-level cache management policy maps critical blocks, those not present in private caches and with a higher probability of being reused, to active cache entries. Our evaluation shows that this fault-aware management results in up to 37.3% and 54.2% fewer misses per kilo instruction (MPKI) than block disabling for multiprogrammed and parallel workloads, respectively. This translates to performance enhancements of up to 13% and 34.6% for multiprogrammed and parallel workloads, respectively.

Language: English
DOI: 10.1016/j.jpdc.2018.10.010
Year: 2019
Published in: JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 125 (2019), 31-44
ISSN: 0743-7315
JCR impact factor: 2.296 (2019)
JCR category: COMPUTER SCIENCE, THEORY & METHODS, rank: 35 / 108 = 0.324 (2019) - Q2 - T1
SCImago impact factor: 0.525 - Artificial Intelligence (Q2) - Computer Networks and Communications (Q2) - Hardware and Architecture (Q2) - Software (Q2) - Theoretical Computer Science (Q3)
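
The placement idea in the abstract can be illustrated with a minimal Python sketch of a single last-level cache set. It is not the authors' implementation. The names `Block`, `priority`, and `place_blocks`, the single `reused` bit, and the static one-shot placement are all hypothetical. The sketch also assumes that a faulty (disabled) entry can still track the tag of a block whose data is held in a private cache, which is one plausible reading of "intrinsic redundancy" in an inclusive hierarchy and is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    tag: int
    in_private_cache: bool  # a copy also lives in some private cache
    reused: bool            # hypothetical reuse bit (block already hit in the LLC)


def priority(block):
    """Rank blocks by how much they need an active (fault-free) entry."""
    if block.in_private_cache:
        return 0  # data is redundant: a copy exists closer to the core
    return 2 if block.reused else 1  # critical blocks rank highest


def place_blocks(blocks, faulty):
    """Return {way: Block} for one set; `faulty[w]` marks disabled ways.

    Assumption: a faulty way may keep only the tag of a block whose data
    is available in a private cache. Blocks that need data and find no
    active way left are simply not kept in this static model.
    """
    active = [w for w, f in enumerate(faulty) if not f]
    disabled = [w for w, f in enumerate(faulty) if f]
    placement = {}
    # Most critical blocks choose first, so they claim the active ways.
    for block in sorted(blocks, key=priority, reverse=True):
        if block.in_private_cache and disabled:
            placement[disabled.pop(0)] = block  # tag-only entry (assumed)
        elif active:
            placement[active.pop(0)] = block
    return placement


# Example: a 4-way set in which ways 1 and 3 are faulty at low voltage.
faulty_ways = [False, True, False, True]
candidates = [
    Block(tag=1, in_private_cache=True, reused=False),
    Block(tag=2, in_private_cache=False, reused=True),   # critical
    Block(tag=3, in_private_cache=False, reused=False),
    Block(tag=4, in_private_cache=True, reused=True),
    Block(tag=5, in_private_cache=False, reused=True),   # critical
]
layout = place_blocks(candidates, faulty_ways)
```

In this example the two critical blocks (tags 2 and 5) take the two active ways, the blocks with private-cache copies (tags 1 and 4) occupy the faulty ways, and the non-reused block without a private copy (tag 3) is the one given up. A real policy would make these decisions incrementally on each miss and hit.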