000106589 001__ 106589
000106589 005__ 20210902121933.0
000106589 0247_ $$2doi$$a10.1109/TVLSI.2020.3005451
000106589 0248_ $$2sideral$$a120311
000106589 037__ $$aART-2020-120311
000106589 041__ $$aeng
000106589 100__ $$0(orcid)0000-0002-7057-4283$$aAlcolea Moreno, A.$$uUniversidad de Zaragoza
000106589 245__ $$aAnalysis of a Pipelined Architecture for Sparse DNNs on Embedded Systems
000106589 260__ $$c2020
000106589 5060_ $$aAccess copy available to the general public$$fUnrestricted
000106589 5203_ $$aDeep neural networks (DNNs) are increasing their presence in a wide range of applications, and their computationally intensive and memory-demanding nature poses challenges, especially for embedded systems. Pruning techniques turn DNN models into sparse ones by setting most weights to zero, offering optimization opportunities if specific support is included. We propose a novel pipelined architecture for DNNs that avoids all useless operations during the inference process. It has been implemented in a field-programmable gate array (FPGA), and its performance, energy efficiency, and area have been characterized. Exploiting sparsity yields remarkable speedups but also produces area overheads. We have evaluated this tradeoff in order to identify the scenarios in which it is better to use that area to exploit sparsity, and those in which it is better to include more computational resources in a conventional DNN architecture. We have also explored different arithmetic bitwidths. Our sparse architecture is clearly superior on 32-bit arithmetic or highly sparse networks. However, on 8-bit arithmetic or on networks with low sparsity, it is more profitable to deploy a dense architecture with more arithmetic resources than to include support for sparsity. We consider that FPGAs are the natural target for sparse DNN accelerators, since they can be loaded at run-time with the best-fitting accelerator.
000106589 536__ $$9info:eu-repo/grantAgreement/ES/DGA/Research Groups-European Social Found$$9info:eu-repo/grantAgreement/ES/MINECO-AEI-ERDF/CICYT-PID2019-105660RB-C21$$9info:eu-repo/grantAgreement/ES/MINECO-AEI-ERDF/TIN2016-76635-C2-1-R$$9info:eu-repo/grantAgreement/ES/MINECO-AEI-ERDF/TIN2017-87237-9
000106589 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000106589 590__ $$a2.312$$b2020
000106589 591__ $$aENGINEERING, ELECTRICAL & ELECTRONIC$$b152 / 273 = 0.557$$c2020$$dQ3$$eT2
000106589 591__ $$aCOMPUTER SCIENCE, HARDWARE & ARCHITECTURE$$b27 / 53 = 0.509$$c2020$$dQ3$$eT2
000106589 592__ $$a0.506$$b2020
000106589 593__ $$aElectrical and Electronic Engineering$$c2020$$dQ2
000106589 593__ $$aSoftware$$c2020$$dQ2
000106589 593__ $$aHardware and Architecture$$c2020$$dQ2
000106589 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000106589 700__ $$0(orcid)0000-0002-7752-8714$$aOlivito, J.$$uUniversidad de Zaragoza
000106589 700__ $$0(orcid)0000-0002-7532-2720$$aResano, J.$$uUniversidad de Zaragoza
000106589 700__ $$aMecha, H.
000106589 7102_ $$15007$$2035$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Arquit.Tecnología Comput.
000106589 773__ $$g28, 9 (2020), 1993-2003$$pIEEE trans. very large scale integr. (VLSI) syst.$$tIEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS$$x1063-8210
000106589 8564_ $$s2233166$$uhttps://zaguan.unizar.es/record/106589/files/texto_completo.pdf$$yPostprint
000106589 8564_ $$s3722241$$uhttps://zaguan.unizar.es/record/106589/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint
000106589 909CO $$ooai:zaguan.unizar.es:106589$$particulos$$pdriver
000106589 951__ $$a2021-09-02-10:56:45
000106589 980__ $$aARTICLE