86322
20230914083238.0
doi 10.3390/app9163295
sideral 114978
ART-2019-114978
eng
Mingote, Victoria Universidad de Zaragoza (orcid)0000-0002-3505-0249
Supervector extraction for encoding speaker and phrase information with neural networks for text-dependent speaker verification
2019
Access copy available to the general public Unrestricted
In this paper, we propose a new differentiable neural network with an alignment mechanism for text-dependent speaker verification. Unlike previous works, we do not extract the embedding of an utterance by global average pooling over the temporal dimension. Our system replaces this reduction mechanism with a phonetic phrase alignment model that keeps the temporal structure of each phrase, since the phonetic information is relevant to the verification task. Moreover, we can apply a convolutional neural network as the front-end, and, because the alignment process is differentiable, we can train the network to produce a supervector for each utterance that is discriminative with respect to the speaker and the phrase simultaneously. This choice has the advantage that the supervector encodes both phrase and speaker information, providing good performance in text-dependent speaker verification tasks. The verification process is performed using a basic similarity metric. The new model using alignment to produce supervectors was evaluated on the RSR2015-Part I database, providing competitive results compared to similar-size networks that use global average pooling to extract embeddings. Furthermore, we also evaluated this proposal on RSR2015-Part II. To our knowledge, this system achieves the best published results obtained on this second part.
info:eu-repo/grantAgreement/ES/DGA-FEDER/T36-17R
info:eu-repo/grantAgreement/ES/MINECO/TIN2017-85854-C4-1-R
info:eu-repo/semantics/openAccess
by http://creativecommons.org/licenses/by/3.0/es/
2.474 2019
PHYSICS, APPLIED 62 / 154 = 0.403 2019 Q2 T2
ENGINEERING, MULTIDISCIPLINARY 32 / 91 = 0.352 2019 Q2 T2
CHEMISTRY, MULTIDISCIPLINARY 88 / 176 = 0.5 2019 Q2 T2
MATERIALS SCIENCE, MULTIDISCIPLINARY 161 / 314 = 0.513 2019 Q3 T2
0.418 2019
Engineering (miscellaneous) 2019 Q1
Fluid Flow and Transfer Processes 2019 Q2
Process Chemistry and Technology 2019 Q2
Instrumentation 2019 Q2
Materials Science (miscellaneous) 2019 Q2
Computer Science Applications 2019 Q3
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Miguel, Antonio Universidad de Zaragoza (orcid)0000-0001-5803-4316
Ortega, Alfonso Universidad de Zaragoza (orcid)0000-0002-3886-7748
Lleida, Eduardo Universidad de Zaragoza (orcid)0000-0001-9137-4013
5008 800 Universidad de Zaragoza Dpto. Ingeniería Electrón.Com. Área Teoría Señal y Comunicac.
9, 16 (2019), 3295 [12 pp.]
Appl. sci.
Applied Sciences (Switzerland)
2076-3417
1144341 http://zaguan.unizar.es/record/86322/files/texto_completo.pdf Published version
108636 http://zaguan.unizar.es/record/86322/files/texto_completo.jpg?subformat=icon icon Published version
oai:zaguan.unizar.es:86322 articulos driver
2023-09-13-10:46:50
ARTICLE