000084697 001__ 84697
000084697 005__ 20230914083238.0
000084697 0247_ $$2doi$$a10.3390/app9183697
000084697 0248_ $$2sideral$$a114013
000084697 037__ $$aART-2019-114013
000084697 041__ $$aeng
000084697 100__ $$0(orcid)0000-0003-1772-0605$$aViñals, Ignacio$$uUniversidad de Zaragoza
000084697 245__ $$aAn analysis of the short utterance problem for speaker characterization
000084697 260__ $$c2019
000084697 5060_ $$aAccess copy available to the general public$$fUnrestricted
000084697 5203_ $$aSpeaker characterization has always been conditioned by the length of the evaluated utterances. Although systems perform well with large amounts of audio, performance degrades significantly when short utterances are considered. In this work we present an analysis of the short utterance problem that provides an alternative point of view. From our perspective, performance in the evaluation of short utterances is highly influenced by the phonetic similarity between enrollment and test utterances. Both enrollment and test utterances should contain similar phonemes to discriminate properly; otherwise performance degrades. In this study we also interpret short utterances as incomplete long utterances in which some acoustic units are either unbalanced or missing. These missing units make the speaker representations unreliable. The unreliable representations are biased with respect to their reference counterparts, obtained from long utterances. These undesired shifts increase the intra-speaker variability, causing a significant loss of performance. According to our experiments, short utterances (3-60 s) can perform as accurately as long utterances simply by ensuring matching phonetic distributions between enrollment and test. This analysis stems from the current embedding extraction approach, which is based on the accumulation of local short-time information. It is therefore applicable to most state-of-the-art embeddings, including traditional i-vectors and Deep Neural Network (DNN) x-vectors.
000084697 536__ $$9info:eu-repo/grantAgreement/ES/DGA-FEDER/T36-17R$$9info:eu-repo/grantAgreement/ES/DGA-FEDER/2014-2020$$9info:eu-repo/grantAgreement/ES/MINECO/TIN2017-85854-C4-1-R
000084697 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000084697 590__ $$a2.474$$b2019
000084697 591__ $$aPHYSICS, APPLIED$$b62 / 154 = 0.403$$c2019$$dQ2$$eT2
000084697 591__ $$aENGINEERING, MULTIDISCIPLINARY$$b32 / 91 = 0.352$$c2019$$dQ2$$eT2
000084697 591__ $$aCHEMISTRY, MULTIDISCIPLINARY$$b88 / 176 = 0.5$$c2019$$dQ2$$eT2
000084697 591__ $$aMATERIALS SCIENCE, MULTIDISCIPLINARY$$b161 / 314 = 0.513$$c2019$$dQ3$$eT2
000084697 592__ $$a0.418$$b2019
000084697 593__ $$aEngineering (miscellaneous)$$c2019$$dQ1
000084697 593__ $$aFluid Flow and Transfer Processes$$c2019$$dQ2
000084697 593__ $$aProcess Chemistry and Technology$$c2019$$dQ2
000084697 593__ $$aInstrumentation$$c2019$$dQ2
000084697 593__ $$aMaterials Science (miscellaneous)$$c2019$$dQ2
000084697 593__ $$aComputer Science Applications$$c2019$$dQ3
000084697 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000084697 700__ $$0(orcid)0000-0002-3886-7748$$aOrtega, Alfonso$$uUniversidad de Zaragoza
000084697 700__ $$0(orcid)0000-0001-5803-4316$$aMiguel, Antonio$$uUniversidad de Zaragoza
000084697 700__ $$0(orcid)0000-0001-9137-4013$$aLleida, Eduardo$$uUniversidad de Zaragoza
000084697 7102_ $$15008$$2800$$aUniversidad de Zaragoza$$bDpto. Ingeniería Electrón.Com.$$cÁrea Teoría Señal y Comunicac.
000084697 773__ $$g9, 18 (2019), 3697 [19 pp.]$$pAppl. sci.$$tApplied Sciences (Switzerland)$$x2076-3417
000084697 8564_ $$s437566$$uhttps://zaguan.unizar.es/record/84697/files/texto_completo.pdf$$yPublished version
000084697 8564_ $$s109124$$uhttps://zaguan.unizar.es/record/84697/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000084697 909CO $$ooai:zaguan.unizar.es:84697$$particulos$$pdriver
000084697 951__ $$a2023-09-13-10:47:12
000084697 980__ $$aARTICLE