000056313 001__ 56313
000056313 005__ 20201009175922.0
000056313 0247_ $$2doi$$a10.1016/j.csl.2016.03.001
000056313 0248_ $$2sideral$$a95519
000056313 037__ $$aART-2016-95519
000056313 041__ $$aeng
000056313 100__ $$aLopez-Moreno, Ignacio
000056313 245__ $$aOn the Use of Deep Feedforward Neural Networks for Automatic Language Identification
000056313 260__ $$c2016
000056313 5060_ $$aAccess copy available to the general public$$fUnrestricted
000056313 5203_ $$aIn this work, we present a comprehensive study on the use of deep neural networks (DNNs) for automatic language identification (LID). Motivated by the recent success of using DNNs in acoustic modeling for speech recognition, we adapt DNNs to the problem of identifying the language in a given utterance from its short-term acoustic features. We propose two different DNN-based approaches. In the first one, the DNN acts as an end-to-end LID classifier, receiving as input the speech features and providing as output the estimated probabilities of the target languages. In the second approach, the DNN is used to extract bottleneck features that are then used as inputs for a state-of-the-art i-vector system. Experiments are conducted in two different scenarios: the complete NIST Language Recognition Evaluation dataset 2009 (LRE’09) and a subset of the Voice of America (VOA) data from LRE’09, in which all languages have the same amount of training data. Results for both datasets demonstrate that the DNN-based systems significantly outperform a state-of-the-art i-vector system when dealing with short-duration utterances. Furthermore, the combination of the DNN-based and the classical i-vector system leads to additional performance improvements (up to 45% relative improvement in both EER and Cavg on the 3s and 10s conditions, respectively).
000056313 536__ $$9info:eu-repo/grantAgreement/ES/MINECO/TIN2011-28169-C05-02
000056313 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000056313 590__ $$a1.9$$b2016
000056313 591__ $$aCOMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE$$b64 / 133 = 0.481$$c2016$$dQ2$$eT2
000056313 592__ $$a0.474$$b2016
000056313 593__ $$aHuman-Computer Interaction$$c2016$$dQ2
000056313 593__ $$aSoftware$$c2016$$dQ2
000056313 593__ $$aTheoretical Computer Science$$c2016$$dQ3
000056313 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000056313 700__ $$aGonzalez-Dominguez, Javier
000056313 700__ $$0(orcid)0000-0001-7593-1377$$aMartínez González, David
000056313 700__ $$aPlchot, Oldrich
000056313 700__ $$aGonzalez-Rodriguez, Joaquin
000056313 700__ $$aMoreno, Pedro
000056313 773__ $$g40 (2016), 46-59$$pComput. speech lang.$$tCOMPUTER SPEECH AND LANGUAGE$$x0885-2308
000056313 8564_ $$s1731078$$uhttps://zaguan.unizar.es/record/56313/files/texto_completo.pdf$$yPublished version
000056313 8564_ $$s76896$$uhttps://zaguan.unizar.es/record/56313/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000056313 909CO $$ooai:zaguan.unizar.es:56313$$particulos$$pdriver
000056313 951__ $$a2020-10-09-17:46:28
000056313 980__ $$aARTICLE