127930
20241125101144.0
doi 10.3390/app13169062
sideral 135096 ART-2023-135096
eng
Pastor, Miguel A.
Cross-corpus training strategy for speech emotion recognition using self-supervised representations
2023
Access copy available to the general public
Unrestricted
Speech Emotion Recognition (SER) plays a crucial role in applications involving human-machine interaction. However, the scarcity of suitable emotional speech datasets presents a major challenge for accurate SER systems. Deep Neural Network (DNN)-based solutions currently in use require substantial labelled data for successful training. Previous studies have proposed strategies to expand the training set in this framework by leveraging available emotion speech corpora. This paper assesses the impact of a cross-corpus training extension for a SER system using self-supervised (SS) representations, namely HuBERT and WavLM. The feasibility of training systems with just a few minutes of in-domain audio is also analyzed. The experimental results demonstrate that augmenting the training set with EmoDB (German), RAVDESS, and CREMA-D (English) datasets leads to improved SER accuracy on the IEMOCAP dataset. By combining a cross-corpus training extension and SS representations, state-of-the-art performance is achieved. These findings suggest that the cross-corpus strategy effectively addresses the scarcity of labelled data and enhances the performance of SER systems.
info:eu-repo/grantAgreement/ES/AEI/PDC2021-120846-C41
info:eu-repo/grantAgreement/ES/AEI/PID2021-126061OB-C44
info:eu-repo/grantAgreement/ES/DGA/T36-20R
info:eu-repo/grantAgreement/EC/H2020/101007666/EU/Exchanges for SPEech ReseArch aNd TechnOlogies/ESPERANTO
This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No H2020 101007666-ESPERANTO
info:eu-repo/semantics/openAccess
by http://creativecommons.org/licenses/by/3.0/es/
2.5 2023
ENGINEERING, MULTIDISCIPLINARY 44 / 181 = 0.243 2023 Q1 T1
PHYSICS, APPLIED 87 / 179 = 0.486 2023 Q2 T2
CHEMISTRY, MULTIDISCIPLINARY 115 / 231 = 0.498 2023 Q2 T2
MATERIALS SCIENCE, MULTIDISCIPLINARY 258 / 439 = 0.588 2023 Q3 T2
0.508 2023
Engineering (miscellaneous) 2023 Q2
Fluid Flow and Transfer Processes 2023 Q2
Materials Science (miscellaneous) 2023 Q2
Instrumentation 2023 Q2
Process Chemistry and Technology 2023 Q3
Computer Science Applications 2023 Q3
5.3 2023
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Ribas, Dayana
Ortega, Alfonso Universidad de Zaragoza (orcid)0000-0002-3886-7748
Miguel, Antonio Universidad de Zaragoza (orcid)0000-0001-5803-4316
Lleida, Eduardo Universidad de Zaragoza (orcid)0000-0001-9137-4013
5008 800 Universidad de Zaragoza Dpto. Ingeniería Electrón.Com. Área Teoría Señal y Comunicac.
13, 16 (2023), 9062 [15 pp] Appl. sci. Applied Sciences (Switzerland) 2076-3417
1093005 http://zaguan.unizar.es/record/127930/files/texto_completo.pdf Published version
2814488 http://zaguan.unizar.es/record/127930/files/texto_completo.jpg?subformat=icon icon Published version
oai:zaguan.unizar.es:127930
articulos
driver
2024-11-22-12:03:52
ARTICLE