000108424 001__ 108424
000108424 005__ 20230519145502.0
000108424 0247_ $$2doi$$a10.3390/app11188521
000108424 0248_ $$2sideral$$a124900
000108424 037__ $$aART-2021-124900
000108424 041__ $$aeng
000108424 100__ $$0(orcid)0000-0001-9137-4013$$aViñals, Ignacio$$uUniversidad de Zaragoza
000108424 245__ $$aThe Domain Mismatch Problem in the Broadcast Speaker Attribution Task
000108424 260__ $$c2021
000108424 5060_ $$aAccess copy available to the general public$$fUnrestricted
000108424 5203_ $$aThe demand for high-quality metadata for the available multimedia content requires the development of new techniques able to correctly identify more and more information, including the speaker information. The task known as speaker attribution aims at identifying all or part of the speakers in the audio under analysis. In this work, we carry out a study of the speaker attribution problem in the broadcast domain. Through our experiments, we illustrate the positive impact of diarization on the final performance. Additionally, we show the influence of the variability present in broadcast data, depicting the broadcast domain as a collection of subdomains with particular characteristics. Taking these two factors into account, we also propose alternative approximations robust against domain mismatch. These approximations include a semisupervised alternative as well as a totally unsupervised new hybrid solution fusing diarization and speaker assignment. Thanks to these two approximations, our performance improves by around 50% in relative terms. The analysis has been carried out using the corpus for the Albayzín 2020 challenge, a diarization and speaker attribution evaluation working with broadcast data. These data, provided by Radio Televisión Española (RTVE), the Spanish public Radio and TV Corporation, include multiple shows and genres to analyze the impact of new speech technologies in real-world scenarios.
000108424 536__ $$9info:eu-repo/grantAgreement/ES/DGA/T36-20R$$9info:eu-repo/grantAgreement/EC/H2020/101007666/EU/Exchanges for SPEech ReseArch aNd TechnOlogies/ESPERANTO$$9This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No H2020 101007666-ESPERANTO$$9info:eu-repo/grantAgreement/ES/MINECO/TIN2017-85854-C4-1-R
000108424 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000108424 590__ $$a2.838$$b2021
000108424 592__ $$a0.507$$b2021
000108424 594__ $$a3.7$$b2021
000108424 591__ $$aENGINEERING, MULTIDISCIPLINARY$$b39 / 92 = 0.424$$c2021$$dQ2$$eT2
000108424 591__ $$aPHYSICS, APPLIED$$b76 / 161 = 0.472$$c2021$$dQ2$$eT2
000108424 591__ $$aMATERIALS SCIENCE, MULTIDISCIPLINARY$$b218 / 345 = 0.632$$c2021$$dQ3$$eT2
000108424 591__ $$aCHEMISTRY, MULTIDISCIPLINARY$$b100 / 180 = 0.556$$c2021$$dQ3$$eT2
000108424 593__ $$aEngineering (miscellaneous)$$c2021$$dQ2
000108424 593__ $$aComputer Science Applications$$c2021$$dQ2
000108424 593__ $$aProcess Chemistry and Technology$$c2021$$dQ2
000108424 593__ $$aMaterials Science (miscellaneous)$$c2021$$dQ2
000108424 593__ $$aFluid Flow and Transfer Processes$$c2021$$dQ2
000108424 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000108424 700__ $$0(orcid)0000-0001-5803-4316$$aOrtega, Alfonso$$uUniversidad de Zaragoza
000108424 700__ $$0(orcid)0000-0002-3886-7748$$aMiguel, Antonio$$uUniversidad de Zaragoza
000108424 700__ $$0(orcid)0000-0003-1772-0605$$aLleida, Eduardo$$uUniversidad de Zaragoza
000108424 7102_ $$15008$$2800$$aUniversidad de Zaragoza$$bDpto. Ingeniería Electrón.Com.$$cÁrea Teoría Señal y Comunicac.
000108424 773__ $$g11, 18 (2021), 8521 [19 p.]$$pAppl. sci.$$tApplied Sciences (Switzerland)$$x2076-3417
000108424 8564_ $$s11068684$$uhttps://zaguan.unizar.es/record/108424/files/texto_completo.pdf$$yVersión publicada
000108424 8564_ $$s2696132$$uhttps://zaguan.unizar.es/record/108424/files/texto_completo.jpg?subformat=icon$$xicon$$yVersión publicada
000108424 909CO $$ooai:zaguan.unizar.es:108424$$particulos$$pdriver
000108424 951__ $$a2023-05-18-15:01:16
000108424 980__ $$aARTICLE