000071053 001__ 71053
000071053 005__ 20190529115231.0
000071053 0247_ $$2doi$$a10.21437/Interspeech.2016-56
000071053 0248_ $$2sideral$$a106236
000071053 037__ $$aART-2016-106236
000071053 041__ $$aeng
000071053 100__ $$0(orcid)0000-0002-0261-3877$$aOlcoz, J.$$uUniversidad de Zaragoza
000071053 245__ $$aError correction in lightly supervised alignment of broadcast subtitles
000071053 260__ $$c2016
000071053 5060_ $$aAccess copy available to the general public$$fUnrestricted
000071053 5203_ $$aThis paper presents a range of error correction techniques aimed at improving the accuracy of a lightly supervised alignment task for broadcast subtitles. Lightly supervised approaches are frequently used in the multimedia domain, either for subtitling purposes or for providing a more reliable source for training speech-based systems. The proposed methods focus on directly correcting the alignment output, using different techniques to infer word insertions and words with inaccurate time boundaries. The features used by the classification models are outputs of the alignment system, such as confidence measures and word or segment duration. Experiments in this paper are based on broadcast material provided by the BBC to the Multi-Genre Broadcast (MGB) challenge participants. Results show that the order alignment F-measure improves by up to 2.6% absolute (15.8% relative) when combining insertion and word boundary correction.
000071053 536__ $$9info:eu-repo/grantAgreement/ES/MINEC0/TIN2014-54288-C4-2-R$$9info:eu-repo/grantAgreement/ES/MINECO/BES-2012-056894$$9info:eu-repo/grantAgreement/EC/FP7/610986/EU/IRIS: Towards Natural Interaction and Communication/IRIS
000071053 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000071053 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000071053 700__ $$aSaz, O.
000071053 700__ $$aHain, T.
000071053 7102_ $$15008$$2800$$aUniversidad de Zaragoza$$bDpto. Ingeniería Electrón.Com.$$cÁrea Teoría Señal y Comunicac.
000071053 773__ $$g(2016), 2110-2114$$pInterspeech (USB)$$tInterspeech (USB)$$x2308-457X
000071053 85641 $$uhttps://www.isca-speech.org/archive/Interspeech_2016/pdfs/0056.PDF$$zFull text from the journal
000071053 8564_ $$s174017$$uhttps://zaguan.unizar.es/record/71053/files/texto_completo.pdf$$yPostprint
000071053 8564_ $$s124270$$uhttps://zaguan.unizar.es/record/71053/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint
000071053 909CO $$ooai:zaguan.unizar.es:71053$$particulos$$pdriver
000071053 951__ $$a2019-05-29-11:43:30
000071053 980__ $$aARTICLE