000077082 001__ 77082
000077082 005__ 20210121114523.0
000077082 0247_ $$2doi$$a10.1007/978-3-319-27030-2_15
000077082 0248_ $$2sideral$$a109659
000077082 037__ $$aART-2015-109659
000077082 041__ $$aeng
000077082 100__ $$0(orcid)0000-0001-5600-0008$$aGarrido, Ángel Luis
000077082 245__ $$aThe GENIE System: classifying documents by combining mixed-techniques
000077082 260__ $$c2015
000077082 5060_ $$aAccess copy available to the general public$$fUnrestricted
000077082 5203_ $$aToday, the automatic text classification is still an open problem and its implementation in companies and organizations with large volumes of data in text format is not a trivial matter. To achieve optimum results many parameters come into play, such as the language, the context, the level of knowledge of the issues discussed, the format of the documents, or the type of language that has been used in the documents to be classified. In this paper we describe a multi-language rule-based pipeline system, called GENIE, used for automatic document categorisation. We have used several business corpora in order to test the real capabilities of our proposal, and we have studied the results of applying different stages of the pipeline over the same data to test the influence of each step in the categorization process. The results obtained by this system are very promising, and in fact, the GENIE system is already being used on real production environments with very good results.
000077082 536__ $$9info:eu-repo/grantAgreement/ES/MINECO/TIN2013-46238-C4-4-R
000077082 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000077082 592__ $$a0.284$$b2015
000077082 593__ $$aBusiness and International Management$$c2015$$dQ2
000077082 593__ $$aModeling and Simulation$$c2015$$dQ3
000077082 593__ $$aInformation Systems and Management$$c2015$$dQ3
000077082 593__ $$aManagement Information Systems$$c2015$$dQ3
000077082 593__ $$aControl and Systems Engineering$$c2015$$dQ3
000077082 593__ $$aInformation Systems$$c2015$$dQ3
000077082 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000077082 700__ $$aGranados-Buey, María
000077082 700__ $$aEscudero, Sandra
000077082 700__ $$aPeiró, Álvaro
000077082 700__ $$0(orcid)0000-0002-7073-219X$$aIlarri, Sergio$$uUniversidad de Zaragoza
000077082 700__ $$0(orcid)0000-0002-7462-0080$$aMena, Eduardo$$uUniversidad de Zaragoza
000077082 7102_ $$15007$$2570$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Lenguajes y Sistemas Inf.
000077082 773__ $$g226 (2015), 231-246$$pLect. notes bus. inf. process.$$tLecture Notes in Business Information Processing$$x1865-1348
000077082 8564_ $$s543615$$uhttps://zaguan.unizar.es/record/77082/files/texto_completo.pdf$$yPostprint
000077082 8564_ $$s66669$$uhttps://zaguan.unizar.es/record/77082/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint
000077082 909CO $$ooai:zaguan.unizar.es:77082$$particulos$$pdriver
000077082 951__ $$a2021-01-21-11:04:25
000077082 980__ $$aARTICLE