000110559 001__ 110559
000110559 005__ 20240319080948.0
000110559 0247_ $$2doi$$a10.3390/ijgi11020087
000110559 0248_ $$2sideral$$a127489
000110559 037__ $$aART-2022-127489
000110559 041__ $$aeng
000110559 100__ $$aLacasta, J.
000110559 245__ $$aApproaches for the clustering of geographic metadata and the automatic detection of quasi-spatial dataset series
000110559 260__ $$c2022
000110559 5060_ $$aAccess copy available to the general public$$fUnrestricted
000110559 5203_ $$aThe discrete representation of resources in geospatial catalogues affects their information retrieval performance. The performance could be improved by using automatically generated clusters of related resources, which we name quasi-spatial dataset series. This work evaluates whether a clustering process can create quasi-spatial dataset series using only textual information from metadata elements. We assess combinations of different text-cleaning approaches, word- and sentence-embedding representations (Word2Vec, GloVe, FastText, ELMo, Sentence BERT, and Universal Sentence Encoder), and clustering techniques (K-Means, DBSCAN, OPTICS, and agglomerative clustering) for the task. The results demonstrate that combining word-embedding representations with agglomerative clustering creates better quasi-spatial dataset series than the other approaches. In addition, we found that the ELMo representation with agglomerative clustering produces good results without any preprocessing step for text cleaning.
000110559 536__ $$9info:eu-repo/grantAgreement/ES/AEI/PID2020-113353RB-I00$$9info:eu-repo/grantAgreement/ES/DGA/T59-20R
000110559 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000110559 590__ $$a3.4$$b2022
000110559 592__ $$a0.738$$b2022
000110559 591__ $$aGEOGRAPHY, PHYSICAL$$b18 / 49 = 0.367$$c2022$$dQ2$$eT2
000110559 593__ $$aEarth and Planetary Sciences (miscellaneous)$$c2022$$dQ1
000110559 591__ $$aCOMPUTER SCIENCE, INFORMATION SYSTEMS$$b88 / 158 = 0.557$$c2022$$dQ3$$eT2
000110559 593__ $$aGeography, Planning and Development$$c2022$$dQ1
000110559 591__ $$aREMOTE SENSING$$b21 / 34 = 0.618$$c2022$$dQ3$$eT2
000110559 593__ $$aComputers in Earth Sciences$$c2022$$dQ2
000110559 594__ $$a6.2$$b2022
000110559 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000110559 700__ $$aLópez Pellicer, F. J.
000110559 700__ $$aZarazaga-Soria, J.
000110559 700__ $$aBéjar, R.
000110559 700__ $$aNogueras-Iso, J.
000110559 773__ $$g11, 2 (2022), 87[19 pp.]$$pISPRS int. j. geo-inf.$$tISPRS International Journal of Geo-Information$$x2220-9964
000110559 8564_ $$s543908$$uhttps://zaguan.unizar.es/record/110559/files/texto_completo.pdf$$yPublished version
000110559 8564_ $$s2860990$$uhttps://zaguan.unizar.es/record/110559/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000110559 909CO $$ooai:zaguan.unizar.es:110559$$particulos$$pdriver
000110559 951__ $$a2024-03-18-12:48:53
000110559 980__ $$aARTICLE