Abstract: In this paper, we formulate the active SLAM paradigm in terms of model-free Deep Reinforcement Learning, embedding the traditional utility functions based on the Theory of Optimal Experimental Design into the rewards, and thereby relaxing the intensive computations of classical approaches. We validate this formulation in a complex simulation environment, using a state-of-the-art deep Q-learning architecture with laser measurements as network inputs. Trained agents become capable not only of learning a policy to navigate and explore in the absence of an environment model, but also of transferring their knowledge to previously unseen maps, which is a key requirement in robotic exploration.

Language: English
DOI: 10.3390/app10238386
Year: 2020
Published in: APPLIED SCIENCES-BASEL 10, 23 (2020), 8386 [1-21]
ISSN: 2076-3417
JCR impact factor: 2.679 (2020)
JCR category: PHYSICS, APPLIED, rank 73/160 = 0.456 (2020) - Q2 - T2
JCR category: ENGINEERING, MULTIDISCIPLINARY, rank 38/91 = 0.418 (2020) - Q2 - T2
JCR category: CHEMISTRY, MULTIDISCIPLINARY, rank 101/178 = 0.567 (2020) - Q3 - T2
JCR category: MATERIALS SCIENCE, MULTIDISCIPLINARY, rank 201/333 = 0.604 (2020) - Q3 - T2
SCImago impact factor: 0.435 - Computer Science Applications (Q2) - Engineering (miscellaneous) (Q2) - Process Chemistry and Technology (Q2) - Instrumentation (Q2) - Materials Science (miscellaneous) (Q2) - Fluid Flow and Transfer Processes (Q2)
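To illustrate the idea of embedding an Optimal Experimental Design utility function into a reinforcement-learning reward, the sketch below computes a D-optimality criterion on a SLAM state covariance matrix and turns it into a reward term. This is a minimal illustration under stated assumptions, not the paper's actual reward: the function name `d_optimality_reward` and the choice of the D-optimality variant (OED also offers A- and E-optimality) are the author's of this sketch, and the covariance matrix would in practice come from the SLAM back-end.

```python
import numpy as np

def d_optimality_reward(cov: np.ndarray) -> float:
    """Reward term derived from the D-optimality criterion of OED.

    cov: covariance matrix of the SLAM state estimate (hypothetical input).
    Returns a reward that is larger when the estimate is more certain,
    so an agent maximizing it is driven to reduce uncertainty.
    """
    # Eigenvalues of a symmetric covariance matrix are real and >= 0.
    eigvals = np.linalg.eigvalsh(cov)
    eigvals = np.clip(eigvals, 1e-12, None)  # guard against log(0)
    # D-optimality: geometric mean of the eigenvalues (a scale-robust
    # alternative to the raw determinant, which underflows in high dimension).
    d_opt = np.exp(np.mean(np.log(eigvals)))
    # Negate so that lower uncertainty yields higher reward.
    return -d_opt
```

A quick sanity check of the intended behavior: a tighter covariance should yield a strictly larger reward, e.g. `d_optimality_reward(0.01 * np.eye(3)) > d_optimality_reward(np.eye(3))`.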