Open-Vocabulary Online Semantic Mapping for SLAM
Abstract: This letter presents an Open-Vocabulary Online (OVO) 3D semantic mapping pipeline. Given a sequence of posed RGB-D frames, we detect and track 3D segments, which we describe using CLIP vectors. These vectors are computed by a novel CLIP merging method from the viewpoints in which each segment is observed. Notably, OVO has a significantly lower computational and memory footprint than offline baselines, while also achieving better segmentation metrics than both offline and online ones. Along with superior segmentation performance, we show experimental results of our mapping contributions integrated with two different full SLAM backbones (Gaussian-SLAM and ORB-SLAM2), being the first to use a neural network to merge CLIP descriptors and to demonstrate end-to-end open-vocabulary online 3D mapping with loop closure.
Language: English
DOI: 10.1109/LRA.2025.3617736
Year: 2025
Published in: IEEE Robotics and Automation Letters 10, 11 (2025), 11745-11752
ISSN: 2377-3766

Funding: info:eu-repo/grantAgreement/ES/AEI/PID2024-155886NB-I00
Funding: info:eu-repo/grantAgreement/ES/DGA/T45-23R
Funding: info:eu-repo/grantAgreement/ES/MICINN/PID2021-127685NB-I00
Type and form: Article (Published version)
Area (Department): Systems Engineering and Automation (Dept. of Computer Science and Systems Engineering)

Creative Commons: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.


Exported from SIDERAL (2026-01-12-11:10:48)


This article is in the following collections:
Articles > Articles by area > Systems Engineering and Automation



 Record created 2026-01-12, last modified 2026-01-12

