Efficient tool segmentation for endoscopic videos in the wild

Tomasini, Clara; Riazuelo, Luis; Murillo, A.C.; Alonso, Iñigo

000112099 001__ 112099
000112099 005__ 20240705134135.0
000112099 0248_ $$2sideral$$a128003
000112099 037__ $$aART-2022-128003
000112099 041__ $$aeng
000112099 100__ $$0(orcid)0009-0001-2112-0939$$aTomasini, Clara$$uUniversidad de Zaragoza
000112099 245__ $$aEfficient tool segmentation for endoscopic videos in the wild
000112099 260__ $$c2022
000112099 5060_ $$aAccess copy available to the general public$$fUnrestricted
000112099 5203_ $$aIn recent years, deep learning methods have become the most effective approach for tool segmentation in endoscopic images, achieving the state of the art on the available public benchmarks. However, these methods present some challenges that hinder their direct deployment in real-world scenarios. This work explores how to address two of the most common challenges: real-time and memory restrictions, and false positives in frames with no tools. To cope with the first, we show how to adapt an efficient general-purpose semantic segmentation model. We then study the common issue of training only on images that contain at least one tool: when images of endoscopic procedures without tools are processed, such models produce many false positives. To solve this, we propose adding an extra classification head that performs binary frame classification to identify frames with no tools present. Finally, we present a thorough comparison of this approach with the current state of the art on different benchmarks, including real medical practice recordings, demonstrating similar accuracy with much lower computational requirements.

000112099 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000112099 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000112099 700__ $$aAlonso, Iñigo
000112099 700__ $$0(orcid)0000-0002-6722-5541$$aRiazuelo, Luis$$uUniversidad de Zaragoza
000112099 700__ $$0(orcid)0000-0002-7580-9037$$aMurillo, A.C.$$uUniversidad de Zaragoza
000112099 7102_ $$15007$$2520$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Ingen.Sistemas y Automát.
000112099 773__ $$g2022 (2022), [17 pp.]$$pProc. Mach. Learn. Res.$$tProceedings of Machine Learning Research$$x2640-3498
000112099 85641 $$uhttps://openreview.net/forum?id=DPkb7gxt6gZ$$zFull text of the journal
000112099 8564_ $$s1178934$$uhttps://zaguan.unizar.es/record/112099/files/texto_completo.pdf$$yPublished version
000112099 8564_ $$s2222184$$uhttps://zaguan.unizar.es/record/112099/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000112099 909CO $$ooai:zaguan.unizar.es:112099$$particulos$$pdriver
000112099 951__ $$a2024-07-05-12:45:34
000112099 980__ $$aARTICLE
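
The abstract in this record describes an extra classification head that performs binary frame classification to identify frames with no tools. One plausible way to use such a frame-level prediction at inference time is to blank the segmentation mask whenever the frame is classified as tool-free. The sketch below illustrates only that gating idea in plain Python; the record does not state how the paper combines the two outputs, so the function name `gate_mask`, the parameter `tool_prob`, and the 0.5 `threshold` are all hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 1 = tool pixel, 0 = background


def gate_mask(tool_mask: Mask, tool_prob: float, threshold: float = 0.5) -> Mask:
    """Suppress a predicted tool mask when the frame is classified as tool-free.

    Assumption (not taken from the record): `tool_prob` is the frame-level
    probability that at least one tool is visible, as produced by a binary
    classification head, and `tool_mask` is the per-pixel segmentation output.
    """
    if tool_prob >= threshold:
        # Frame is judged to contain a tool: keep the segmentation as predicted.
        return [row[:] for row in tool_mask]
    # Frame is judged tool-free: return an all-background mask of the same
    # shape, which removes any false-positive tool pixels.
    return [[0] * len(row) for row in tool_mask]


if __name__ == "__main__":
    predicted = [[0, 1], [1, 1]]
    print(gate_mask(predicted, tool_prob=0.9))
    print(gate_mask(predicted, tool_prob=0.1))
```

Gating on a single threshold trades missed tools for fewer false positives, so in practice the threshold would need to be tuned on frames both with and without tools.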
