000112203 001__ 112203
000112203 005__ 20220510091956.0
000112203 037__ $$aTAZ-TFM-2022-034
000112203 041__ $$aeng
000112203 1001_ $$aRoyo Meneses, Diego
000112203 24200 $$aOptical flow estimation in systems with co-located light and camera for monocular endoscopy.
000112203 24500 $$aEstimación de flujo óptico en sistemas con luz y cámara solidarios para endoscopia monocular.
000112203 260__ $$aZaragoza$$bUniversidad de Zaragoza$$c2022
000112203 506__ $$aby-nc-sa$$bCreative Commons$$c3.0$$uhttp://creativecommons.org/licenses/by-nc-sa/3.0/
000112203 520__ $$aThe European EndoMapper project aims to develop real-time mapping of the interior of the human body using only the footage from an exploration procedure. This technology can enable novel medical operations, including robotized autonomous interaction and live augmented reality. All of this comes down to two challenges that need to be overcome: first, generating a map of the human body, and then being able to locate oneself within it. Simultaneous Localization And Mapping (SLAM) is a computer vision problem that tries to perform both tasks at the same time. In an endoscopic procedure, the input to the SLAM algorithm is a monocular video sequence captured during exploration. Visual SLAM (VSLAM) algorithms are usually based on image matches, i.e., they try to find parts of the environment that appear in two or more frames. In this Master's Thesis, we explore and build on optical flow estimation, that is, algorithms that compute how each pixel moves between two images. Pixel motion provides dense correspondences for a video sequence. Most existing methods assume that the brightness of each pixel is constant regardless of where the camera is located. This is a reasonable assumption in most cases, for example, in outdoor scenes where diffuse objects are illuminated by the sun. In an endoscopy, however, the light and camera are co-located: changes in the position of the camera are correlated with changes in illumination, and moving the camera closer to an object makes it appear brighter. Our work explores two approaches to this problem. First, we develop a photometric model of light transport with a co-located light and camera. We introduce this model into an existing estimation algorithm and are able to extract more precise image matches, along with additional depth and surface normal information. Second, we explore learning-based approaches. They have the great advantage of not requiring a hand-crafted illumination model, and their high-dimensional parameters can be learned from large amounts of training data. With current technology, it is not possible to obtain enough ground-truth optical flow in real endoscopy sequences. We explore different simulation environments and find that using a combination of real and synthetic data is key. With this, we obtain a 40% error reduction in optical flow estimation when evaluating on simulated data, and a 15% reduction on captured data. Additionally, we show that mixing both types of training data produces much better qualitative results for scene points whose ground truth is not available.
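A minimal worked example of the photometric issue described in the abstract (the notation is illustrative, not taken from the thesis itself): classical optical flow assumes brightness constancy between consecutive frames,

\[ I_1(\mathbf{x}) = I_2\big(\mathbf{x} + \mathbf{u}(\mathbf{x})\big), \]

where \(\mathbf{u}\) is the flow field. With a co-located light and camera, the observed intensity of a diffuse (Lambertian) surface point instead follows an inverse-square falloff of the form

\[ I(\mathbf{x}) \propto \rho(\mathbf{x}) \, \frac{\cos\theta(\mathbf{x})}{d(\mathbf{x})^{2}}, \]

where \(\rho\) is the surface albedo, \(\theta\) the angle between the surface normal and the light direction, and \(d\) the camera-to-surface distance. Halving \(d\) roughly quadruples the observed brightness, so the constancy assumption breaks whenever the endoscope moves toward or away from the tissue.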
000112203 521__ $$aMáster Universitario en Robótica, Gráficos y Visión por Computador
000112203 540__ $$aRights governed by a Creative Commons license
000112203 700__ $$aMartínez Montiel, José María$$edir.
000112203 7102_ $$aUniversidad de Zaragoza$$bInformática e Ingeniería de Sistemas$$cIngeniería de Sistemas y Automática
000112203 8560_ $$f740388@unizar.es
000112203 8564_ $$s18043326$$uhttps://zaguan.unizar.es/record/112203/files/TAZ-TFM-2022-034.pdf$$yMemoria (eng)
000112203 909CO $$ooai:zaguan.unizar.es:112203$$pdriver$$ptrabajos-fin-master
000112203 951__ $$adeposita:2022-05-10
000112203 980__ $$aTAZ$$bTFM$$cEINA
000112203 999__ $$a20220128212622.CREATION_DATE