000112208 001__ 112208
000112208 005__ 20220510091957.0
000112208 037__ $$aTAZ-TFM-2022-026
000112208 041__ $$aeng
000112208 1001_ $$aBerriel Martins, Tomás
000112208 24200 $$aLearning disentangled representations of scenes from images.
000112208 24500 $$aAprendizaje de representaciones desenredadas de escenas a partir de imágenes.
000112208 260__ $$aZaragoza$$bUniversidad de Zaragoza$$c2022
000112208 506__ $$aby-nc-sa$$bCreative Commons$$c3.0$$uhttp://creativecommons.org/licenses/by-nc-sa/3.0/
000112208 520__ $$aArtificial intelligence is at the forefront of a technological revolution, in particular as a key component for building autonomous agents. However, not only does training such agents come at a great computational cost, but the agents also end up lacking basic human abilities such as generalization, information extrapolation, knowledge transfer between contexts, and improvisation. To overcome these limitations, agents need a deeper understanding of their environment and must learn it more efficiently from data. <br />Very recent works propose novel approaches to learning representations of the world: instead of learning invariant object encodings, they learn to isolate, or disentangle, the different variable properties that make up an object. This would enable agents not only to understand object changes as modifications of one of their properties, but also to transfer knowledge about those properties between different categories. This Master's Thesis aims to develop a new machine learning model for disentangling object properties in monocular images of scenes. Our model is based on a state-of-the-art architecture for disentangled representation learning, and our goal is to reduce the computational complexity of the base model while also improving its performance. To achieve this, we replace a recursive unsupervised segmentation network with an encoder-decoder segmentation network. Furthermore, before training such an overparametrized neural model without supervision, we apply transfer learning, initializing it with weights pre-trained on a supervised segmentation task. After developing a first vanilla model, we tuned it to improve its performance and generalization capability. We then performed an experimental validation on two commonly used synthetic datasets, evaluating both disentanglement performance and computational efficiency, and on a more realistic dataset to analyze the model's capability on real data. 
The results show that our model outperforms the state of the art while reducing its computational footprint. Nevertheless, further research is needed to bridge the gap with real-world applications.<br />
000112208 521__ $$aMáster Universitario en Robótica, Gráficos y Visión por Computador
000112208 540__ $$aRights governed by a Creative Commons license
000112208 700__ $$aCivera Sancho, Javier$$edir.
000112208 7102_ $$aUniversidad de Zaragoza$$bInformática e Ingeniería de Sistemas$$cIngeniería de Sistemas y Automática
000112208 8560_ $$f718756@unizar.es
000112208 8564_ $$s19046623$$uhttps://zaguan.unizar.es/record/112208/files/TAZ-TFM-2022-026.pdf$$yMemoria (eng)
000112208 909CO $$ooai:zaguan.unizar.es:112208$$pdriver$$ptrabajos-fin-master
000112208 950__ $$a
000112208 951__ $$adeposita:2022-05-10
000112208 980__ $$aTAZ$$bTFM$$cEINA
000112208 999__ $$a20220128171613.CREATION_DATE