<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
  <controlfield tag="001">112208</controlfield>
  <controlfield tag="005">20220510091957.0</controlfield>
  <datafield tag="037" ind1=" " ind2=" ">
    <subfield code="a">TAZ-TFM-2022-026</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Berriel Martins, Tomás</subfield>
  </datafield>
  <datafield tag="242" ind1="0" ind2="0">
    <subfield code="a">Aprendizaje de representaciones desenredadas de escenas a partir de imágenes.</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Learning disentangled representations of scenes from images.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Zaragoza</subfield>
    <subfield code="b">Universidad de Zaragoza</subfield>
    <subfield code="c">2022</subfield>
  </datafield>
  <datafield tag="506" ind1=" " ind2=" ">
    <subfield code="a">by-nc-sa</subfield>
    <subfield code="b">Creative Commons</subfield>
    <subfield code="c">3.0</subfield>
    <subfield code="u">http://creativecommons.org/licenses/by-nc-sa/3.0/</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Artificial intelligence is at the forefront of a technological revolution, in particular as a key component for building autonomous agents. However, not only does training such agents come at a great computational cost, but they also end up lacking basic human abilities like generalization, information extrapolation, knowledge transfer between contexts, or improvisation. To overcome current limitations, agents need a deeper understanding of their environment, and the ability to learn it more efficiently from data. &lt;br />Very recent works propose novel approaches to learning representations of the world: instead of learning invariant object encodings, they learn to isolate, or disentangle, the different variable properties that form an object. This would enable agents not only to understand object changes as modifications of one of their properties, but also to transfer such knowledge about the properties between different categories. This Master's Thesis aims to develop a new machine learning model for disentangling object properties in monocular images of scenes. Our model is based on a state-of-the-art architecture for disentangled representation learning, and our goal is to reduce the computational complexity of the base model while also improving its performance. To achieve this, we replace a recursive unsupervised segmentation network with an encoder-decoder segmentation network. Furthermore, before training such an overparametrized neural model without supervision, we leverage transfer learning of pre-trained weights from a supervised segmentation task. After developing a first vanilla model, we tuned it to improve its performance and generalization capability. We then performed an experimental validation on two commonly used synthetic datasets, evaluating both disentanglement performance and computational efficiency, and on a more realistic dataset to analyze the model's capability on real data.
The results show that our model outperforms the state of the art while reducing its computational footprint. Nevertheless, further research is needed to bridge the gap to real-world applications.&lt;br /></subfield>
  </datafield>
  <datafield tag="521" ind1=" " ind2=" ">
    <subfield code="a">Máster Universitario en Robótica, Gráficos y Visión por Computador</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Derechos regulados por licencia Creative Commons</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Civera Sancho, Javier</subfield>
    <subfield code="e">dir.</subfield>
  </datafield>
  <datafield tag="710" ind1="2" ind2=" ">
    <subfield code="a">Universidad de Zaragoza</subfield>
    <subfield code="b">Informática e Ingeniería de Sistemas</subfield>
    <subfield code="c">Ingeniería de Sistemas y Automática</subfield>
  </datafield>
  <datafield tag="856" ind1="0" ind2=" ">
    <subfield code="f">718756@unizar.es</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">19046623</subfield>
    <subfield code="u">http://zaguan.unizar.es/record/112208/files/TAZ-TFM-2022-026.pdf</subfield>
    <subfield code="y">Memoria (eng)</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:zaguan.unizar.es:112208</subfield>
    <subfield code="p">driver</subfield>
    <subfield code="p">trabajos-fin-master</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">TAZ</subfield>
    <subfield code="b">TFM</subfield>
    <subfield code="c">EINA</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="a">20220128171613.CREATION_DATE</subfield>
  </datafield>
  <datafield tag="951" ind1=" " ind2=" ">
    <subfield code="a">deposita:2022-05-10</subfield>
  </datafield>
</record>
</collection>