SST-Sal: A spherical spatio-temporal approach for saliency prediction in 360 videos

Bernal Berdun, Edurne (Universidad de Zaragoza) ; Martín Serrano, Daniel (Universidad de Zaragoza) ; Gutiérrez Pérez, Diego (Universidad de Zaragoza) ; Masiá Corcoy, Belén (Universidad de Zaragoza)
H2020 Funds
Abstract: Virtual reality (VR) has the potential to change the way people consume content, and has been predicted to become the next big computing paradigm. However, much remains unknown about the grammar and visual language of this new medium, and understanding and predicting how humans behave in virtual environments remains an open problem. In this work, we propose a novel saliency prediction model which exploits the joint potential of spherical convolutions and recurrent neural networks to extract and model the inherent spatio-temporal features from 360° videos. We employ Convolutional Long Short-Term Memory cells (ConvLSTMs) to account for temporal information at the time of feature extraction rather than to post-process spatial features as in previous works. To facilitate spatio-temporal learning, we provide the network with an estimation of the optical flow between 360° frames, since motion is known to be a highly salient feature in dynamic content. Our model is trained with a novel spherical Kullback–Leibler Divergence (KLDiv) loss function specifically tailored for saliency prediction in 360° content. Our approach outperforms previous state-of-the-art works, being able to mimic human visual attention when exploring dynamic 360° videos.
Language: English
DOI: 10.1016/j.cag.2022.06.002
Year: 2022
Published in: COMPUTERS & GRAPHICS-UK 106 (2022), 200-209
ISSN: 0097-8493

JCR impact factor: 2.5 (2022)
JCR category: COMPUTER SCIENCE, SOFTWARE ENGINEERING rank: 52 / 108 = 0.481 (2022) - Q2 - T2
CiteScore impact factor: 4.9 - Computer Science (Q2) - Engineering (Q2)

SCImago impact factor: 0.539 - Computer Graphics and Computer-Aided Design (Q2) - Computer Vision and Pattern Recognition (Q2) - Engineering (miscellaneous) (Q2) - Signal Processing (Q2) - Human-Computer Interaction (Q3) - Software (Q3)

Funding: info:eu-repo/grantAgreement/ES/AEI/PID2019-105004GB-I00
Funding: info:eu-repo/grantAgreement/EC/H2020/682080/EU/Intuitive editing of visual appearance from real-world datasets/CHAMELEON
Funding: info:eu-repo/grantAgreement/EC/H2020/956585/EU/Predictive Rendering In Manufacture and Engineering/PRIME
Type and form: Article (PostPrint)
Area (Department): Computer Languages and Systems (Dept. of Computer Science and Systems Engineering)

Creative Commons: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not use the material for commercial purposes. If you remix, transform, or build upon the material, you may not distribute the modified material.


Exported from SIDERAL (2025-10-17-14:28:27)






Record created on 2025-08-26, last modified on 2025-10-17

