Overdriving Visual Depth Perception via Sound Modulation in VR
Financiación H2020 / H2020 Funds
Resumen: Our ability to perceive and navigate the spatial world is a cornerstone of human experience, relying on the integration of visual and auditory cues to form a coherent sense of depth and distance. In stereoscopic 3D vision, depth perception requires fixation of both eyes on a target object, which is achieved through vergence movements, with convergence for near objects and divergence for distant ones. In contrast, auditory cues provide complementary depth information through variations in loudness, interaural differences (IAD), and the frequency spectrum. We investigate the interaction between visual and auditory cues and examine how contradictory auditory information can overdrive visual depth perception in virtual reality (VR). When a new visual target appears, we introduce a spatial discrepancy between the visual and auditory cues: the visual target is shifted closer to the previously fixated object, while the corresponding sound localization is displaced in the opposite direction. By integrating these conflicting cues through multimodal processing, the resulting percept is biased toward the intended depth location. This audiovisual fusion counteracts depth compression, thus reducing the required vergence magnitude and enabling faster gaze retargeting. Such audio-driven depth enhancement may further help mitigate the vergence-accommodation conflict (VAC) in scenarios where physical depth must be compressed. In a series of psychophysical studies, we first assess the efficiency of depth overdriving for various VR-relevant combinations of initial fixations and shifted target locations, considering different scenarios of audio displacements and their loudness and frequency parameters. Next, we quantify the resulting speedup in gaze retargeting for target shifts that can be successfully overdriven by sound manipulations. Finally, we apply our method in a naturalistic VR scenario where user interface interactions with the scene show an extended perceptual depth.
Idioma: Inglés
DOI: 10.1109/TVCG.2026.3679918
Año: 2026
Publicado en: IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS (2026), [11 pp.]
ISSN: 1077-2626

Financiación: info:eu-repo/grantAgreement/EC/H2020/101220555/EU/Predictive computational models for Adaptive Extended reality/PROXIE
Financiación: info:eu-repo/grantAgreement/ES/MICIU/PID2022-141766OB-I00
Tipo y forma: Artículo (Versión definitiva)
Área (Departamento): Área Lenguajes y Sistemas Inf. (Dpto. Informát.Ingenie.Sistms.)

Creative Commons Debe reconocer adecuadamente la autoría, proporcionar un enlace a la licencia e indicar si se han realizado cambios. Puede hacerlo de cualquier manera razonable, pero no de una manera que sugiera que tiene el apoyo del licenciador o lo recibe por el uso que hace.


Exportado de SIDERAL (2026-05-05-13:36:30)


Visitas y descargas

Este artículo se encuentra en las siguientes colecciones:
Artículos > Artículos por área > Lenguajes y Sistemas Informáticos



 Registro creado el 2026-05-05, última modificación el 2026-05-05


Versión publicada:
 PDF
Valore este documento:

Rate this document:
1
2
3
 
(Sin ninguna reseña)