Record ID: 165793
Last modified: 20260114135812.0
DOI: 10.1109/TASLP.2020.3040031
Sideral: 121597 (ART-2021-121597)
Language: eng

Title: Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks
Year: 2021

Authors:
Diaz-Guerra, D. (Universidad de Zaragoza; ORCID 0000-0002-1041-0498)
Miguel, A. (Universidad de Zaragoza; ORCID 0000-0001-5803-4316)
Beltran, J.R. (Universidad de Zaragoza; ORCID 0000-0002-7500-4650)

Abstract:
In this article, we present a new single sound source DOA estimation and tracking system based on the well-known SRP-PHAT algorithm and a three-dimensional Convolutional Neural Network. It uses SRP-PHAT power maps as input features of a fully convolutional causal architecture that uses 3D convolutional layers to accurately perform the tracking of a sound source even in highly reverberant scenarios where most of the state of the art techniques fail. Unlike previous methods, since we do not use bidirectional recurrent layers and all our convolutional layers are causal in the time dimension, our system is feasible for real-time applications and it provides a new DOA estimation for each new SRP-PHAT map. To train the model, we introduce a new procedure to simulate random trajectories as they are needed during the training, equivalent to an infinite-size dataset with high flexibility to modify its acoustical conditions such as the reverberation time. We use both acoustical simulations on a large range of reverberation times and the actual recordings of the LOCATA dataset to prove the robustness of our system and its good performance even using low-resolution SRP-PHAT maps.

Access: info:eu-repo/semantics/closedAccess
Rights: All rights reserved (http://www.europeana.eu/rights/rr-f/)
Type: info:eu-repo/semantics/article
Version: info:eu-repo/semantics/publishedVersion

Journal metrics (2021):
4.364: ACOUSTICS, 5 / 32 = 0.156, Q1, T1; ENGINEERING, ELECTRICAL & ELECTRONIC, 82 / 274 = 0.299, Q2, T1
1.591: Acoustics and Ultrasonics (Q1); Computational Mathematics (Q1); Speech and Hearing (Q1); Instrumentation (Q1); Media Technology (Q1); Electrical and Electronic Engineering (Q1)
9.4

Affiliations:
5008 785 Universidad de Zaragoza, Dpto. Ingeniería Electrón.Com., Área Tecnología Electrónica
5008 800 Universidad de Zaragoza, Dpto. Ingeniería Electrón.Com., Área Teoría Señal y Comunicac.

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM trans. audio speech lang. process.)
Citation: 29 (2021), 300-311
ISSN: 2329-9290

Files:
2112487 http://zaguan.unizar.es/record/165793/files/texto_completo.pdf (Published version)
3483261 http://zaguan.unizar.es/record/165793/files/texto_completo.jpg?subformat=icon (icon, Published version)

OAI identifier: oai:zaguan.unizar.es:165793
Sets: articulos, driver
Timestamp: 2026-01-14-12:45:51
Collection: ARTICLE