<?xml version="1.0" encoding="UTF-8"?>
<collection>
<dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:invenio="http://invenio-software.org/elements/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><dc:identifier>doi:10.1109/TASLP.2022.3224282</dc:identifier><dc:language>eng</dc:language><dc:creator>Diaz-Guerra, David</dc:creator><dc:creator>Miguel, Antonio</dc:creator><dc:creator>Beltran, Jose R.</dc:creator><dc:title>Direction of arrival estimation of sound sources using icosahedral CNNs</dc:title><dc:identifier>ART-2023-131864</dc:identifier><dc:description>In this paper, we present a new model for Direction of Arrival (DOA) estimation of sound sources based on an Icosahedral Convolutional Neural Network (CNN) applied over SRP-PHAT power maps computed from the signals received by a microphone array. This icosahedral CNN is equivariant to the 60 rotational symmetries of the icosahedron, which represent a good approximation of the continuous space of spherical rotations, and can be implemented using standard 2D convolutional layers, having a lower computational cost than most of the spherical CNNs. In addition, instead of using fully connected layers after the icosahedral convolutions, we propose a new soft-argmax function that can be seen as a differentiable version of the argmax function and allows us to solve the DOA estimation as a regression problem interpreting the output of the convolutional layers as a probability distribution. 
We prove that using models that fit the equivariances of the problem allows us to outperform other state-of-the-art models with a lower computational cost and more robustness, obtaining root mean square localization errors lower than 10° even in scenarios with a reverberation time T60 of 1.5 s.</dc:description><dc:date>2023</dc:date><dc:source>http://zaguan.unizar.es/record/121235</dc:source><dc:doi>10.1109/TASLP.2022.3224282</dc:doi><dc:identifier>http://zaguan.unizar.es/record/121235</dc:identifier><dc:identifier>oai:zaguan.unizar.es:121235</dc:identifier><dc:relation>info:eu-repo/grantAgreement/ES/DGA-FEDER/2014-2020</dc:relation><dc:identifier.citation>IEEE/ACM Transactions on Audio, Speech, and Language Processing 31 (2023), 313-321</dc:identifier.citation><dc:rights>All rights reserved</dc:rights><dc:rights>http://www.europeana.eu/rights/rr-f/</dc:rights><dc:rights>info:eu-repo/semantics/closedAccess</dc:rights></dc:dc>

</collection>