000162315 001__ 162315
000162315 005__ 20251017144612.0
000162315 0247_ $$2doi$$a10.1109/TRO.2025.3582836
000162315 0248_ $$2sideral$$a144882
000162315 037__ $$aART-2025-144882
000162315 041__ $$aeng
000162315 100__ $$0(orcid)0000-0001-9671-4056$$aSebastián, Eduardo$$uUniversidad de Zaragoza
000162315 245__ $$aPhysics-Informed Multiagent Reinforcement Learning for Distributed Multirobot Problems
000162315 260__ $$c2025
000162315 5060_ $$aAccess copy available to the general public$$fUnrestricted
000162315 5203_ $$aThe networked nature of multirobot systems presents challenges in the context of multiagent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work, we propose a physics-informed reinforcement learning approach able to learn distributed multirobot control policies that are scalable and make use of all the information available to each robot. Our approach has three key characteristics. First, it imposes a port-Hamiltonian structure on the policy representation, respecting the energy conservation properties of physical robot systems and the networked nature of robot team interactions. Second, it uses self-attention to ensure a sparse policy representation able to handle the time-varying information that reaches each robot through the interaction graph. Third, we present a soft actor–critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need for value function factorization. Extensive simulations in different multirobot scenarios demonstrate the success of the proposed approach, surpassing previous multirobot reinforcement learning solutions in scalability while achieving similar or superior performance (with average cumulative reward up to ×2 greater than the state of the art, for robot teams ×6 larger than at training time). We also validate our approach on multiple real robots in the Georgia Tech Robotarium under imperfect communication, demonstrating zero-shot sim-to-real transfer and scalability with the number of robots.
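As a rough illustration of the policy structure described in the abstract above (and not the authors' implementation), the hypothetical Python/PyTorch sketch below combines self-attention over a time-varying set of neighbour states with a control law derived from a learned Hamiltonian. The class name, dimensions, and the damping-style control u = -R dH/dx are all assumptions introduced here for illustration; the soft actor-critic training loop is omitted.

import torch
import torch.nn as nn

class AttentionPortHamiltonianPolicy(nn.Module):
    # Per-robot policy sketch: self-attention aggregates a variable number of
    # neighbour states; a learned scalar Hamiltonian H defines the control
    # u = -R dH/dx with R positive semi-definite (energy-dissipating).
    def __init__(self, state_dim, hidden_dim=32, heads=2):
        super().__init__()
        self.embed = nn.Linear(state_dim, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, heads, batch_first=True)
        self.hamiltonian = nn.Sequential(
            nn.Linear(state_dim + hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, 1))
        self.r_raw = nn.Parameter(0.1 * torch.randn(state_dim, state_dim))

    def forward(self, own_state, neighbour_states):
        # own_state: (state_dim,); neighbour_states: (k, state_dim), k may change per step.
        x = own_state.clone().requires_grad_(True)
        q = self.embed(x).unsqueeze(0).unsqueeze(0)        # (1, 1, hidden)
        kv = self.embed(neighbour_states).unsqueeze(0)     # (1, k, hidden)
        context, _ = self.attn(q, kv, kv)                  # attend over current neighbours
        h_in = torch.cat([x, context.reshape(-1)], dim=-1)
        H = self.hamiltonian(h_in).sum()                   # scalar learned energy
        dHdx = torch.autograd.grad(H, x, create_graph=True)[0]
        R = self.r_raw @ self.r_raw.T                      # PSD dissipation matrix
        return -(R @ dHdx)

policy = AttentionPortHamiltonianPolicy(state_dim=4)
u = policy(torch.randn(4), torch.randn(3, 4))              # 3 neighbours this step
print(u.shape)                                             # torch.Size([4])

Because the attention block pools over however many neighbours are present at each step, the same parameters apply to any team size, which is the scalability property the abstract emphasizes.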
000162315 536__ $$9info:eu-repo/grantAgreement/EUR/AEI/TED2021-130224B-I00$$9info:eu-repo/grantAgreement/ES/DGA/T45-23R$$9info:eu-repo/grantAgreement/ES/MCIU/FPU19-05700$$9info:eu-repo/grantAgreement/ES/MICINN/PID2021-125514NB-I00
000162315 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttps://creativecommons.org/licenses/by/4.0/deed.es
000162315 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000162315 700__ $$aDuong, Thai
000162315 700__ $$aAtanasov, Nikolay
000162315 700__ $$0(orcid)0000-0002-5176-3767$$aMontijano, Eduardo$$uUniversidad de Zaragoza
000162315 700__ $$0(orcid)0000-0002-3032-954X$$aSagüés, Carlos$$uUniversidad de Zaragoza
000162315 7102_ $$15007$$2520$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Ingen.Sistemas y Automát.
000162315 773__ $$g41 (2025), 4499-4517$$pIEEE Trans. Robot.$$tIEEE Transactions on Robotics$$x1552-3098
000162315 787__ $$tThe supplementary video is a supporting document to the article$$w10.1109/TRO.2025.3582836/mm1
000162315 8564_ $$s10604245$$uhttps://zaguan.unizar.es/record/162315/files/texto_completo.pdf$$yPublished version
000162315 8564_ $$s3651467$$uhttps://zaguan.unizar.es/record/162315/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000162315 909CO $$ooai:zaguan.unizar.es:162315$$particulos$$pdriver
000162315 951__ $$a2025-10-17-14:17:57
000162315 980__ $$aARTICLE