Discrete variational calculus for accelerated optimization

M. Campos, Cédric; Mahillo, Alejandro; Martín de Diego, David
000126017 001__ 126017
000126017 005__ 20241125101151.0
000126017 0248_ $$2sideral$$a133512
000126017 037__ $$aART-2023-133512
000126017 041__ $$aeng
000126017 100__ $$aM. Campos, Cédric
000126017 245__ $$aDiscrete variational calculus for accelerated optimization
000126017 260__ $$c2023
000126017 5060_ $$aAccess copy available to the general public$$fUnrestricted
000126017 5203_ $$aMany of the new developments in machine learning are connected with gradient-based optimization methods. Recently, these methods have been studied from a variational perspective (Betancourt et al., 2018). This has opened up the possibility of introducing variational and symplectic methods using geometric integration. In particular, in this paper, we introduce variational integrators (Marsden and West, 2001), which allow us to derive different methods for optimization. Using both Hamilton’s and Lagrange-d’Alembert’s principles, we derive two families of optimization methods in one-to-one correspondence that generalize Polyak’s heavy ball (Polyak, 1964) and Nesterov’s accelerated gradient (Nesterov, 1983), the second of which mimics the behavior of the latter, reducing the oscillations of classical momentum methods. However, since the systems considered are explicitly time-dependent, the preservation of symplecticity of autonomous systems occurs here solely on the fibers. Several experiments exemplify the result.
000126017 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000126017 590__ $$a4.3$$b2023
000126017 592__ $$a2.796$$b2023
000126017 591__ $$aAUTOMATION & CONTROL SYSTEMS$$b21 / 84 = 0.25$$c2023$$dQ1$$eT1
000126017 593__ $$aArtificial Intelligence$$c2023$$dQ1
000126017 591__ $$aCOMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE$$b54 / 197 = 0.274$$c2023$$dQ2$$eT1
000126017 593__ $$aStatistics and Probability$$c2023$$dQ1
000126017 593__ $$aSoftware$$c2023$$dQ1
000126017 593__ $$aControl and Systems Engineering$$c2023$$dQ1
000126017 594__ $$a18.8$$b2023
000126017 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000126017 700__ $$0(orcid)0000-0003-4189-0268$$aMahillo, Alejandro$$uUniversidad de Zaragoza
000126017 700__ $$aMartín de Diego, David
000126017 7102_ $$12006$$2015$$aUniversidad de Zaragoza$$bDpto. Matemáticas$$cÁrea Análisis Matemático
000126017 773__ $$g24 (2023), 1-33$$pJ MACH LEARN RES$$tJOURNAL OF MACHINE LEARNING RESEARCH$$x1532-4435
000126017 85641 $$uhttps://jmlr.org/papers/volume24/21-1323/21-1323.pdf$$zFull text from the journal
000126017 8564_ $$s5225525$$uhttps://zaguan.unizar.es/record/126017/files/texto_completo.pdf$$yPublished version
000126017 8564_ $$s1728764$$uhttps://zaguan.unizar.es/record/126017/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000126017 909CO $$ooai:zaguan.unizar.es:126017$$particulos$$pdriver
000126017 951__ $$a2024-11-22-12:07:09
000126017 980__ $$aARTICLE