Bayesian optimization with adaptive kernels for robot control

Martinez-Cantin, R.
doi:10.1109/ICRA.2017.7989380
000131430 001__ 131430
000131430 005__ 20240209155915.0
000131430 0247_ $$2doi$$a10.1109/ICRA.2017.7989380
000131430 0248_ $$2sideral$$a101698
000131430 037__ $$aART-2017-101698
000131430 041__ $$aeng
000131430 100__ $$0(orcid)0000-0002-6741-844X$$aMartinez-Cantin, R.
000131430 245__ $$aBayesian optimization with adaptive kernels for robot control
000131430 260__ $$c2017
000131430 5060_ $$aAccess copy available to the general public$$fUnrestricted
000131430 5203_ $$aActive policy search combines the trial-and-error methodology from policy search with Bayesian optimization to actively find the optimal policy. First, policy search is a type of reinforcement learning that has become very popular for robot control because of its ability to deal with complex continuous state and action spaces. Second, Bayesian optimization is a sample-efficient global optimization method that uses a surrogate model, like a Gaussian process, and optimal decision making to carefully select each sample during the optimization process. Sample efficiency is of paramount importance when each trial involves the real robot, expensive Monte Carlo runs, or a complex simulator. Black-box Bayesian optimization generally assumes a cost function from a stationary process, because nonstationary modeling is usually based on prior knowledge. However, many control problems are inherently nonstationary due to their failure conditions, terminal states and other abrupt effects. In this paper, we present a kernel function specially designed for Bayesian optimization that allows nonstationary modeling without prior knowledge, using an adaptive local region. The new kernel results in an improved local search (exploitation), without penalizing the global search (exploration), as shown experimentally in well-known optimization benchmarks and robot control scenarios. We finally show its potential for the design of the wing shape of a UAV.
000131430 536__ $$9info:eu-repo/grantAgreement/ES/MINECO/DPI2015-65962-R$$9info:eu-repo/grantAgreement/ES/UZ/CUD2013-05$$9info:eu-repo/grantAgreement/ES/UZ/CUD2016-17
000131430 540__ $$9info:eu-repo/semantics/openAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000131430 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000131430 773__ $$g17057742 (2017), 3350-3356$$pProc. - IEEE Int. Conf. Robot. Autom.$$tProceedings - IEEE International Conference on Robotics and Automation$$x1050-4729
000131430 8564_ $$s1435661$$uhttps://zaguan.unizar.es/record/131430/files/texto_completo.pdf$$yPostprint
000131430 8564_ $$s3433702$$uhttps://zaguan.unizar.es/record/131430/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint
000131430 909CO $$ooai:zaguan.unizar.es:131430$$particulos$$pdriver
000131430 951__ $$a2024-02-09-14:27:55
000131430 980__ $$aARTICLE