000145414 001__ 145414
000145414 005__ 20241024135331.0
000145414 0247_ $$2doi$$a10.3390/electronics12132814
000145414 0248_ $$2sideral$$a140129
000145414 037__ $$aART-2023-140129
000145414 041__ $$aeng
000145414 100__ $$aCurtò, J. de
000145414 245__ $$aLLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments
000145414 260__ $$c2023
000145414 5060_ $$aAccess copy available to the general public$$fUnrestricted
000145414 5203_ $$aIn this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We bring forward a new non-stationary bandit model with fluctuating reward distributions and illustrate how LLMs can be employed to guide the choice of bandit amid this variability. Experimental outcomes illustrate the potential of our LLM-informed strategy, demonstrating its adaptability to the fluctuating nature of the bandit problem, while maintaining competitive performance against conventional strategies. This study provides key insights into the capabilities of LLMs in enhancing decision-making processes in dynamic and uncertain scenarios.
000145414 536__ $$9info:eu-repo/grantAgreement/ES/MCIN/AEI/PID2021-122580NB-I00
000145414 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000145414 590__ $$a2.6$$b2023
000145414 591__ $$aCOMPUTER SCIENCE, INFORMATION SYSTEMS$$b115 / 249 = 0.462$$c2023$$dQ2$$eT2
000145414 591__ $$aPHYSICS, APPLIED$$b81 / 179 = 0.453$$c2023$$dQ2$$eT2
000145414 591__ $$aENGINEERING, ELECTRICAL & ELECTRONIC$$b157 / 352 = 0.446$$c2023$$dQ2$$eT2
000145414 592__ $$a0.644$$b2023
000145414 593__ $$aComputer Networks and Communications$$c2023$$dQ2
000145414 593__ $$aControl and Systems Engineering$$c2023$$dQ2
000145414 593__ $$aSignal Processing$$c2023$$dQ2
000145414 593__ $$aHardware and Architecture$$c2023$$dQ2
000145414 593__ $$aElectrical and Electronic Engineering$$c2023$$dQ2
000145414 594__ $$a5.3$$b2023
000145414 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000145414 700__ $$0(orcid)0000-0002-5844-7871$$aZarzà, I. de
000145414 700__ $$aRoig, Gemma
000145414 700__ $$aCano, Juan Carlos
000145414 700__ $$aManzoni, Pietro
000145414 700__ $$aCalafate, Carlos T.
000145414 773__ $$g12, 13 (2023), 2814 [22 pp.]$$pElectronics (Basel)$$tElectronics$$x2079-9292
000145414 8564_ $$s687410$$uhttps://zaguan.unizar.es/record/145414/files/texto_completo.pdf$$yPublished version
000145414 8564_ $$s2517150$$uhttps://zaguan.unizar.es/record/145414/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000145414 909CO $$ooai:zaguan.unizar.es:145414$$particulos$$pdriver
000145414 951__ $$a2024-10-24-12:12:33
000145414 980__ $$aARTICLE