000145414 001__ 145414
000145414 005__ 20241024135331.0
000145414 0247_ $$2doi$$a10.3390/electronics12132814
000145414 0248_ $$2sideral$$a140129
000145414 037__ $$aART-2023-140129
000145414 041__ $$aeng
000145414 100__ $$aCurtò, J. de
000145414 245__ $$aLLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments
000145414 260__ $$c2023
000145414 5060_ $$aAccess copy available to the general public$$fUnrestricted
000145414 5203_ $$aIn this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We bring forward a new non-stationary bandit model with fluctuating reward distributions and illustrate how LLMs can be employed to guide the choice of bandit amid this variability. Experimental outcomes illustrate the potential of our LLM-informed strategy, demonstrating its adaptability to the fluctuating nature of the bandit problem, while maintaining competitive performance against conventional strategies. This study provides key insights into the capabilities of LLMs in enhancing decision-making processes in dynamic and uncertain scenarios.
000145414 536__ $$9info:eu-repo/grantAgreement/ES/MCIN/AEI/PID2021-122580NB-I00
000145414 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000145414 590__ $$a2.6$$b2023
000145414 591__ $$aCOMPUTER SCIENCE, INFORMATION SYSTEMS$$b115 / 249 = 0.462$$c2023$$dQ2$$eT2
000145414 591__ $$aPHYSICS, APPLIED$$b81 / 179 = 0.453$$c2023$$dQ2$$eT2
000145414 591__ $$aENGINEERING, ELECTRICAL & ELECTRONIC$$b157 / 352 = 0.446$$c2023$$dQ2$$eT2
000145414 592__ $$a0.644$$b2023
000145414 593__ $$aComputer Networks and Communications$$c2023$$dQ2
000145414 593__ $$aControl and Systems Engineering$$c2023$$dQ2
000145414 593__ $$aSignal Processing$$c2023$$dQ2
000145414 593__ $$aHardware and Architecture$$c2023$$dQ2
000145414 593__ $$aElectrical and Electronic Engineering$$c2023$$dQ2
000145414 594__ $$a5.3$$b2023
000145414 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000145414 700__ $$0(orcid)0000-0002-5844-7871$$aZarzà, I. de
000145414 700__ $$aRoig, Gemma
000145414 700__ $$aCano, Juan Carlos
000145414 700__ $$aManzoni, Pietro
000145414 700__ $$aCalafate, Carlos T.
000145414 773__ $$g12, 13 (2023), 2814 [22 pp.]$$pElectronics (Basel)$$tElectronics$$x2079-9292
000145414 8564_ $$s687410$$uhttps://zaguan.unizar.es/record/145414/files/texto_completo.pdf$$yVersión publicada
000145414 8564_ $$s2517150$$uhttps://zaguan.unizar.es/record/145414/files/texto_completo.jpg?subformat=icon$$xicon$$yVersión publicada
000145414 909CO $$ooai:zaguan.unizar.es:145414$$particulos$$pdriver
000145414 951__ $$a2024-10-24-12:12:33
000145414 980__ $$aARTICLE