LLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments

Curtò, J. de; Cano, Juan Carlos; Zarzà, I. de; Manzoni, Pietro; Roig, Gemma; Calafate, Carlos T.

doi:10.3390/electronics12132814

Curtò, J. de ; Zarzà, I. de ; Roig, Gemma ; Cano, Juan Carlos ; Manzoni, Pietro ; Calafate, Carlos T.

Abstract: In this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We bring forward a new non-stationary bandit model with fluctuating reward distributions and illustrate how LLMs can be employed to guide the choice of bandit amid this variability. Experimental outcomes illustrate the potential of our LLM-informed strategy, demonstrating its adaptability to the fluctuating nature of the bandit problem, while maintaining competitive performance against conventional strategies. This study provides key insights into the capabilities of LLMs in enhancing decision-making processes in dynamic and uncertain scenarios.
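As a rough illustration of the setting the abstract describes, the sketch below runs the two conventional baselines it names, epsilon-greedy and UCB, on a toy bandit whose arm means drift over time. It is not the model or the code from the paper: the `DriftingBandit` class, the Gaussian random-walk drift, the `run` helper, and every parameter value are assumptions chosen only to make the example self-contained. Both policies use sample-average value estimates, which weight old rewards as heavily as recent ones and are one reason such baselines can lag when rewards fluctuate.

```python
import math
import random


class DriftingBandit:
    """Hypothetical non-stationary bandit: Gaussian arms with random-walk means."""

    def __init__(self, n_arms, drift=0.05, noise=1.0, seed=0):
        self.rng = random.Random(seed)
        self.means = [self.rng.gauss(0.0, 1.0) for _ in range(n_arms)]
        self.drift = drift
        self.noise = noise

    def pull(self, arm):
        reward = self.rng.gauss(self.means[arm], self.noise)
        # Non-stationarity (assumed form): every mean takes a small step per pull.
        self.means = [m + self.rng.gauss(0.0, self.drift) for m in self.means]
        return reward


def epsilon_greedy(counts, values, t, rng, epsilon=0.1):
    """Explore uniformly with probability epsilon, otherwise pick the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: values[a])


def ucb1(counts, values, t, rng, c=2.0):
    """UCB1: try each arm once, then maximise estimate plus confidence bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(c * math.log(t) / counts[a]),
    )


def run(policy, n_arms=5, steps=1000, seed=0):
    """Play `steps` rounds and return (total reward, pull counts per arm)."""
    env = DriftingBandit(n_arms, seed=seed)
    rng = random.Random(seed + 1)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    total = 0.0
    for t in range(1, steps + 1):
        arm = policy(counts, values, t, rng)
        reward = env.pull(arm)
        counts[arm] += 1
        # Incremental sample average of the rewards observed for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return total, counts


if __name__ == "__main__":
    for name, policy in [("epsilon-greedy", epsilon_greedy), ("UCB1", ucb1)]:
        total, counts = run(policy)
        print(name, round(total, 2), counts)
```

An LLM-informed strategy of the kind the abstract proposes would, under these assumptions, be a third `policy` function with the same signature that decides between exploring and exploiting from the current counts and estimates; how the paper actually queries the LLM is not stated in this record.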
Language: English
DOI: 10.3390/electronics12132814
Year: 2023
Published in: Electronics 12, 13 (2023), 2814 [22 pp.]
ISSN: 2079-9292
JCR impact factor: 2.6 (2023)
JCR category: COMPUTER SCIENCE, INFORMATION SYSTEMS rank: 115 / 250 = 0.46 (2023) - Q2 - T2
JCR category: PHYSICS, APPLIED rank: 81 / 179 = 0.453 (2023) - Q2 - T2
JCR category: ENGINEERING, ELECTRICAL & ELECTRONIC rank: 157 / 353 = 0.445 (2023) - Q2 - T2
CiteScore impact factor: 5.3 - Electrical and Electronic Engineering (Q2) - Hardware and Architecture (Q2) - Signal Processing (Q2) - Computer Networks and Communications (Q2) - Control and Systems Engineering (Q2)

SCImago impact factor: 0.644 - Computer Networks and Communications (Q2) - Control and Systems Engineering (Q2) - Signal Processing (Q2) - Hardware and Architecture (Q2) - Electrical and Electronic Engineering (Q2)

Funding: info:eu-repo/grantAgreement/ES/MCINN/AEI/PID2021-122580NB-I00
Type and form: Article (final version)

You must give appropriate credit, provide a link to the license, and indicate whether changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

Exported from SIDERAL (2025-03-07-09:38:56)

Record created on 2024-10-24, last modified on 2025-03-07