<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
  <controlfield tag="001">168101</controlfield>
  <controlfield tag="005">20260126155509.0</controlfield>
  <datafield tag="024" ind1="7" ind2=" ">
    <subfield code="2">doi</subfield>
    <subfield code="a">10.1007/s10462-025-11432-2</subfield>
  </datafield>
  <datafield tag="024" ind1="8" ind2=" ">
    <subfield code="2">sideral</subfield>
    <subfield code="a">147679</subfield>
  </datafield>
  <datafield tag="037" ind1=" " ind2=" ">
    <subfield code="a">ART-2026-147679</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Mehavilla, Lorena</subfield>
    <subfield code="u">Universidad de Zaragoza</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Evaluating large language models effectiveness for flow-based intrusion detection: a comparative study with ML and DL baselines</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2026</subfield>
  </datafield>
  <datafield tag="520" ind1="3" ind2=" ">
    <subfield code="a">This paper presents the first systematic benchmark evaluating Large Language Models (LLMs), specifically GPT-2, GPT-Neo-125M, and LLaMA-3.2-1B, as standalone classifiers for intrusion detection, covering both binary and multiclass classification tasks, using structured Zeek logs derived from the CIC IoT 2023 dataset. We compare their performance against established and widely used Machine Learning (XGBoost, Random Forest, Decision Tree) and Deep Learning models (MLP, GRU, LeNet-5) across key evaluation metrics: detection effectiveness (precision, recall and F1-score), inference speed, and resource consumption. All models are consistently trained and rigorously evaluated on the CIC IoT 2023 dataset, ensuring fair, reproducible, and transparent comparisons. Our findings indicate that while LLMs achieve strong F1-scores exceeding 95%, and do not fully utilize available GPU resources, they still do not outperform top-performing ML models. Notably, XGBoost achieves a higher F1-score of 96.96%, using only 4% of the available CPU. These results emphasize the practical trade-offs between detection capability, inference efficiency, and hardware requirements when applying LLMs in flow-based IDS contexts, particularly in resource-constrained environments such as IoT or edge deployments.</subfield>
  </datafield>
  <datafield tag="506" ind1="0" ind2=" ">
    <subfield code="a">Access copy available to the general public</subfield>
    <subfield code="f">Unrestricted</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="9">info:eu-repo/grantAgreement/ES/DGA/T31-20R</subfield>
    <subfield code="9">info:eu-repo/grantAgreement/ES/MCINN/PID2022-136476OB-I00</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="9">info:eu-repo/semantics/openAccess</subfield>
    <subfield code="a">by</subfield>
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/deed.es</subfield>
  </datafield>
  <datafield tag="655" ind1=" " ind2="4">
    <subfield code="a">info:eu-repo/semantics/article</subfield>
    <subfield code="v">info:eu-repo/semantics/publishedVersion</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Rodríguez, María</subfield>
    <subfield code="u">Universidad de Zaragoza</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">García, José</subfield>
    <subfield code="u">Universidad de Zaragoza</subfield>
    <subfield code="0">(orcid)0000-0001-9485-7678</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Alesanco, Álvaro</subfield>
    <subfield code="u">Universidad de Zaragoza</subfield>
    <subfield code="0">(orcid)0000-0002-5254-1402</subfield>
  </datafield>
  <datafield tag="710" ind1="2" ind2=" ">
    <subfield code="1">5008</subfield>
    <subfield code="2">560</subfield>
    <subfield code="a">Universidad de Zaragoza</subfield>
    <subfield code="b">Dpto. Ingeniería Electrón.Com.</subfield>
    <subfield code="c">Área Ingeniería Telemática</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="g">59, 2 (2026), [38 pp.]</subfield>
    <subfield code="p">Artif. intell. rev.</subfield>
    <subfield code="t">ARTIFICIAL INTELLIGENCE REVIEW</subfield>
    <subfield code="x">0269-2821</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2958161</subfield>
    <subfield code="u">http://zaguan.unizar.es/record/168101/files/texto_completo.pdf</subfield>
    <subfield code="y">Versión publicada</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1265242</subfield>
    <subfield code="u">http://zaguan.unizar.es/record/168101/files/texto_completo.jpg?subformat=icon</subfield>
    <subfield code="x">icon</subfield>
    <subfield code="y">Versión publicada</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:zaguan.unizar.es:168101</subfield>
    <subfield code="p">articulos</subfield>
    <subfield code="p">driver</subfield>
  </datafield>
  <datafield tag="951" ind1=" " ind2=" ">
    <subfield code="a">2026-01-26-14:50:32</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">ARTICLE</subfield>
  </datafield>
</record>
</collection>