A Novel Dataset for Early Cardiovascular Risk Detection in School Children Using Machine Learning
Abstract: This study introduces the PROCDEC dataset, a novel collection of 1140 cases with 30 cardiovascular risk factors gathered over a 10-year period from school children in Santa Clara, Cuba. The dataset was curated with input from medical experts in pediatric cardiology, endocrinology, general medicine, and clinical laboratory, ensuring its clinical relevance. We conducted a rigorous performance evaluation of 10 machine learning (ML) algorithms to classify cardiovascular risk into two categories: at risk and not at risk. The models were assessed using a stratified k-fold cross-validation approach to enhance the reliability of the findings. The evaluated models were Bayes Net, Naive Bayes, SMO, K-Nearest Neighbors (KNN), Logistic Regression, AdaBoost, Multilayer Perceptron (MLP), J48, Logistic Model Tree (LMT), and Random Forest (RF). The best-performing classifiers (MLP, LMT, J48, and Logistic Regression) achieved F1-score values exceeding 0.83, indicating strong predictive capability. To improve interpretability, we employed feature selection techniques to rank the most influential risk factors. Key contributors to classification performance included hypertension, hyperreactivity, body mass index (BMI), uric acid, cholesterol, parental hypertension, and sibling dyslipidemia. These findings align with established clinical knowledge and reinforce the potential of ML models for pediatric cardiovascular risk assessment. Unlike previous studies, our research not only evaluates multiple ML techniques but also emphasizes their clinical applicability and interpretability, which are critical for real-world implementation. Future work will focus on validating these models with external datasets and integrating them into decision-support systems for early risk detection.
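The evaluation protocol named in the abstract (stratified k-fold cross-validation scored with F1) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record does not state the value of k, the software used, or the data layout, so the function names, the default `k=10`, and the single-feature threshold classifier below are all hypothetical stand-ins chosen only to show the mechanics.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validated_f1(features, labels, fit, predict, k=10, seed=0):
    """Mean F1 over k stratified folds; k=10 is an assumption, not from the record."""
    scores = []
    for held_out in stratified_k_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        preds = [predict(model, features[i]) for i in held_out]
        scores.append(f1_score([labels[i] for i in held_out], preds))
    return sum(scores) / len(scores)


# Hypothetical toy classifier: threshold halfway between the two class means
# of the first feature. It stands in for any of the ten models in the abstract.
def fit_threshold(train_x, train_y):
    pos = [x[0] for x, y in zip(train_x, train_y) if y == 1]
    neg = [x[0] for x, y in zip(train_x, train_y) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0
```

Any classifier exposing a `fit`/`predict` pair of this shape could be passed to `cross_validated_f1`; stratification matters here because the two risk classes need not be balanced, and an unstratified split could leave a fold with very few positive cases.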
Language: English
DOI: 10.3390/technologies13060222
Year: 2025
Published in: TECHNOLOGIES 13, 6 (2025), 222 [18 pp.]
ISSN: 2227-7080

Funding: info:eu-repo/grantAgreement/ES/DGA/T31-20R
Funding: info:eu-repo/grantAgreement/ES/MCINN/PID2022-136476OB-I00
Type and form: Article (Published version)
Area (Department): Área Ingeniería Telemática (Dpto. Ingeniería Electrón.Com.)

Creative Commons: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.


Exported from SIDERAL (2025-10-17-14:16:52)



 Record created 2025-07-10, last modified 2025-10-17

