Diabetes Detection Using Stacking Technique: A Combination of XGBoost, Gradient Boosting, and Meta Model
DOI:
https://doi.org/10.35314/48asdy77
Keywords:
Diabetes, SMOTE-Tomek, XGBoost, Gradient Boosting, Stacking
Abstract
Type 2 diabetes mellitus is a chronic disease and a growing global health problem, and early detection is needed to mitigate serious complications such as kidney failure, neuropathy, and cardiovascular disorders. Numerous studies have developed predictive models using machine learning techniques, but many are limited by their reliance on a single algorithm and by inadequate handling of class imbalance. This research introduces a strategy based on ensemble stacking that integrates Gradient Boosting, XGBoost, and Random Forest, with Random Forest acting as the meta-learner. The dataset, comprising 100,000 patient records, was preprocessed and then balanced using the SMOTE-Tomek approach to correct disparities in class distribution. Stacking is implemented in two phases: the base models generate preliminary predictions, and these predictions are then used as input to the meta-model, which produces the final outcome. In the evaluation, the stacking model records 98% accuracy and an F1-score of 0.98, outperforming the individual models. The key distinction of this study lies in the effective application of ensemble stacking to improve prediction accuracy, especially on imbalanced and complex medical data. This methodology has the potential to make clinical decision support systems more accurate and responsive.
License
Copyright (c) 2025 INOVTEK Polbeng - Seri Informatika

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.