Diabetes Detection Using Stacking Technique: A Combination of XGBoost, Gradient Boosting, and Meta Model

Authors

  • Aden Rahmat, Universitas Dian Nuswantoro
  • Danang Wahyu Utomo, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.35314/48asdy77

Keywords:

Diabetes, SMOTE-Tomek, XGBoost, Gradient Boosting, Stacking

Abstract

Type 2 diabetes mellitus is a chronic disease whose global prevalence continues to rise, and early detection is needed to reduce serious complications such as kidney failure, neuropathy, and cardiovascular disorders. Numerous studies have developed predictive models using machine learning, but many rely on a single algorithm and handle class imbalance inadequately. This research proposes an ensemble stacking method that integrates Gradient Boosting, XGBoost, and Random Forest, with Random Forest acting as the meta-learner. The dataset, comprising 100,000 patient records, was preprocessed and then balanced using the SMOTE-Tomek approach to correct the skewed class distribution. Stacking is implemented in two phases: the base models generate preliminary predictions, which are then used as input to the meta-model to produce the final prediction. In the evaluation, the stacking model achieves 98% accuracy and an F1-score of 0.98, outperforming the individual models. The main contribution of this study is the application of ensemble stacking to improve prediction accuracy on imbalanced and complex medical data. This methodology has the potential to make clinical decision support systems more accurate and responsive.

 

Published

07-06-2025

Issue

Vol. 10 No. 2 (2025)

Section

Articles

How to Cite

Diabetes Detection Using Stacking Technique: A Combination of XGBoost, Gradient Boosting, and Meta Model. (2025). INOVTEK Polbeng - Seri Informatika, 10(2), 912-921. https://doi.org/10.35314/48asdy77