From Data Imbalance to Precision: SMOTE-Driven Machine Learning for Early Detection of Kidney Disease
DOI:
https://doi.org/10.35314/7jgjmg64
Keywords:
Gradient Boosting, Chronic Kidney, SMOTE, Random Forest
Abstract
Chronic Kidney Disease (CKD) has become a significant global health issue, and its prevalence is rising sharply, particularly in developing countries such as Indonesia. According to the Indonesian Ministry of Health (Kementerian Kesehatan, KEMENKES), the Synthetic Minority Over-sampling Technique (SMOTE) has been widely adopted to address data imbalance. SMOTE generates synthetic samples for the minority class, enhancing the model’s ability to identify high-risk patients. Studies demonstrate SMOTE’s effectiveness, particularly when it is combined with ensemble learning algorithms such as Random Forest and Gradient Boosting. Data collection focused on medical parameters relevant to kidney function, including laboratory test results, diagnostic reports, and clinical observations. The resulting dataset, comprising 400 samples obtained from the Ungaran Regional Hospital and several clinics able to diagnose kidney disease, is used to predict whether a patient has chronic kidney disease. Recent research highlights that SMOTE significantly improves model accuracy, with Random Forest achieving an accuracy of 99.30%. These findings emphasise the importance of data balancing in enhancing diagnostic precision, offering promising avenues for early CKD detection and improved patient outcomes.
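The way SMOTE generates synthetic minority samples can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, the function name `smote_oversample`, its parameters, and the toy feature values are all hypothetical, and a real study would more likely rely on an established library. The sketch shows only the core idea: pick a minority sample, pick one of its k nearest minority neighbours, and place a new point at a random position on the line segment between them.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature data (hypothetical values, not taken from the study's dataset)
majority = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [1.1, 1.3], [0.8, 0.9], [1.0, 0.7]]
minority = [[3.0, 3.2], [3.4, 2.9], [2.8, 3.5]]

# Generate enough synthetic points to match the majority class size
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic points do not leak into the data used to evaluate a classifier such as Random Forest or Gradient Boosting.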
License
Copyright (c) 2025 INOVTEK Polbeng - Seri Informatika

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.