
International Journal of Diabetes & Metabolic Disorders (IJDMD)

ISSN: 2475-5451 | DOI: 10.33140/IJDMD

Impact Factor: 1.23

Enhancing Diabetes Prediction through a Hybrid Deep Learning and Machine Learning Ensemble Using a Two-Stage Soft Voting

Abstract

Md Ziarul Islam, Zariya Ahmed Udaisa, Mohd Khairul Azmi Bin Hassan, and Amir 'Aatieff Bin Amir Hussin

Objective: This study aims to enhance the accuracy and robustness of diabetes prediction by developing a hybrid ensemble model that integrates both Deep Learning (DL) and Machine Learning (ML) classifiers through a two-stage soft voting mechanism.

Research Methodology: The proposed methodology involves a comprehensive preprocessing pipeline, including label encoding for categorical features and standardization of numerical variables. Three DL architectures, Convolutional Neural Network (CNN), Feedforward Neural Network (FNN), and Ensemble Neural Networks (ENN), are trained independently alongside three ML classifiers: Logistic Regression (LR), Random Forest (RF), and XGBoost. Soft voting is applied separately within the DL and ML groups, and the two group-level predictions are then combined in a final hybrid soft voting ensemble. The system is trained and tested on the benchmark Kaggle diabetes prediction dataset (diabetes_prediction_dataset.csv). The models are evaluated using six performance metrics: accuracy, precision, recall, F1 score, ROC-AUC, and Cohen’s kappa.
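The two-stage soft voting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes equal weights at both stages and a 0.5 decision threshold, neither of which is stated in the abstract. The function names and the toy probabilities are hypothetical, and each inner list stands in for one trained model's predicted probability of diabetes per patient.

```python
from statistics import fmean


def soft_vote(prob_lists):
    """Average positive-class probabilities across models, per sample.

    prob_lists holds one list per model; each list has one probability
    per sample. Equal weighting is an assumption of this sketch.
    """
    return [fmean(sample) for sample in zip(*prob_lists)]


def two_stage_soft_vote(dl_probs, ml_probs, threshold=0.5):
    """Stage 1: vote within each group. Stage 2: vote across the groups."""
    dl_avg = soft_vote(dl_probs)            # DL group (e.g. CNN, FNN, ENN)
    ml_avg = soft_vote(ml_probs)            # ML group (e.g. LR, RF, XGBoost)
    hybrid = soft_vote([dl_avg, ml_avg])    # final hybrid ensemble
    labels = [int(p >= threshold) for p in hybrid]  # threshold is assumed
    return hybrid, labels


# Toy probabilities for three patients (illustrative values only).
dl_probs = [[0.9, 0.2, 0.6], [0.8, 0.1, 0.4], [0.7, 0.3, 0.5]]
ml_probs = [[0.6, 0.4, 0.3], [0.8, 0.2, 0.5], [0.7, 0.0, 0.1]]

hybrid, labels = two_stage_soft_vote(dl_probs, ml_probs)
print(labels)
```

Averaging within each group first gives the DL and ML groups equal influence on the final prediction regardless of how many models each group contains, which is one plausible reason to prefer a layered vote over a single flat average.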

Results: The proposed hybrid ensemble model outperformed all individual and grouped models, achieving an accuracy of 0.9707, an F1 score of 0.9495, a ROC-AUC of 0.9832, and a Cohen’s kappa of 0.9361. Both internal ensemble layers (DL soft voting and ML soft voting) also demonstrated high predictive performance, validating the layered ensemble approach.
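For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, F1 score, and Cohen's kappa follow from a binary confusion matrix. It uses a hypothetical helper name and toy labels, not the study's data, and it omits ROC-AUC because that metric requires predicted probabilities rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute threshold-based metrics for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the marginal label frequencies.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Toy labels for eight patients (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because kappa discounts agreement expected by chance, it is typically lower than accuracy on imbalanced data, which is why it is often reported alongside accuracy for a screening task like this one.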

Conclusion: The dual-stage hybrid soft voting ensemble effectively combines the complementary strengths of DL and ML techniques, offering a highly accurate and generalizable solution for diabetes prediction. These findings suggest that the proposed model is well-suited for integration into intelligent clinical decision support systems.
