Stacked Ensemble Model with SMOTE for Heart Disease Prediction
Keywords:
Heart Disease Prediction, Stacked Ensemble, SMOTE, Machine Learning, Medical Diagnostics, Class Imbalance, Logistic Regression, Random Forest, Decision Tree
Abstract
Heart disease remains a leading cause of death worldwide, and early detection is critical to improving patient outcomes. Traditional machine learning models for heart disease prediction often suffer from class imbalance and limited generalization. This study aims to develop a robust, interpretable, and computationally efficient predictive model that addresses these limitations. The proposed approach combines the Synthetic Minority Over-sampling Technique (SMOTE) with a stacked ensemble that uses Decision Tree and Random Forest as base learners and Logistic Regression as the meta-learner. A standardized preprocessing pipeline of median imputation, Min-Max normalization, and one-hot encoding was applied to a clinical dataset of 1,025 patient records sourced from the Kaggle UCI repository. SMOTE was used to balance the classes by oversampling the minority class, which represents heart disease cases. The model was evaluated with 5-fold stratified cross-validation. The stacked ensemble achieved an accuracy of 98.2%, precision of 1.00, recall of 0.96, F1-score of 0.98, and AUC-ROC of 0.99, significantly outperforming standalone models and recent ensemble-based methods. The implementation required minimal computational resources and ran efficiently on CPU-only systems, making it suitable for real-time clinical applications. The study demonstrates that combining data-level balancing with model-level stacking significantly improves diagnostic accuracy, particularly on class-imbalanced medical datasets. Future work will explore explainable AI integration, validation on diverse clinical populations, and deployment on resource-constrained edge devices.
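To illustrate the data-level balancing step named in the abstract, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the study's implementation, which the abstract does not describe. The function name `smote_sketch`, its parameters, and the toy feature rows are all hypothetical. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority-class rows.

    Hypothetical helper for illustration; not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority row as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new row at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Assumed toy data: two Min-Max-scaled features, 8 majority and 3 minority rows.
majority = [[0.1 * i, 0.9 - 0.1 * i] for i in range(8)]
minority = [[0.80, 0.20], [0.90, 0.35], [0.70, 0.30]]

# Generate just enough synthetic rows to match the majority class size.
new_rows = smote_sketch(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

When oversampling is combined with cross-validation, a common precaution is to apply it to the training folds only, so that synthetic rows derived from held-out patients do not leak into evaluation. The abstract does not state how the study handled this.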
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt the material, but only for non-commercial purposes. You must give appropriate credit to the author(s).

