Enhancing Early Detection of Heart Disease through Machine Learning: Accuracy, Challenges, and Implications for Healthcare

Shibu, Steny Rafsana Job
Issue Date
MSc in Data Analytics
Dublin Business School
Heart disease is common and is among the leading causes of death globally. Tracking a patient's health parameters makes it possible to detect heart disease earlier. This research focuses on the use of machine learning techniques to predict heart disease at an early stage, using a comprehensive dataset of health parameters. The research follows the CRISP-DM framework, which encompasses the phases from business understanding to model deployment. The dataset used in this study comprises 14 attributes describing patient features related to heart health, including resting blood pressure, chest pain type, maximum heart rate, age, and cholesterol level. The report begins with an initial exploration of the dataset using exploratory data analysis (EDA) to better understand the data. Class imbalance is handled with the SMOTE technique. Five ML models were built to detect heart disease: logistic regression, SVM, KNN, Random Forest, and an ensemble model. The models are evaluated with common performance metrics: accuracy, precision, recall, F1-score, and AUC. All models are tuned with GridSearchCV to maximize their performance. The evaluation finds that the ensemble model is the most effective for predicting heart disease, with an overall accuracy of 87.91%. The model is also reliable at identifying positive cases, with a recall of 93.33%.
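As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision, recall, and F1-score from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the confusion matrix or the results reported in this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp: patients with heart disease correctly flagged (true positives)
    fp: healthy patients incorrectly flagged (false positives)
    fn: patients with heart disease that were missed (false negatives)
    tn: healthy patients correctly cleared (true negatives)
    """
    total = tp + fp + fn + tn
    # Share of all predictions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Share of flagged patients who actually have the disease.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Share of diseased patients the model managed to flag.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a screening setting such as early heart disease detection, recall is often weighted heavily because a false negative means a patient with the disease goes undetected, which is why the abstract reports recall alongside accuracy.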