Heart Attack Prediction Model Based on Feature Selection and Decision Tree Approaches

Hussein Abdullah Jaber, Mortada Sadoun Thabet, Rabab Abdul Hussein Fahd, Dalal Khatib Muhbis, Alaa Khalaf Hamoud

Abstract


The purpose of this study is to create a machine-learning-based model for predicting heart attacks, improving the capacity to anticipate the occurrence of this dangerous medical condition. Using the decision tree as a tool for medical data analysis makes it feasible to find significant, interrelated variables that may contribute to heart attacks. The system analyzes clinical data using artificial intelligence techniques to find patterns that might suggest the possibility of a heart attack. The advantage is early disease detection and prediction, which allows medical staff to better plan treatment and take preventative action. This kind of system can aid in enhancing patient care and lowering the likelihood of heart attacks. Throughout the study, two paths are examined: the first applies machine learning algorithms without feature selection, and the second applies them after a feature selection process. Three main feature selection algorithms are examined to find the features most strongly correlated with heart attack. The model examines six decision tree algorithms (Decision Stump, Hoeffding Tree, J48, LMT, Random Forest, and REPTree) to find the most accurate one for prediction. The results show that LMT achieves the highest prediction accuracy, at 82.5%.
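
The feature-selection path described in the abstract can be illustrated with a minimal sketch. It is not the authors' Weka workflow: the abstract does not name the three feature selection algorithms, so information gain is assumed here as one common choice, and a decision stump stands in for the six tree learners. The records, the feature names (chest_pain, smoker, sex), and the labels are hypothetical toy data, and the accuracy is measured on the training records rather than by cross-validation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def split_by(rows, labels, feature):
    """Group the labels by the value each record has for one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    return groups

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = split_by(rows, labels, feature)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(rows, labels):
    """Feature names ordered from most to least informative."""
    return sorted(rows[0], key=lambda f: information_gain(rows, labels, f), reverse=True)

def train_stump(rows, labels, feature):
    """Decision stump: predict the majority class for each value of one feature."""
    default = Counter(labels).most_common(1)[0][0]
    table = {value: Counter(group).most_common(1)[0][0]
             for value, group in split_by(rows, labels, feature).items()}
    return lambda row: table.get(row[feature], default)

def accuracy(model, rows, labels):
    """Fraction of records the model classifies correctly."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical toy records (assumption: not taken from the study's dataset).
rows = [
    {"chest_pain": "yes", "smoker": "yes", "sex": "m"},
    {"chest_pain": "yes", "smoker": "no",  "sex": "f"},
    {"chest_pain": "yes", "smoker": "yes", "sex": "f"},
    {"chest_pain": "yes", "smoker": "no",  "sex": "m"},
    {"chest_pain": "no",  "smoker": "yes", "sex": "m"},
    {"chest_pain": "no",  "smoker": "no",  "sex": "f"},
    {"chest_pain": "no",  "smoker": "yes", "sex": "f"},
    {"chest_pain": "no",  "smoker": "no",  "sex": "m"},
]
labels = [1, 1, 1, 0, 0, 0, 0, 0]  # 1 = heart attack, 0 = no heart attack

ranked = rank_features(rows, labels)          # feature selection step
stump = train_stump(rows, labels, ranked[0])  # tree built on the top-ranked feature
print(ranked, accuracy(stump, rows, labels))
```

In a fuller comparison, the same learner would be evaluated twice, once on all features and once on the selected subset, with held-out data or cross-validation deciding which path performs better.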

Keywords


Heart Attack; Feature Selection; Machine Learning; Decision Tree; LMT; Weka.

DOI: http://dx.doi.org/10.21533/scjournal.v13i1.278


Copyright (c) 2024 Hussein Abdullah Jaber, Mortada Sadoun Thabet, Rabab Abdul Hussein Fahd, Dalal Khatib Muhbis, Alaa Khalaf Hamoud

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License