Comparison of Different Machine Learning Algorithms for Breast Cancer Recurrence Classification

M Haskul, Emine Yaman

Abstract


In this paper we compare several machine learning algorithms for predicting breast cancer recurrence and determine which model gives the best prediction accuracy. We used a dataset donated by the University Medical Centre, Institute of Oncology, Ljubljana, Slovenia. The preprocessed dataset contains 286 instances, each with 9 attributes and 1 class attribute. First, we applied attribute evaluation to see which attributes have the greatest effect on the class attribute. Second, we explored three algorithms: C4.5, Random Forest, and K-Nearest Neighbor. Several data mining tools were applied with these three algorithms to determine which model is the most accurate. Finally, we found that the C4.5 algorithm performs best on our breast cancer recurrence dataset.
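The attribute evaluation step can be illustrated with a minimal sketch. The abstract does not say which evaluator was used, so information gain is assumed here only as one common choice for nominal attributes. The rows, the attribute names (`node_caps`, `breast`), and the helper functions are hypothetical and are not taken from the Ljubljana dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in class entropy after splitting rows on one nominal attribute."""
    base = entropy([row[target] for row in rows])
    # Group the class labels by the value each row takes for this attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records for illustration only (not real patient data).
toy_rows = [
    {"node_caps": "yes", "breast": "left", "recurrence": "yes"},
    {"node_caps": "yes", "breast": "right", "recurrence": "yes"},
    {"node_caps": "no", "breast": "left", "recurrence": "no"},
    {"node_caps": "no", "breast": "right", "recurrence": "no"},
]

# Rank attributes from most to least informative about the class attribute.
ranking = sorted(
    ["node_caps", "breast"],
    key=lambda name: information_gain(toy_rows, name, "recurrence"),
    reverse=True,
)
print(ranking)
```

In this toy data, `node_caps` separates the two classes perfectly while `breast` carries no information about the class, so `node_caps` is ranked first. A real evaluation would compute the same score for each of the 9 attributes over all 286 instances.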

Keywords


data mining; breast cancer; classification; machine learning; Weka; C4.5; random forest; KNN


DOI: http://dx.doi.org/10.21533/scjournal.v8i2.179


Copyright (c) 2019 M Haskul

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License