COMPARISON OF MACHINE LEARNING TECHNIQUES IN SPAM E-MAIL CLASSIFICATION

Samed Jukic; Jasmin Azemovic; Dino Keco; Jasmin Kevric

doi:10.21533/scjournal.v4i1.88

Abstract

E-mail remains a popular and efficient communication tool. Because it is often misused, however, managing e-mail is an important problem for organizations and individuals. Spam is one example of such misuse: unsolicited bulk e-mail that recipients did not request. This paper compares machine learning techniques for classifying spam e-mails. Random Forest (RF), the C4.5 decision tree, and an Artificial Neural Network (ANN) were tested to determine which method gives the best results. Our results show that RF performed best on the dataset from HP Labs, indicating that ensemble methods may have an edge in spam detection.

Keywords

Random Forest (RF), C4.5 Decision Tree, Artificial Neural Network (ANN), Spam Detection, Ensemble methods
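
The kind of comparison the abstract describes could be sketched as follows. This is only an illustration under stated assumptions, not the authors' experimental setup: it uses scikit-learn, a synthetic stand-in dataset rather than the HP Labs data, and hyperparameters chosen arbitrarily. scikit-learn's decision tree is CART rather than C4.5, so the entropy-based tree here is only an approximation of C4.5. The helper name `compare_classifiers` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, folds=5, seed=0):
    """Return the mean cross-validated accuracy of each classifier family."""
    models = {
        # Ensemble of randomized decision trees.
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        # Assumption: CART with entropy splits stands in for C4.5,
        # which scikit-learn does not implement.
        "C4.5-like tree": DecisionTreeClassifier(
            criterion="entropy", random_state=seed
        ),
        # Small feed-forward network; inputs are standardized first.
        "ANN": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(32,), max_iter=500, random_state=seed
            ),
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic binary data (assumption: NOT the dataset used in the paper).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
scores = compare_classifiers(X, y)

# Print the classifiers from most to least accurate on this toy data.
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Which model ranks first on synthetic data says nothing about the paper's findings; to reproduce the comparison, `X` and `y` would have to come from the actual spam dataset.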

DOI: http://dx.doi.org/10.21533/scjournal.v4i1.88

Digital Object Identifier DOI: 10.21533/scjournal

This work is licensed under a Creative Commons Attribution 4.0 International License

Southeast Europe Journal of Soft Computing
