Spam detection in email body using hybrid of artificial neural network and evolutionary algorithms

Bibliographic Details
Main Author: Ali Albshayreh, Ali Otman
Format: Thesis
Language: English
Published: 2015
Online Access: http://eprints.utm.my/id/eprint/53536/25/AliOtmanAliMFC2015.pdf
Description
Summary: Spam detection is a significant problem that many researchers have addressed with a variety of strategies. Building a single model that categorizes the wide range of spam is difficult because spam types are constantly changing. Low accuracy and a high false positive rate are the substantial problems in spam detection. A global optimization algorithm is an appropriate way to address them, because it can generate new candidate solutions and avoid becoming trapped in local optima. In this study, a hybrid machine learning approach that combines an Artificial Neural Network (ANN) with Differential Evolution (DE) is designed to detect spam effectively. ANN-DE with a Genetic Algorithm (GA) is compared against ANN-DE with the InfoGain algorithm, where GA and InfoGain serve as feature selection algorithms, to show which approach performs best in spam detection. The Spambase dataset of 4601 e-mails, of which 1813 were spam (39.40%) and 2788 were non-spam (60.60%), was used to train and test these algorithms. Performance was evaluated using false positives, false negatives, accuracy, precision, and recall. The experimental results show that the proposed hybrid of ANN-DE and GA gives the better result, with 93.81% accuracy compared with 93.28% for ANN-DE and InfoGain. The results suggest that the proposed ANN-DE with GA is promising, and the study provides a new method for training an ANN for spam detection in practice.