A comparative evaluation of machine learning approaches in SMS spam detection

Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy, which takes false positives and false negatives into account. These metrics were evaluated by applying three...

Bibliographic Details
Main Author: Salehi, Saber
Format: Thesis
Language: English
Published: 2011
Subjects: HD Industries. Land use. Labor
Online Access: http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf
id my-utm-ep.32801
record_format uketd_dc
spelling my-utm-ep.328012018-05-27T07:54:55Z A comparative evaluation of machine learning approaches in SMS spam detection 2011-07 Salehi, Saber HD Industries. Land use. Labor
2011-07 Thesis http://eprints.utm.my/id/eprint/32801/ http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf application/pdf en public masters Universiti Teknologi Malaysia, Faculty of Computer Science and Information System Faculty of Computer Science and Information System
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic HD Industries
Land use
Labor
description Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy, which takes false positives and false negatives into account. These metrics were used to evaluate and compare three supervised learning algorithms that classify SMS messages by their content: a hybrid of Simple Artificial Immune System (SAIS) and Particle Swarm Optimization (PSO), a Naive Bayes Classifier (NBC), and an Enhanced Genetic Algorithm (EGA). SAIS was hybridized with PSO to optimize its performance for spam filtering. PSO was used with mutation to reinforce the immune system's search for the best class in exemplar for classification, and results improved with the hybrid of SAIS and PSO. The proposed EGA searches for the best chromosomes, which are grouped by keywords; the chromosome with the highest fitness value is then selected as the classifier. Simulated annealing (SA) was used alongside classical mutation and crossover to reinforce the efficiency of the genetic search. The results indicate that the enhanced GA is markedly superior to a classical GA. The algorithms were trained and tested on a set of 4601 SMS messages, of which 1813 were spam and 2788 were non-spam. The proposed EGA performed better than both the NBC and the hybrid of SAIS and PSO: the EGA achieved 99.87% accuracy, while the NBC and the hybrid of SAIS and PSO achieved 97.457% and 88.33% accuracy, respectively.
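The metrics named in the description (accuracy, false positives, false negatives) can be illustrated with a minimal Python sketch. The class and function names below are hypothetical and do not come from the thesis; only the class sizes (1813 spam, 2788 non-spam) are taken from this record. The "label everything as non-spam" baseline is an illustration added here to give the reported accuracies a point of comparison, not a result from the thesis.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary spam filter (hypothetical helper, not from the thesis)."""
    tp: int  # spam correctly flagged as spam
    fp: int  # non-spam wrongly flagged as spam (false positive)
    tn: int  # non-spam correctly passed through
    fn: int  # spam wrongly passed through (false negative)

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all messages classified correctly."""
    return (c.tp + c.tn) / c.total


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of non-spam messages wrongly flagged as spam."""
    return c.fp / (c.fp + c.tn)


def false_negative_rate(c: ConfusionCounts) -> float:
    """Fraction of spam messages that were missed."""
    return c.fn / (c.fn + c.tp)


# Class sizes as stated in the description.
SPAM, NON_SPAM = 1813, 2788

# Assumed trivial baseline: every message is labelled non-spam.
baseline = ConfusionCounts(tp=0, fp=0, tn=NON_SPAM, fn=SPAM)

print(f"baseline accuracy: {accuracy(baseline):.4f}")
print(f"baseline false positive rate: {false_positive_rate(baseline):.4f}")
print(f"baseline false negative rate: {false_negative_rate(baseline):.4f}")
```

Under this assumed baseline, accuracy equals the non-spam share of the data set (2788 of 4601 messages), which is why accuracy alone can hide a filter that misses every spam message and why the false positive and false negative counts are reported alongside it.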
format Thesis
qualification_level Master's degree
author Salehi, Saber
title A comparative evaluation of machine learning approaches in SMS spam detection
granting_institution Universiti Teknologi Malaysia, Faculty of Computer Science and Information System
granting_department Faculty of Computer Science and Information System
publishDate 2011
url http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf
_version_ 1747816063771344896