Islamic web pages filtering and categorization

The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of that information is neither genuine nor correct. Because of that, some practitioners of deviant teachings can take this opportunity to attract followers just u...

Bibliographic Details
Main Author: Mohd. Zamry, Nurfazrina
Format: Thesis
Language: English
Published: 2013
Subjects: TK5015.888 Web sites
Online Access: http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf
id my-utm-ep.35863
record_format uketd_dc
spelling my-utm-ep.35863 2017-07-17T05:05:56Z Islamic web pages filtering and categorization 2013-06 Mohd. Zamry, Nurfazrina TK5015.888 Web sites
2013-06 Thesis http://eprints.utm.my/id/eprint/35863/ http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:70009?site_name=Restricted Repository masters Universiti Teknologi Malaysia, Faculty of Computing Faculty of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic TK5015.888 Web sites
description The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of that information is neither genuine nor correct. Practitioners of deviant teachings can exploit this to attract followers online, in particular to distort the beliefs of Muslims in Malaysia. Web filtering can protect against inappropriate content and prevent misuse of the network; it can therefore be used to filter the content of suspicious websites and limit their dissemination. Currently, deviant teaching websites are blocked manually. In addition, few web filtering products can filter religious content, and very few support the Malay language. This project aims to classify websites into three categories: deviant, suspicious and clean. The web filtering process consists of pre-processing, feature selection and classification. Pre-processing involves three steps, HTML parsing, stemming and stop-word removal, which produce the deviant teaching keywords. Three existing term weighting schemes, namely TF, TFIDF and Modified Entropy, are used for feature selection, while a Support Vector Machine (SVM) is used for classification. Classification is evaluated by accuracy, precision, recall and F1. A total of 300 web pages were collected from the Internet across the three categories: deviant teaching, suspicious and clean. The results show that Modified Entropy (M.Entropy) is a more suitable term weighting scheme for Islamic web page filtering than the other schemes compared.
format Thesis
qualification_level Master's degree
author Mohd. Zamry, Nurfazrina
title Islamic web pages filtering and categorization
granting_institution Universiti Teknologi Malaysia, Faculty of Computing
granting_department Faculty of Computing
publishDate 2013
url http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf
_version_ 1747816374189686784
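
The description field above outlines a pipeline of HTML parsing, stop-word removal, term weighting and per-class evaluation. The following is a minimal sketch of those steps using only the Python standard library, not the thesis's implementation. The stop-word list, the function names and the particular TF-IDF formula (tf * log(N / df)) are assumptions made for illustration; stemming, the Modified Entropy weighting and the SVM classifier are left out because this record does not specify them.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; the list used in the thesis is not given in this record.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def preprocess(html):
    """HTML parsing, lowercasing, tokenizing and stop-word removal (no stemming)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    tokens = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(doc).items()}
        for doc in docs
    ]


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In a fuller pipeline, the weighted term dictionaries would be converted to feature vectors and passed to an SVM, and `precision_recall_f1` would be called once for each of the three categories (deviant, suspicious, clean).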