Islamic web pages filtering and categorization
The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of that information is neither genuine nor correct. As a result, some practitioners of deviant teachings can take this opportunity to attract followers just u...
Main Author: | Mohd. Zamry, Nurfazrina |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2013 |
Subjects: | TK5015.888 Web sites |
Online Access: | http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf |
id | my-utm-ep.35863 |
---|---|
record_format | uketd_dc |
spelling | my-utm-ep.35863 2017-07-17T05:05:56Z Islamic web pages filtering and categorization 2013-06 Mohd. Zamry, Nurfazrina TK5015.888 Web sites 2013-06 Thesis http://eprints.utm.my/id/eprint/35863/ http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:70009?site_name=Restricted Repository masters Universiti Teknologi Malaysia, Faculty of Computing Faculty of Computing |
institution | Universiti Teknologi Malaysia |
collection | UTM Institutional Repository |
language | English |
topic | TK5015.888 Web sites |
spellingShingle | TK5015.888 Web sites Mohd. Zamry, Nurfazrina Islamic web pages filtering and categorization |
description | The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of that information is neither genuine nor correct. As a result, some practitioners of deviant teachings can take this opportunity to attract followers using only the Internet, especially to distort the beliefs of Muslims in Malaysia. Web filtering can protect against inappropriate content and prevent misuse of the network; hence, it can be used to filter the content of suspicious websites and limit their dissemination. Currently, deviant teaching websites are blocked manually; in addition, few web filtering products filter religious content, and very few support the Malay language. This project aims to classify websites into three categories: deviant, suspicious and clean. The web filtering process involves pre-processing, feature selection and classification. Pre-processing involves three steps, HTML parsing, stemming and stop-word removal, which produce the deviant teaching keywords. Three existing term weighting schemes, namely TF, TFIDF and Modified Entropy (M.Entropy), are used for feature selection in filtering deviant teaching websites, while a Support Vector Machine (SVM) is used for classification. Classification is validated by accuracy, precision, recall and F1. A total of 300 web pages were collected from the Internet across the three categories: deviant teaching, suspicious and clean. The results show that M.Entropy is the most suitable term weighting scheme for Islamic web page filtering compared with the other two schemes. |
format | Thesis |
qualification_level | Master's degree |
author | Mohd. Zamry, Nurfazrina |
author_facet | Mohd. Zamry, Nurfazrina |
author_sort | Mohd. Zamry, Nurfazrina |
title | Islamic web pages filtering and categorization |
title_short | Islamic web pages filtering and categorization |
title_full | Islamic web pages filtering and categorization |
title_fullStr | Islamic web pages filtering and categorization |
title_full_unstemmed | Islamic web pages filtering and categorization |
title_sort | islamic web pages filtering and categorization |
granting_institution | Universiti Teknologi Malaysia, Faculty of Computing |
granting_department | Faculty of Computing |
publishDate | 2013 |
url | http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf |
_version_ | 1747816374189686784 |
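
The abstract in this record describes a pipeline of HTML parsing, stop-word removal, term weighting and evaluation by precision, recall and F1. The sketch below illustrates those steps in plain Python under stated assumptions; it is not the thesis's implementation. The stop-word list, the class and function names, and the choice of idf = log(N / df) are all hypothetical. Stemming, the Modified Entropy scheme and the SVM classifier are omitted because the record does not give their details.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; a real system would use a full Malay/English list.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def preprocess(html):
    """HTML parsing and stop-word removal (stemming is omitted in this sketch)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Replace every non-alphanumeric character with a space, then split.
    tokens = "".join(c if c.isalnum() else " " for c in text).split()
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequency(tokens):
    """TF: raw count of each term in one document."""
    return Counter(tokens)


def tf_idf(docs_tokens):
    """TF-IDF with idf = log(N / df); returns one weight dict per document."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in docs_tokens:
        tf = term_frequency(tokens)
        weights.append(
            {t: c * math.log(n_docs / doc_freq[t]) for t, c in tf.items()}
        )
    return weights


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In a fuller pipeline, the per-document weight dictionaries would be turned into fixed-length feature vectors and passed to a classifier such as an SVM, with the metric function applied once per category (deviant, suspicious, clean).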