Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani
A text classifier model optimized for short snippets like tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, since they are the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a...
Main Author: | Abdullah Sani, Aidil Amirul Safwan |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2020 |
Subjects: | Social groups. Group dynamics; Twitter; Communication. Mass media |
Online Access: | https://ir.uitm.edu.my/id/eprint/31488/1/31488.pdf |
id |
my-uitm-ir.31488 |
---|---|
record_format |
uketd_dc |
spelling |
my-uitm-ir.31488 2020-06-26T04:18:05Z Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani 2020 Abdullah Sani, Aidil Amirul Safwan Social groups. Group dynamics Twitter Communication. Mass media A text classifier model optimized for short snippets such as tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, since they are the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a large multi-domain dataset pre-labelled with “0” and “1”, which represent “positive” and “negative” respectively. The Naïve Bayes machine learning (ML) technique is used as the core of the classifier model. All data are pre-processed, and once the classifier model has been developed, it is run on real-time data, namely public tweets from 2018 that directly or indirectly mention the three biggest communication service providers (CSPs) in Malaysia: Celcom, Maxis and Digi. The results of the analysis are incorporated into a web application built with Bootstrap on top of Python’s Flask, allowing interactive data visualization. Agile methodology is used throughout the development of the application to ensure that the project follows the guidelines prepared in the design phase. Functionality testing is also carried out to ensure that there are no significant errors that would render the application unusable. In conclusion, the findings show that Naïve Bayes is fairly suitable for natural language processing (NLP) problems. Future work could improve the corpus to include different Bahasa Malaysia slang and commonly used short forms, and add an extra class to represent texts that are neither “positive” nor “negative”. 
2020 Thesis https://ir.uitm.edu.my/id/eprint/31488/ https://ir.uitm.edu.my/id/eprint/31488/1/31488.pdf text en public degree Universiti Teknologi MARA, Cawangan Melaka Faculty of Computer and Mathematical Sciences Abu Samah, Khyrina Airin Fariza |
institution |
Universiti Teknologi MARA |
collection |
UiTM Institutional Repository |
language |
English |
advisor |
Abu Samah, Khyrina Airin Fariza |
topic |
Social groups. Group dynamics |
description |
A text classifier model optimized for short snippets such as tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, since they are the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a large multi-domain dataset pre-labelled with “0” and “1”, which represent “positive” and “negative” respectively. The Naïve Bayes machine learning (ML) technique is used as the core of the classifier model. All data are pre-processed, and once the classifier model has been developed, it is run on real-time data, namely public tweets from 2018 that directly or indirectly mention the three biggest communication service providers (CSPs) in Malaysia: Celcom, Maxis and Digi. The results of the analysis are incorporated into a web application built with Bootstrap on top of Python’s Flask, allowing interactive data visualization. Agile methodology is used throughout the development of the application to ensure that the project follows the guidelines prepared in the design phase. Functionality testing is also carried out to ensure that there are no significant errors that would render the application unusable. In conclusion, the findings show that Naïve Bayes is fairly suitable for natural language processing (NLP) problems. Future work could improve the corpus to include different Bahasa Malaysia slang and commonly used short forms, and add an extra class to represent texts that are neither “positive” nor “negative”. |
format |
Thesis |
qualification_level |
Bachelor degree |
author |
Abdullah Sani, Aidil Amirul Safwan |
title |
Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani |
granting_institution |
Universiti Teknologi MARA, Cawangan Melaka |
granting_department |
Faculty of Computer and Mathematical Sciences |
publishDate |
2020 |
url |
https://ir.uitm.edu.my/id/eprint/31488/1/31488.pdf |
_version_ |
1783734122246569984 |
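
The description in this record outlines a Naïve Bayes classifier for short bilingual texts labelled “0” (positive) and “1” (negative). The sketch below is only an illustration of that general technique, not the thesis author's implementation: it is a minimal multinomial Naïve Bayes with add-one smoothing written against the Python standard library. The `tokenize` function, the `NaiveBayes` class, and the four training sentences are hypothetical and were made up for this example; the thesis's actual dataset, pre-processing steps, and features are not given in the record.

```python
import math
import re
from collections import Counter, defaultdict

# Label convention taken from the record's description: "0" = positive, "1" = negative.
POSITIVE, NEGATIVE = "0", "1"


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and split it into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical, hand-written training samples mixing Bahasa Malaysia and English.
train = [
    ("internet laju gila hari ni, terbaik", POSITIVE),
    ("great coverage and fast speed", POSITIVE),
    ("line slow teruk, sangat kecewa", NEGATIVE),
    ("worst service ever, no signal again", NEGATIVE),
]
model = NaiveBayes().fit(*zip(*train))

print(model.predict("@provider speed terbaik"))
print(model.predict("no signal, teruk"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and add-one smoothing keeps a word seen in only one class from forcing the other class's probability to zero. A realistic system would need a far larger labelled corpus than this toy sample.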