Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani

Bibliographic Details
Main Author:	Abdullah Sani, Aidil Amirul Safwan
Format:	Thesis
Language:	English
Published:	2020
Subjects:	Social groups Group dynamics Twitter
Online Access:	https://ir.uitm.edu.my/id/eprint/31488/1/31488.pdf

Description
Summary:	A text classifier model optimized for short snippets such as tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a large multi-domain dataset pre-labelled with “0” and “1”, which represent “positive” and “negative” respectively. The Naïve Bayes machine learning (ML) technique is used as the core of the classifier model. All data are pre-processed, and once the classifier model has been developed, it is run on real-time data: public tweets from 2018 that directly or indirectly mention the three biggest communication service providers (CSPs) in Malaysia, namely Celcom, Maxis and Digi. The results of the analysis are incorporated into a web application built with Bootstrap on top of Python’s Flask, allowing interactive data visualization. Agile methodology is used throughout the development of the application to ensure that the project follows the guidelines prepared in the design phase. Functionality testing is also carried out to ensure that no significant error renders the application unusable. In conclusion, the findings show that Naïve Bayes is fairly suitable for NLP problems. Future work could improve the corpus to include different varieties of Bahasa Malaysia slang and commonly used short forms, and add an extra class for texts that are neither “positive” nor “negative”.
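
As an illustration of the kind of classifier the summary describes, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with Laplace smoothing, written in plain Python. It is not the thesis's implementation: the `tokenize` helper, the `NaiveBayes` class, the smoothing parameter `alpha`, and the four toy training sentences are all hypothetical and chosen only for demonstration. The label convention ("0" for positive, "1" for negative) follows the summary.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes classifier with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (assumed)
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for t in tokens:
                score += math.log(
                    (self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy bilingual examples, invented for this sketch: 0 = positive, 1 = negative.
train_texts = [
    "great coverage and fast internet",
    "servis bagus dan laju",
    "line teruk sangat slow",
    "terrible network always down",
]
train_labels = [0, 0, 1, 1]

model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("internet laju and great")` or `model.predict("network teruk and slow")` then returns the label whose training vocabulary best matches the tweet. A real system would need a far larger labelled corpus and more careful pre-processing than this sketch shows.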
