Sentiment analysis of domestic violence prediction using Naive Bayes algorithm / Nurulizzah Mohd Rahiman

Bibliographic Details
Main Author: Mohd Rahiman, Nurulizzah
Format: Thesis
Language: English
Published: 2024
Subjects:
Online Access: https://ir.uitm.edu.my/id/eprint/96468/1/96468.pdf
Description
Summary: This research examines the widespread issue of domestic violence and its severe impact on individuals and society worldwide. The surge in domestic violence during the COVID-19 pandemic, highlighted by a UN Women survey, particularly in countries such as Kenya, motivates the research problem. Recognizing the lack of public awareness and understanding of attitudes towards domestic violence, the study proposes using sentiment analysis on Twitter data to monitor public sentiment in real time. The research objectives are to study and apply the Naive Bayes algorithm for sentiment analysis of tweets related to domestic violence, with the aim of providing insights for researchers, government agencies, policymakers, and the public, and to develop a prediction model using the Naive Bayes algorithm and evaluate its performance. The scope is limited to English-language tweets on the topic of domestic violence collected from March 2021 to November 2023. Several Naive Bayes classifiers are compared on accuracy, and parameter tuning is also performed on the classifiers. Resampling is used to handle the imbalanced dataset. The research also compares the VADER and SentiWordNet lexicons to determine which yields the better accuracy. The algorithms are evaluated by comparing accuracy, specificity, and other evaluation metrics. Based on the results, the Bernoulli classifier has the best accuracy at 94%, while the Multinomial classifier has an accuracy of 93%. The best data ratio is 80:20, using the VADER lexicon approach.
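
The comparison described in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy texts and their labels are hypothetical, hand-assigned stand-ins for lexicon-labelled tweets, and the function names (`tokenize`, `train`, `predict`, `evaluate`) are made up for illustration. The sketch assumes the 80:20 ratio refers to a train/test split, and it omits resampling and parameter tuning. It fits a Multinomial and a Bernoulli Naive Bayes classifier with Laplace smoothing, then reports accuracy and specificity for each.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the thesis): (text, label) pairs.
DATA = [
    ("great support for survivors", "pos"),
    ("helpful hotline and strong support", "pos"),
    ("brave survivors inspire hope", "pos"),
    ("hope and help for families", "pos"),
    ("sad rise in abuse cases", "neg"),
    ("terrible abuse reports again", "neg"),
    ("angry about rising cases", "neg"),
    ("sad and terrible news", "neg"),
    ("strong support and hope", "pos"),   # held out for testing
    ("terrible sad cases", "neg"),        # held out for testing
]

def tokenize(text):
    return text.lower().split()

def train(samples, bernoulli=False, alpha=1.0):
    """Collect per-class counts: word frequencies (Multinomial)
    or document frequencies (Bernoulli)."""
    classes = sorted({label for _, label in samples})
    vocab = {w for text, _ in samples for w in tokenize(text)}
    n_docs = Counter(label for _, label in samples)
    counts = {c: Counter() for c in classes}
    for text, label in samples:
        tokens = tokenize(text)
        counts[label].update(set(tokens) if bernoulli else tokens)
    return {"classes": classes, "vocab": vocab, "n_docs": n_docs,
            "counts": counts, "bernoulli": bernoulli, "alpha": alpha}

def predict(model, text):
    """Return the class with the highest log-posterior score."""
    vocab, alpha = model["vocab"], model["alpha"]
    tokens = tokenize(text)
    present = set(tokens)
    total_docs = sum(model["n_docs"].values())
    scores = {}
    for c in model["classes"]:
        counts = model["counts"][c]
        score = math.log(model["n_docs"][c] / total_docs)  # log prior
        if model["bernoulli"]:
            # Every vocabulary word contributes: present or absent.
            for w in vocab:
                p = (counts[w] + alpha) / (model["n_docs"][c] + 2 * alpha)
                score += math.log(p if w in present else 1.0 - p)
        else:
            # Only observed in-vocabulary tokens contribute, once per occurrence.
            denom = sum(counts.values()) + alpha * len(vocab)
            for w in tokens:
                if w in vocab:
                    score += math.log((counts[w] + alpha) / denom)
        scores[c] = score
    return max(scores, key=scores.get)

def evaluate(model, samples, negative="neg"):
    """Accuracy and specificity (true-negative rate) on labelled samples."""
    pairs = [(predict(model, text), label) for text, label in samples]
    correct = sum(p == t for p, t in pairs)
    tn = sum(p == negative and t == negative for p, t in pairs)
    fp = sum(p != negative and t == negative for p, t in pairs)
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return correct / len(pairs), specificity

# Assumed 80:20 train/test split, taken in order for reproducibility.
split = int(0.8 * len(DATA))
train_set, test_set = DATA[:split], DATA[split:]

results = {}
for name, flag in (("Multinomial", False), ("Bernoulli", True)):
    model = train(train_set, bernoulli=flag)
    results[name] = evaluate(model, test_set)
    print(name, "accuracy=%.2f specificity=%.2f" % results[name])
```

The two variants differ only in what they count: the Multinomial model uses how often each word occurs, while the Bernoulli model uses whether a word occurs at all and also scores the absence of vocabulary words. On short texts such as tweets, where words rarely repeat, the two tend to behave similarly, so a narrow accuracy gap between them is plausible. Results on this toy data say nothing about the figures reported in the thesis.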