Cyberbullying detection using emotion mining

The expansion of information and communication technologies (ICTs) has led to developments in online communication. Regrettably, such convenience has been abused by online bullies, causing harm to others via threatening, harassing, humiliating, intimidating, manipulating, or controlling targeted vic...

Bibliographic Details
Main Author:	Al-Hashedi, Mohammed Yahea Ali Mahyoub
Format:	Thesis
Published:	2022
Subjects:	BF1-990 Psychology

id	my-mmu-ep.12041
record_format	uketd_dc
spelling	my-mmu-ep.12041 2024-01-11T01:55:27Z Cyberbullying detection using emotion mining 2022-10 Al-Hashedi, Mohammed Yahea Ali Mahyoub BF1-990 Psychology 2022-10 Thesis http://shdl.mmu.edu.my/12041/ http://erep.mmu.edu.my/ masters Multimedia University Faculty of Computing and Informatics (FCI) EREP ID: 11743
institution	Multimedia University
collection	MMU Institutional Repository
topic	BF1-990 Psychology
spellingShingle	BF1-990 Psychology Al-Hashedi, Mohammed Yahea Ali Mahyoub Cyberbullying detection using emotion mining
description	The expansion of information and communication technologies (ICTs) has led to developments in online communication. Regrettably, this convenience has been abused by online bullies, who harm others by threatening, harassing, humiliating, intimidating, manipulating, or controlling targeted victims. Cyberbullying can have a severe impact on a victim’s mental health, ranging from negative emotions (anger, fear, sadness, guilt, etc.) to depression and even suicidal thoughts. Because of these potentially harmful consequences, cyberbullying detection has become a pressing need in the governance of Internet usage. The research presented in this thesis is motivated by the fact that cyberbullying can cause negative emotions, and it proposes cyberbullying detection models trained on contextual, emotion, and sentiment features. All critical steps were considered, from data preparation to deep learning models. Datasets that cover all forms of cyberbullying (threatening, harassing, humiliating, intimidating, and manipulating or controlling targeted victims) are scarce. To address this issue, this research used two datasets: the Toxic dataset, collected by the Conversation AI team, and the Twitter dataset. Cyberbullying datasets generally suffer from an imbalance between their labels; therefore, sampling techniques were developed to reduce the imbalance ratio. After dataset preparation, the next step in detecting cyberbullying was extracting textual features, such as syntactic, semantic, contextual, and emotion features. Emotion features in particular were thoroughly investigated through the use of a lexicon-based deep learning model.
To build an emotion detection model, emotion datasets were collected from Twitter through hashtag keywords and categorized based on these keywords. Because hashtag labelling can be inaccurate, a validation procedure was then carried out to verify the annotation of the emotion dataset labels. The validated dataset was then used to train the emotion detection model (EDM) using BERT as a pre-trained word representation model. This model was used to study and explore the emotions related to cyberbullying texts. The results indicate that 92% of cyberbullying emotions are categorized as negative. Emotions and sentiments were extracted from the cyberbullying datasets using the EDM and the NRC lexicon for emotions and the AFINN lexicon for sentiments. These features were fed to deep learning models to train cyberbullying detection models. A set of experiments was carried out to identify the best set of features for cyberbullying detection. The findings indicate that incorporating emotion features can enhance the precision of cyberbullying detection, as this approach outperformed the use of BERT contextual features alone. In the experiment that involved emotion features, the recall score was 0.87, a 0.5 increase in cyberbullying detection performance compared to using only BERT. Similarly, incorporating sentiment features improved the model by 0.6 recall compared to only utilizing BERT.
format	Thesis
qualification_level	Master's degree
author	Al-Hashedi, Mohammed Yahea Ali Mahyoub
author_facet	Al-Hashedi, Mohammed Yahea Ali Mahyoub
author_sort	Al-Hashedi, Mohammed Yahea Ali Mahyoub
title	Cyberbullying detection using emotion mining
title_short	Cyberbullying detection using emotion mining
title_full	Cyberbullying detection using emotion mining
title_fullStr	Cyberbullying detection using emotion mining
title_full_unstemmed	Cyberbullying detection using emotion mining
title_sort	cyberbullying detection using emotion mining
granting_institution	Multimedia University
granting_department	Faculty of Computing and Informatics (FCI)
publishDate	2022
_version_	1794019133318234112
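
The abstract in this record describes three steps that a short sketch may help make concrete: reducing label imbalance by sampling, deriving emotion and sentiment features from lexicons, and appending those features to a contextual representation before scoring a classifier by recall. The Python below is only an illustration under stated assumptions and is not the thesis's implementation. The lexicon entries are toy stand-ins rather than real AFINN or NRC data, the contextual vector is a placeholder for a BERT embedding, random undersampling is assumed because the record does not say which sampling technique was developed, and all function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy lexicons for illustration; NOT real AFINN or NRC entries.
TOY_SENTIMENT = {"hate": -3, "stupid": -2, "ugly": -2, "love": 3, "great": 3}
TOY_EMOTION = {
    "hate": {"anger"},
    "stupid": {"anger", "disgust"},
    "scared": {"fear"},
    "love": {"joy"},
}
EMOTIONS = ["anger", "disgust", "fear", "joy"]


def lexicon_features(text):
    """Return [summed sentiment score, then one count per emotion] for a text."""
    tokens = text.lower().split()
    sentiment = sum(TOY_SENTIMENT.get(tok, 0) for tok in tokens)
    counts = Counter(e for tok in tokens for e in TOY_EMOTION.get(tok, ()))
    return [float(sentiment)] + [float(counts[e]) for e in EMOTIONS]


def combine(contextual_vector, text):
    """Append lexicon features to a contextual vector (placeholder for BERT)."""
    return list(contextual_vector) + lexicon_features(text)


def undersample(texts, labels, seed=0):
    """Assumed technique: randomly drop items until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    smallest = min(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend((text, label) for text in rng.sample(items, smallest))
    rng.shuffle(balanced)
    return balanced


def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

In a fuller pipeline, the list returned by `combine` would be the input to a classifier, and `recall` would be computed on held-out predictions for the bullying class.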
