Incorporating Informative Score For Instance Selection In Semi-supervised Sentiment Classification

Bibliographic Details
Main Author: Vivian, Lee Lay Shan
Format: Thesis
Language: English
Published: 2022
Subjects:
Online Access: http://eprints.usm.my/60138/1/VIVIAN%20LEE%20LAY%20SHAN%20-%20TESIS24.pdf
Description
Summary: Sentiment classification is a useful tool for classifying reviews, which contain a wealth of information about sentiments and attitudes towards a product or service. Existing studies rely heavily on sentiment classification methods that require fully annotated input. However, labelled text is limited, which makes acquiring fully annotated input costly and labour-intensive. In recent years, semi-supervised methods have been recommended because they require only partially labelled input and perform comparably to the currently preferred methods. At the same time, some works have reported that the performance of a semi-supervised model degraded after unlabelled instances were added to training. This contrast in the literature suggests that not all unlabelled instances are equally useful; thus, identifying the informative unlabelled instances is beneficial when training a semi-supervised model. To achieve this, an informative score is proposed and incorporated into semi-supervised sentiment classification. The experiment compared the accuracy and loss of a supervised method, a semi-supervised method without the informative score, and a semi-supervised method with the informative score. With the informative score identifying informative unlabelled instances, semi-supervised models perform better than semi-supervised models that do not incorporate the informative score into their training. Although the semi-supervised models that incorporate the informative score do not surpass the supervised models, the results are still promising, as the differences in performance are small and the number of labelled instances used is greatly reduced.