Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection
Social networks and their usage in everyday life have caused an explosion in the amount of short electronic documents. Social networks, such as Twitter, are common mechanisms through which people can share information. The utilization of data that are available through social media for many applicat...
Main Author: | Alsmadi, Issa Mohammad Ibrahim |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2018 |
Subjects: | QA75.5-76.95 Electronic computers. Computer science |
Online Access: | http://eprints.usm.my/46679/1/short%20text%20classification%20using%20an%20enhanced%20term%20weghiting%20scheme%20and%20filter-wrapper%20feture%20selection24.pdf |
id |
my-usm-ep.46679 |
---|---|
record_format |
uketd_dc |
spelling |
my-usm-ep.46679 2020-07-03T07:41:20Z Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection 2018-12 Alsmadi, Issa Mohammad Ibrahim QA75.5-76.95 Electronic computers. Computer science Social networks and their usage in everyday life have caused an explosion in the amount of short electronic documents. Social networks, such as Twitter, are common mechanisms through which people can share information. The utilization of data that are available through social media for many applications is gradually increasing. Redundancy and noise in short texts are common problems in social media and in different applications that use short text. Moreover, the shortness and high sparsity of short text lead to poor classification performance. An effective short-text classification method can significantly improve the efficiency of many applications. This research aims to investigate and develop solutions for feature discrimination and selection in short-text classification. For feature discrimination, we introduce a term weighting approach, namely simple supervised weight (SW), which considers the special nature of short text in terms of term strength and distribution. To address the drawbacks of using existing feature selection with short text, this thesis proposes a filter-wrapper feature selection approach. In the first stage, we propose an adaptive filter-based feature selection method that is derived from the odds ratio method and used to reduce the dimensionality of the feature space. In the second stage, the grey wolf optimization (GWO) algorithm, a new heuristic search algorithm, uses SVM accuracy as the fitness function to find the optimal feature subset. 
2018-12 Thesis http://eprints.usm.my/46679/ http://eprints.usm.my/46679/1/short%20text%20classification%20using%20an%20enhanced%20term%20weghiting%20scheme%20and%20filter-wrapper%20feture%20selection24.pdf application/pdf en public phd doctoral Universiti Sains Malaysia Pusat Pengajian Sains Komputer |
institution |
Universiti Sains Malaysia |
collection |
USM Institutional Repository |
language |
English |
topic |
QA75.5-76.95 Electronic computers Computer science |
spellingShingle |
QA75.5-76.95 Electronic computers Computer science Alsmadi, Issa Mohammad Ibrahim Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
description |
Social networks and their usage in everyday life have caused an explosion in the amount of short electronic documents. Social networks, such as Twitter, are common mechanisms through which people can share information. The utilization of data that are available through social media for many applications is gradually increasing. Redundancy and noise in short texts are common problems in social media and in different applications that use short text. Moreover, the shortness and high sparsity of short text lead to poor classification performance. An effective short-text classification method can significantly improve the efficiency of many applications. This research aims to investigate and develop solutions for feature discrimination and selection in short-text classification. For feature discrimination, we introduce a term weighting approach, namely simple supervised weight (SW), which considers the special nature of short text in terms of term strength and distribution. To address the drawbacks of using existing feature selection with short text, this thesis proposes a filter-wrapper feature selection approach. In the first stage, we propose an adaptive filter-based feature selection method that is derived from the odds ratio method and used to reduce the dimensionality of the feature space. In the second stage, the grey wolf optimization (GWO) algorithm, a new heuristic search algorithm, uses SVM accuracy as the fitness function to find the optimal feature subset. |
format |
Thesis |
qualification_name |
Doctor of Philosophy (PhD.) |
qualification_level |
Doctorate |
author |
Alsmadi, Issa Mohammad Ibrahim |
author_facet |
Alsmadi, Issa Mohammad Ibrahim |
author_sort |
Alsmadi, Issa Mohammad Ibrahim |
title |
Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
title_short |
Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
title_full |
Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
title_fullStr |
Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
title_full_unstemmed |
Short Text Classification Using An Enhanced Term Weighting Scheme And Filter-Wrapper Feature Selection |
title_sort |
short text classification using an enhanced term weighting scheme and filter-wrapper feature selection |
granting_institution |
Universiti Sains Malaysia |
granting_department |
Pusat Pengajian Sains Komputer |
publishDate |
2018 |
url |
http://eprints.usm.my/46679/1/short%20text%20classification%20using%20an%20enhanced%20term%20weghiting%20scheme%20and%20filter-wrapper%20feture%20selection24.pdf |
_version_ |
1747821711103885312 |
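
The description above says the first (filter) stage is derived from the odds ratio method. The record does not give the thesis's adaptive formulation, so the sketch below shows only the classic log odds ratio score for a single target class, as a rough illustration of how such a filter ranks terms. The function names `odds_ratio_scores` and `select_top_k`, the `smoothing` parameter, and the toy data in the usage note are assumptions made for this example and do not come from the thesis.

```python
import math
from collections import Counter


def odds_ratio_scores(docs, labels, positive, smoothing=0.5):
    """Return {term: log odds ratio} for the `positive` class.

    docs is a list of token lists and labels is a parallel list of class
    labels. The smoothing constant (an assumption of this sketch) keeps
    every probability strictly between 0 and 1 so the ratio is defined.
    """
    pos_df, neg_df = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in zip(docs, labels):
        if label == positive:
            n_pos += 1
            pos_df.update(set(tokens))  # document frequency, not term count
        else:
            n_neg += 1
            neg_df.update(set(tokens))

    scores = {}
    for term in set(pos_df) | set(neg_df):
        # Smoothed P(term | positive) and P(term | not positive).
        p_pos = (pos_df[term] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_df[term] + smoothing) / (n_neg + 2 * smoothing)
        scores[term] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def select_top_k(scores, k):
    """Keep the k terms most indicative of the positive class."""
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

As a usage example, with the four toy documents `["goal", "match"]`, `["goal", "team"]`, `["vote", "election"]`, `["vote", "match"]` labelled `sport`, `sport`, `politics`, `politics`, calling `select_top_k(odds_ratio_scores(docs, labels, "sport"), 2)` keeps `goal` and `team`, while `match`, which appears equally in both classes, scores zero. In a filter-wrapper pipeline of the kind the description outlines, a reduced term set like this would then be handed to the wrapper stage for subset search.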