New Learning Models for Generating Classification Rules Based on Rough Set Approach
Data sets, static or dynamic, are very important and useful for presenting real life features in different aspects of industry, medicine, economy, and others. Recently, different models were used to generate knowledge from vague and uncertain data sets such as induction decision tree, neural netw...
Main Author: | Al Shalabi, Luai Abdel Lateef |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2000 |
Subjects: | Classification rules - Methodology |
Online Access: | http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf |
id |
my-upm-ir.9646 |
---|---|
record_format |
uketd_dc |
spelling |
my-upm-ir.9646 2023-11-29T02:15:11Z New Learning Models for Generating Classification Rules Based on Rough Set Approach 2000-09 Al Shalabi, Luai Abdel Lateef Classification rules - Methodology 2000-09 Thesis http://psasir.upm.edu.my/id/eprint/9646/ http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf text en public doctoral Universiti Putra Malaysia Classification rules - Methodology Faculty of Computer Science and Information Technology Mahmod, Ramlan English |
institution |
Universiti Putra Malaysia |
collection |
PSAS Institutional Repository |
language |
English |
advisor |
Mahmod, Ramlan |
topic |
Classification rules - Methodology |
description |
Data sets, whether static or dynamic, are important for representing real-life
features in industry, medicine, the economy, and other fields. Recently,
different models, such as decision tree induction, neural networks, fuzzy logic,
genetic algorithms, and rough set theory, have been used to generate knowledge
from vague and uncertain data sets. All of these models take a long time to learn
from a huge and dynamic data set. Thus, the challenge is to develop an efficient
model that can decrease the learning time without affecting the quality of the
generated classification rules. Huge information systems or data sets usually have
some missing values due to unavailable data, which affect the quality of the
generated classification rules and make it difficult to extract useful
information from the data set. Another challenge is therefore how to solve the
problem of missing data.
Rough set theory is a new mathematical tool for dealing with vagueness and
uncertainty. It is a useful approach for uncovering classificatory knowledge and
building classification rules. The application of the theory as part of the
learning models was therefore proposed in this thesis.
Two different models for learning in data sets were proposed, based on two different
reduction algorithms. The split-condition-merge-reduct algorithm (SCMR) was
performed in three modules: partitioning the data set vertically into subsets,
applying rough set reduction to each subset, and merging the reducts of
all subsets to form the best reduct. The enhanced-split-condition-merge-reduct
algorithm (ESCMR) was performed in the same three modules, followed by a fourth
module that applies rough set reduction again to the reduct generated by
SCMR in order to generate the best reduct, which plays the same role as the full
set of attributes it was derived from. Classification rules were generated based
on the best reduct.
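The split, reduce, and merge steps described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the representation of objects as dictionaries, the caller-supplied vertical partition `parts`, and the use of greedy backward elimination on the positive region as the reduction step are all assumptions made for illustration.

```python
def positive_region(rows, attrs, decision):
    """Indices of objects whose values on `attrs` determine the decision uniquely."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    pos = set()
    for members in blocks.values():
        # A block is consistent if all its objects share one decision value.
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def reduct(rows, attrs, decision):
    """Assumed reduction step: drop each attribute whose removal
    leaves the positive region of `attrs` unchanged."""
    target = positive_region(rows, attrs, decision)
    kept = list(attrs)
    for a in list(attrs):
        trial = [b for b in kept if b != a]
        if positive_region(rows, trial, decision) == target:
            kept = trial
    return kept


def scmr(rows, parts, decision):
    """Sketch of SCMR: reduce each vertical subset, then merge the reducts."""
    merged = []
    for part in parts:  # `parts` is the vertical partition of the condition attributes
        merged.extend(reduct(rows, part, decision))
    return merged


def escmr(rows, parts, decision):
    """Sketch of ESCMR: apply the reduction once more to the merged reduct."""
    return reduct(rows, scmr(rows, parts, decision), decision)


# Hypothetical toy table: "b" duplicates "a"; "c" and "d" carry no class information.
table = [
    {"a": 0, "b": 0, "c": 0, "d": 0, "y": 0},
    {"a": 0, "b": 0, "c": 1, "d": 1, "y": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1, "y": 1},
    {"a": 1, "b": 1, "c": 1, "d": 0, "y": 1},
]
partition = [["a", "c"], ["b", "d"]]
print(scmr(table, partition, "y"))   # merged reduct of the two subsets
print(escmr(table, partition, "y"))  # merged reduct after the extra reduction pass
```

On this toy table the merged reduct still contains the redundant pair `a` and `b`, one from each subset, and the extra pass removes one of them. This suggests why a second reduction over the merged result can be useful, although the thesis may define the modules differently.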
For the problem of missing data, a new approach was proposed based on data
partitioning and the mode function. In this approach, the data set was partitioned
horizontally into subsets such that all objects in each subset were described
by only one classification value. The mode function was applied to each subset
that has missing values in order to find the most frequently occurring value of
each attribute. Missing values in that attribute were then replaced by the mode value.
The proposed approach for missing values produced better results than other
approaches. In addition, the proposed models generated classification rules
faster than other methods, and the accuracy of the rules generated
by the proposed models was high compared to other models. |
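The missing-value approach in the description (partition horizontally by classification value, then fill each gap with the mode of that attribute within the subset) can be sketched in Python as follows. This is an illustrative sketch only: the `MISSING` marker, the function name, and the dictionary-based row format are assumptions, and the abstract does not say how ties or classes with no observed value are handled.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing value


def impute_by_class_mode(rows, decision):
    """Return a copy of `rows` with missing values replaced by the
    most frequent value of the same attribute within the same class."""
    # Horizontal partition: group the objects by their decision value.
    groups = {}
    for row in rows:
        groups.setdefault(row[decision], []).append(row)

    filled = []
    for row in rows:
        new_row = dict(row)
        for attr, value in row.items():
            if attr != decision and value is MISSING:
                observed = [r[attr] for r in groups[row[decision]]
                            if r[attr] is not MISSING]
                if observed:  # leave the gap if the class has no observed value
                    new_row[attr] = Counter(observed).most_common(1)[0][0]
        filled.append(new_row)
    return filled


# Hypothetical toy data set with a decision attribute "flu".
patients = [
    {"temp": "high", "cough": "yes", "flu": "yes"},
    {"temp": "high", "cough": None, "flu": "yes"},
    {"temp": None, "cough": "yes", "flu": "yes"},
    {"temp": "normal", "cough": "no", "flu": "no"},
    {"temp": None, "cough": "no", "flu": "no"},
]
completed = impute_by_class_mode(patients, "flu")
print(completed[2]["temp"], completed[4]["temp"])
```

Because the mode is computed per class, the two missing `temp` values are filled differently depending on the decision value of the object, which a single global mode would not do.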
format |
Thesis |
qualification_level |
Doctorate |
author |
Al Shalabi, Luai Abdel Lateef |
title |
New Learning Models for Generating Classification Rules Based on Rough Set Approach |
granting_institution |
Universiti Putra Malaysia |
granting_department |
Faculty of Computer Science and Information Technology |
publishDate |
2000 |
url |
http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf |
_version_ |
1794018872570937344 |