New Learning Models for Generating Classification Rules Based on Rough Set Approach

Data sets, whether static or dynamic, are important for representing real-life features in industry, medicine, economics, and other fields. Recently, various models have been used to generate knowledge from vague and uncertain data sets, such as decision tree induction, neural netw...

Bibliographic Details
Main Author: Al Shalabi, Luai Abdel Lateef
Format: Thesis
Language: English
Published: 2000
Subjects: Classification rules - Methodology
Online Access: http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf
id my-upm-ir.9646
record_format uketd_dc
spelling my-upm-ir.9646 2023-11-29T02:15:11Z New Learning Models for Generating Classification Rules Based on Rough Set Approach 2000-09 Al Shalabi, Luai Abdel Lateef Classification rules - Methodology Thesis http://psasir.upm.edu.my/id/eprint/9646/ http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf text en public doctoral Universiti Putra Malaysia Faculty of Computer Science and Information Technology Mahmod, Ramlan English
institution Universiti Putra Malaysia
collection PSAS Institutional Repository
language English
advisor Mahmod, Ramlan
topic Classification rules - Methodology


description Data sets, whether static or dynamic, are important for representing real-life features in industry, medicine, economics, and other fields. Recently, various models have been used to generate knowledge from vague and uncertain data sets, including decision tree induction, neural networks, fuzzy logic, genetic algorithms, and rough set theory. All of these models take a long time to learn from a huge, dynamic data set. The challenge is therefore to develop an efficient model that decreases the learning time without affecting the quality of the generated classification rules. Huge information systems or data sets usually have some missing values due to unavailable data, and these affect the quality of the generated classification rules. Missing values make it difficult to extract useful information from the data set, so a second challenge is how to handle missing data. Rough set theory is a new mathematical tool for dealing with vagueness and uncertainty. It is a useful approach for uncovering classificatory knowledge and building classification rules, so this thesis proposes applying the theory as part of the learning models. Two models for learning in data sets were proposed, based on two different reduction algorithms. The split-condition-merge-reduct algorithm (SCMR) consists of three modules: partitioning the data set vertically into subsets, applying the rough set concepts of reduction to each subset, and merging the reducts of all subsets to form the best reduct. The enhanced-split-condition-merge-reduct algorithm (E SCMR) consists of the same three modules followed by a fourth module that applies the rough set reduction concept again to the reduct generated by SCMR in order to generate the best reduct, which plays the same role as if all attributes in the subset were present. Classification rules were generated based on the best reduct.
For the problem of missing data, a new approach was proposed based on data partitioning and the mode function. In this approach, the data set was partitioned horizontally into subsets such that all objects in each subset were described by a single classification value. The mode function was applied to each subset that had missing values in order to find the most frequently occurring value of each attribute, and the missing values of that attribute were replaced by the mode value. The proposed approach for missing values produced better results than other approaches. The proposed learning models also generated the classification rules faster than other methods, and the accuracy of the resulting classification rules was high compared with other models.
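The split, reduce, and merge modules described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not say how reducts are computed or merged, so the sketch assumes a greedy backward elimination that preserves the rough set positive region, and it merges by simple concatenation. The function names (`positive_region`, `reduce_attrs`, `scmr`, `enhanced_scmr`) and the toy table are hypothetical.

```python
def positive_region(rows, attrs, decision):
    """Indices of rows whose values on `attrs` determine the decision uniquely."""
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def reduce_attrs(rows, attrs, decision):
    """Assumed reduction step: greedily drop every attribute whose removal
    leaves the positive region of `attrs` unchanged."""
    target = positive_region(rows, attrs, decision)
    kept = list(attrs)
    for a in list(attrs):
        trial = [x for x in kept if x != a]
        if positive_region(rows, trial, decision) == target:
            kept = trial
    return kept


def scmr(rows, attrs, decision, n_parts):
    """Split the condition attributes vertically, reduce each subset, merge."""
    size = -(-len(attrs) // n_parts)  # ceiling division
    subsets = [attrs[i:i + size] for i in range(0, len(attrs), size)]
    merged = []
    for subset in subsets:
        merged.extend(reduce_attrs(rows, subset, decision))
    return merged


def enhanced_scmr(rows, attrs, decision, n_parts):
    """Extra module: reduce the merged reduct once more."""
    return reduce_attrs(rows, scmr(rows, attrs, decision, n_parts), decision)


# Toy decision table (hypothetical): b duplicates a, c is noise, d is constant.
table = [
    {"a": 0, "b": 0, "c": 0, "d": 1, "y": "no"},
    {"a": 0, "b": 0, "c": 1, "d": 1, "y": "no"},
    {"a": 1, "b": 1, "c": 0, "d": 1, "y": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": 1, "y": "yes"},
]
conditions = ["a", "c", "b", "d"]

scmr_reduct = scmr(table, conditions, "y", 2)             # ['a', 'b']
escmr_reduct = enhanced_scmr(table, conditions, "y", 2)   # ['b']
```

In this toy table the two vertical subsets each contribute one attribute, so the merged result still contains the redundant pair `a` and `b`; the additional reduction pass removes one of them, which is the kind of gain the enhanced variant is described as targeting.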
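The missing-value approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: missing values are assumed to be encoded as `None`, ties between equally frequent values are not handled specially, and the function name `impute_by_class_mode` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def impute_by_class_mode(rows, attrs, decision, missing=None):
    """Fill missing attribute values with the mode of that attribute,
    computed only over rows that share the same decision value."""
    # Horizontal partitioning: count values per (decision value, attribute).
    counts = defaultdict(Counter)
    for row in rows:
        for a in attrs:
            if row[a] is not missing:
                counts[(row[decision], a)][row[a]] += 1

    filled = []
    for row in rows:
        new_row = dict(row)
        for a in attrs:
            seen = counts[(row[decision], a)]
            # Leave the gap in place if the subset has no observed value.
            if new_row[a] is missing and seen:
                new_row[a] = seen.most_common(1)[0][0]
        filled.append(new_row)
    return filled


# Hypothetical sample data; None marks a missing value.
patients = [
    {"temp": "high", "cough": "yes", "flu": "yes"},
    {"temp": "high", "cough": None, "flu": "yes"},
    {"temp": None, "cough": "yes", "flu": "yes"},
    {"temp": "normal", "cough": "no", "flu": "no"},
    {"temp": "normal", "cough": None, "flu": "no"},
]

completed = impute_by_class_mode(patients, ["temp", "cough"], "flu")
```

Computing the mode within each decision class, rather than over the whole table, keeps a value that is typical of one class from being written into objects of another class.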
format Thesis
qualification_level Doctorate
author Al Shalabi, Luai Abdel Lateef
title New Learning Models for Generating Classification Rules Based on Rough Set Approach
granting_institution Universiti Putra Malaysia
granting_department Faculty of Computer Science and Information Technology
publishDate 2000
url http://psasir.upm.edu.my/id/eprint/9646/1/FSKTM_2000_2_IR.pdf
_version_ 1794018872570937344