Hybrid performance measures and mixed evaluation method for data classification problems

This study investigates two issues concerning performance measures in data classification problems. First, it examines the use of the accuracy measure as a discriminator for building an optimized Prototype Selection (PS) algorithm. Second, it evaluates the current evaluation practices fo...

Bibliographic Details
Main Author:	Hossin, Mohammad
Format:	Thesis
Language:	English
Published:	2012
Subjects:	Computer algorithms Machine learning
Online Access:	http://psasir.upm.edu.my/id/eprint/33140/1/FSKTM%202012%2022.pdf

id	my-upm-ir.33140
record_format	uketd_dc
spelling	my-upm-ir.33140 2024-09-04T03:11:18Z Hybrid performance measures and mixed evaluation method for data classification problems 2012-04 Hossin, Mohammad Computer algorithms Machine learning 2012-04 Thesis http://psasir.upm.edu.my/id/eprint/33140/ http://psasir.upm.edu.my/id/eprint/33140/1/FSKTM%202012%2022.pdf text en public doctoral Universiti Putra Malaysia Computer algorithms Machine learning Sulaiman, Md. Nasir
institution	Universiti Putra Malaysia
collection	PSAS Institutional Repository
language	English
advisor	Sulaiman, Md. Nasir
topic	Computer algorithms Machine learning
description	This study investigates two issues concerning performance measures in data classification problems. First, it examines the use of the accuracy measure as a discriminator for building an optimized Prototype Selection (PS) algorithm. Second, it evaluates current practices for evaluating and comparing two performance measures. According to the literature, using accuracy can cause the evaluation process to underperform because its values are less distinctive and less discriminable, and it cannot perform optimally when confronted with an imbalanced class problem. Nevertheless, the accuracy measure is still widely used in evaluating data classification problems. Regarding evaluation analysis, many previous studies emphasize generalization ability when evaluating and comparing performance measures. Only a few efforts have been dedicated to evaluating and comparing performance measures using different performance characteristics, and no previous study has employed a mixed evaluation method for this purpose. To tackle the first issue, this study proposes several hybrid measures that combine accuracy with the precision and recall measures. These hybrid measures are known as Optimized Accuracy with Conventional Recall-Precision (OACRP) and Optimized Accuracy with Extended Recall-Precision version 1 and version 2 (OAERP1 and OAERP2). In addition, the OAERP1 and OAERP2 measures have been extended to evaluate multi-class problems. For the second issue, this study proposes a mixed evaluation method that evaluates two performance measures through different performance characteristics. For a systematic analysis, the mixed evaluation method is implemented in two stages.
First, the hybrid measures are compared and analyzed against the accuracy measure based on their produced values across different classification problems with different class distributions. Second, the hybrid measures are compared and analyzed empirically against the accuracy measure and other selected performance measures based on generalization ability, using three selected PS algorithms (MCS, LVQ21 and GA) and large benchmark datasets. In the first evaluation stage, the OAERP2 measure showed better produced values than the accuracy, OACRP and OAERP1 measures in terms of distinctiveness, discriminability, informativeness, favoring the minority class, and degrees of consistency and discriminancy. In the second evaluation stage, almost all selected algorithms optimized by the OAERP2 measure achieved better generalization ability than with their original measure and the other selected performance measures. Moreover, the GA model optimized by the OAERP2 measure (GAoe2) differed statistically significantly from the other OAERP2-based models according to the win-draw-loss evaluation method and two nonparametric tests. The GAoe2 model also differed statistically significantly from nine additional PS algorithms in terms of testing error and storage requirements. Taken together, the evaluations indicate that the OAERP2 measure is able to choose a better solution during classification training, which leads to a better-trained PS classifier with better generalization ability. In addition, the mixed evaluation method enabled this study to evaluate and compare the studied performance measures systematically and comprehensively via different performance characteristics.
format	Thesis
qualification_level	Doctorate
author	Hossin, Mohammad
title	Hybrid performance measures and mixed evaluation method for data classification problems
granting_institution	Universiti Putra Malaysia
publishDate	2012
url	http://psasir.upm.edu.my/id/eprint/33140/1/FSKTM%202012%2022.pdf
_version_	1811767720483487744
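
The abstract's remark that accuracy yields less discriminable values on imbalanced class problems can be illustrated with a minimal sketch. It uses only the standard definitions of accuracy, precision and recall. It does not implement OACRP, OAERP1 or OAERP2, whose formulas are not given in this record. The `Confusion` class and the confusion-matrix counts are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts; the minority class is treated as positive."""
    tp: int
    fp: int
    tn: int
    fn: int

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    def precision(self) -> float:
        predicted_pos = self.tp + self.fp
        # Convention assumed here: precision is 0.0 when nothing is predicted positive.
        return self.tp / predicted_pos if predicted_pos else 0.0

    def recall(self) -> float:
        actual_pos = self.tp + self.fn
        return self.tp / actual_pos if actual_pos else 0.0


# Hypothetical test set: 90 majority-class and 10 minority-class instances.
all_majority = Confusion(tp=0, fp=0, tn=90, fn=10)  # never predicts the minority class
finds_half = Confusion(tp=5, fp=5, tn=85, fn=5)     # recovers half of the minority class

for name, cm in (("all_majority", all_majority), ("finds_half", finds_half)):
    print(
        f"{name}: accuracy={cm.accuracy():.2f} "
        f"precision={cm.precision():.2f} recall={cm.recall():.2f}"
    )
```

Both hypothetical classifiers reach the same accuracy, so accuracy alone cannot rank them, whereas precision and recall separate the one that ignores the minority class from the one that finds part of it. This is the kind of tie that motivates combining accuracy with precision and recall.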