Support vector machine for solving small dataset problem

Bibliographic Details
Main Author: Abdul Rahman, Ahmad Rijal
Format: Thesis
Language: English
Published: 2012
Online Access: http://eprints.utm.my/id/eprint/32547/1/AhmadRijalAbdulRahmanMFKE2012.pdf
Description
Summary: Data quantity is the main concern in the small data set problem, because insufficient data usually does not lead to robust classification performance. How to extract more effective information from a small data set is therefore of considerable interest. This project proposes the Support Vector Machine (SVM), a computational technique that constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (the so-called margin). In general, the larger the margin, the lower the generalization error of the classifier. In this research, SVM is employed to solve small dataset problems in binary classification. Many measures can be used to evaluate classifier performance; this research uses accuracy. To improve accuracy, the SMOTE (Synthetic Minority Oversampling Technique) algorithm is used to create synthetic data: in the minority class for imbalanced datasets, or in both the negative and positive classes for balanced datasets. The SVM and SMOTE algorithms were implemented in Matlab.
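The SMOTE step mentioned in the summary can be illustrated with a minimal sketch. This is not the thesis's Matlab implementation, which the record does not show; it is a simplified Python version written under stated assumptions: the function name `smote`, its parameters `n_synthetic`, `k` and `seed`, and the sample points are all hypothetical, and base points are drawn at random rather than by looping over every minority sample as the original SMOTE description does. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its `k` nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours.

    Simplified sketch: the base point is sampled at random for each new point.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Sort the other minority samples by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies between two existing minority samples, the new data stays inside the region the minority class already occupies. The augmented set would then be passed to the SVM for training.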