Hybrid fuzzy techniques for unsupervised intrusion detection system

Bibliographic Details
Main Author: Chimphlee, Witcha
Format: Thesis
Language: English
Published: 2008
Online Access: http://eprints.utm.my/id/eprint/18722/1/WitchaChimphleePFSKSM2008.pdf
Description
Summary: Network intrusion detection is a complex research problem, especially when it deals with unknown patterns. Furthermore, when the number of audit data instances is large, human labelling becomes tedious, time-consuming, and expensive. A technique that can enhance the learning capability of an anomaly intrusion detection system is therefore required. Unsupervised anomaly detection methods have been deployed to address the weaknesses of both signature-based and supervised anomaly detection. These methods take a set of unlabelled data as input, in which the majority of the data set is normal traffic, and attempt to find intrusions hidden in the data. Although unsupervised anomaly detection has received a lot of attention from researchers, it still has many drawbacks that can be improved. This thesis proposes a framework comprising three components: feature selection, a new clustering technique, and a novel cluster labelling algorithm. The feature selection component chooses relevant features, which are identified through statistical testing. The new clustering technique, called F2ART, is a hybrid of Fuzzy c-means and Fuzzy Adaptive Resonance Theory. It incorporates a modified similarity measure and a new learning rule that includes a fuzzy membership value to improve the detection rate. Finally, this thesis proposes a new cluster labelling algorithm called Normal Membership Factor (NMF). This algorithm introduces a weighting degree of probability of clusters, which can decrease the false positive rate. Experimental results on the KDD Cup 1999 data set indicate that the framework provides the best performance in terms of detection rate compared with current unsupervised anomaly detection approaches. Unlike traditional anomaly detection methods that require 98 percent of the unlabelled data to be normal, this framework can still work when only 80 percent of the data is normal. In addition, it can improve the analysis of new data over time without the need to retrain on all the previous and new data.
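
For readers unfamiliar with the fuzzy membership value mentioned in the summary, the following is a minimal sketch of the textbook Fuzzy c-means membership computation, which is one of the two techniques F2ART is said to combine. It is not the thesis's F2ART algorithm: the modified similarity measure and the new learning rule are not described in this record. The function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the sample points are assumptions chosen for illustration only.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Textbook fuzzy c-means membership of one point in each cluster.

    Illustrative only: this is NOT the F2ART rule from the thesis.
    `m` is the fuzzifier (m > 1); m = 2.0 is a common default and an
    assumption here.
    """
    # Euclidean distance from the point to every cluster center.
    dists = [math.dist(point, c) for c in centers]

    # A point that coincides with one or more centers belongs fully
    # (and equally) to those clusters; this avoids division by zero.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        share = 1.0 / len(zero)
        return [share if i in zero else 0.0 for i in range(len(centers))]

    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


if __name__ == "__main__":
    # Hypothetical two-cluster example: the point is closer to the
    # first center, so it receives the larger membership.
    centers = [(1.0, 0.0), (3.0, 0.0)]
    print(fcm_memberships((0.0, 0.0), centers))
```

The memberships of a point always sum to 1, so a record lying between a dense cluster and a sparse one is shared between them rather than forced into a single cluster, which is the property a fuzzy cluster labelling step can then weigh.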