Privacy-Preserving Decision Tree Pruning In Network-Based Intrusion Detection System
Main Author: | |
---|---|
Format: | Thesis |
Published: | 2019 |
Subjects: | |
Summary: | Machine learning techniques have been widely adopted in Network-based Intrusion Detection Systems (NIDS), especially for classifying network traffic. While a precise classification model that separates normal from malicious traffic remains the ultimate goal, privacy protection for the network traffic database cannot be ignored. The common solution is to anonymise the database through a statistical approach. Anonymising refers to masking, hiding or removing certain sensitive information from the database. Over the past decades, numerous anonymisation tools and techniques have been developed to conceal the potentially sensitive information that network traces could reveal. However, it is also important to ensure that anonymisation techniques do not severely degrade the performance of the NIDS. At present, the conventional way to gauge the usability of network data is to compare the number of alarms generated by the Snort NIDS before and after an anonymisation solution is applied. This approach may not be feasible when machine learning is used to segregate the traffic. To fill this gap, 10 notable machine learning classifiers are employed to evaluate the performance of 2 network data privacy solutions: (1) port number bilateral classification and (2) IP truncation. The utility of the network data is measured by the classification accuracy attained. |
---|
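
The two privacy solutions named in the summary can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names `truncate_ip` and `bilateral_port`, the default of 24 retained bits, and the 1024 port boundary are all assumptions made for illustration: IP truncation is taken here to mean zeroing the low-order bits of an address, and bilateral classification to mean collapsing a port number into one of two classes (below or at/above a boundary).

```python
import ipaddress


def truncate_ip(address: str, keep_bits: int = 24) -> str:
    """Zero every bit after the first `keep_bits` bits of an IP address.

    Assumption: the number of retained bits is a free parameter; the
    summary does not state which value the thesis uses.
    """
    ip = ipaddress.ip_address(address)
    total_bits = ip.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if not 0 <= keep_bits <= total_bits:
        raise ValueError("keep_bits is out of range for this address family")
    # Build a mask with `keep_bits` leading ones followed by zeros.
    mask = ((1 << keep_bits) - 1) << (total_bits - keep_bits)
    # Rebuild an address of the same family from the masked integer.
    return str(type(ip)(int(ip) & mask))


def bilateral_port(port: int, boundary: int = 1024) -> int:
    """Replace a port number with a two-valued class label.

    Assumption: class 0 means the port is below `boundary`,
    class 1 means it is at or above `boundary`.
    """
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    return 0 if port < boundary else 1


if __name__ == "__main__":
    print(truncate_ip("192.168.10.57"))      # host byte zeroed
    print(truncate_ip("192.168.10.57", 16))  # last two bytes zeroed
    print(bilateral_port(80), bilateral_port(51515))
```

Under these assumptions, a classifier trained on the anonymised records sees only the truncated addresses and the two-valued port class, so comparing its accuracy with that obtained on the original records gives the kind of utility measure the summary describes.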