A Machine Learning Classification Approach To Detect Tls-Based Malware Using Entropy-Based Flow Set Features

As internet encryption has grown to safeguard users’ privacy, malware has evolved to leverage encryption protocols such as Transport Layer Security (TLS) to conceal its hazardous connections. The difficulty and impracticality of decrypting TLS network traffic before it reaches the Intrusion Detectio...

全面介绍

Saved in:
书目详细资料
主要作者: Keshkeh, Kinan
格式: Thesis
语言:English
出版: 2022
主题:
在线阅读:http://eprints.usm.my/60044/1/24%20Pages%20from%20KINAN%20KESHKEH.pdf
标签: 添加标签
没有标签, 成为第一个标记此记录!
实物特征
总结:As internet encryption has grown to safeguard users’ privacy, malware has evolved to leverage encryption protocols such as Transport Layer Security (TLS) to conceal its hazardous connections. The difficulty and impracticality of decrypting TLS network traffic before it reaches the Intrusion Detection System (IDS) has driven numerous research studies to focus on anomaly-based malware detection without decryption employing various features and Machine Learning (ML) algorithms. Nonetheless, several of these studies used flow features with low feature importance value and poor capability to distinguish malicious flows, such as the number of packets sent and received in a flow or its duration. Furthermore, the outliers and frequency-based flow feature transformations (FTT) applied to mitigate the poor flow feature have several flaws. This thesis proposes a TLS-based malware detection (TLSMalDetect) approach based on ML classification to address flow feature utilization limitations in related work. TLSMalDetect includes periodicity-independent entropy-based flow set (EFS) features produced by an FFT technique. The efficiency of EFS features is assessed in two ways: (1) by comparing them to the relevant related work’s features of outliers and flow using four feature importance methods, and (2) by analyzing the classification performance in the scenarios with and without EFS features. This study also investigates TLSMalDetect detection performance using seven ML classification algorithms and identifies the one with the highest accuracy.