A Machine Learning Classification Approach to Detect TLS-Based Malware Using Entropy-Based Flow Set Features

Bibliographic Details
Main Author: Keshkeh, Kinan
Format: Thesis
Language: English
Published: 2022
Online Access: http://eprints.usm.my/60044/1/24%20Pages%20from%20KINAN%20KESHKEH.pdf
Description
Summary: As internet encryption has grown to safeguard users’ privacy, malware has evolved to leverage encryption protocols such as Transport Layer Security (TLS) to conceal its malicious connections. Because decrypting TLS network traffic before it reaches the Intrusion Detection System (IDS) is difficult and impractical, numerous research studies have focused on anomaly-based malware detection without decryption, using various features and Machine Learning (ML) algorithms. Nonetheless, several of these studies used flow features with low feature importance and poor capability to distinguish malicious flows, such as the number of packets sent and received in a flow or the flow's duration. Furthermore, the outlier-based and frequency-based flow feature transformations (FFT) applied to mitigate these weak flow features have several flaws. This thesis proposes a TLS-based malware detection approach (TLSMalDetect) based on ML classification to address the flow feature utilization limitations in related work. TLSMalDetect includes periodicity-independent entropy-based flow set (EFS) features produced by an FFT technique. The efficiency of EFS features is assessed in two ways: (1) by comparing them to the outlier and flow features of related work using four feature importance methods, and (2) by analyzing classification performance in scenarios with and without EFS features. This study also investigates TLSMalDetect's detection performance using seven ML classification algorithms and identifies the one with the highest accuracy.
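
Illustrative note: the summary does not say how the EFS features are computed, so the following is only a minimal sketch of the general idea of an entropy-based feature over a set of flows, not the thesis's method. It computes the Shannon entropy of one basic flow attribute (packets sent) across a group of flows. The function name `shannon_entropy`, the choice of attribute, and the way flows are grouped into a set are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    # Sum p * log2(1/p) over each distinct value, with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical flow sets: packets sent per flow. How flows are grouped
# into a set is an assumption; the summary does not specify it.
identical_flows = [12, 12, 12, 12]  # every flow looks the same
varied_flows = [3, 12, 40, 7]       # every flow differs

print(shannon_entropy(identical_flows))
print(shannon_entropy(varied_flows))
```

In this sketch, a set of identical flows yields zero entropy, while a set in which every flow differs yields the maximum for its size. A feature of this kind summarizes how varied a group of flows is without depending on when the flows occur.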