A malicious URL detection framework using priority coefficient and feature evaluation

Bibliographic Details
Main Author: Rafsanjani, Ahmad Sahban
Format: Thesis
Language: English
Published: 2023
Online Access: http://eprints.utm.my/102826/1/AhmadSahbanRafsanjaniPRAZAK2023.pdf.pdf
Description
Summary: Malicious Uniform Resource Locators (URLs) are one of the major threats in cybersecurity. Cyber attackers spread malicious URLs to carry out attacks such as phishing and malware distribution, which lead unsuspecting visitors into scams, resulting in monetary loss, information theft, and other threats to website users. At present, malicious URLs are detected using blacklist and heuristic methods, but these methods cannot detect new and obfuscated URLs. Machine learning and deep learning methods have become popular ways of improving on these earlier methods. However, they are entirely data-dependent, and a large, up-to-date dataset is necessary to train an effective detector. In addition, their accuracy and detection performance depend largely on the quality of the training data. This research developed a framework that detects malicious URLs based on predefined static feature classification, using priority coefficients and feature evaluation methods. The feature classification employed 39 classes of blacklist, lexical, host-based, and content-based features. A dataset containing 2000 real-world URLs was gathered from two popular phishing and malware tracking websites, URLhaus and PhishTank. In the experiment, the proposed framework was evaluated against three supervised machine learning methods: Support Vector Machine (SVM), Random Forest (RF), and Bayesian Network (BN). The results showed that the proposed framework outperformed these methods. In addition, the proposed framework was benchmarked in terms of accuracy and precision against three comprehensive malicious URL detection methods: Precise Phishing Detection with Recurrent Convolutional Neural Networks, Li, and URLNet. The results showed that the proposed framework achieved a detection accuracy of 98.95% and a precision of 98.60%. In sum, the developed malicious URL detection framework significantly improves detection accuracy.
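
The summary does not give the thesis's actual features, coefficients, or decision rule, so the following is only a minimal sketch of what scoring a URL from weighted static features and then measuring accuracy and precision might look like. Every name here (`PRIORITY`, `THRESHOLD`, `extract_features`, `score`, `is_malicious`, `accuracy_precision`), the four example features, and all numeric values are hypothetical illustrations, not taken from the thesis.

```python
import re
from urllib.parse import urlparse

# Hypothetical priority coefficients: illustrative values only,
# not the coefficients or the 39 feature classes used in the thesis.
PRIORITY = {
    "blacklisted": 1.0,    # blacklist feature
    "ip_host": 0.6,        # host-based feature: host is a raw IPv4 address
    "has_at_symbol": 0.4,  # lexical feature: '@' appears in the URL
    "long_url": 0.2,       # lexical feature: URL longer than 75 characters
}
THRESHOLD = 0.5  # hypothetical decision threshold


def extract_features(url, blacklist):
    """Return a dict mapping each feature name to True/False for this URL."""
    host = urlparse(url).hostname or ""
    return {
        "blacklisted": host in blacklist,
        "ip_host": re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,
        "has_at_symbol": "@" in url,
        "long_url": len(url) > 75,
    }


def score(features):
    """Sum the priority coefficients of the features that are present."""
    return sum(PRIORITY[name] for name, present in features.items() if present)


def is_malicious(url, blacklist=frozenset()):
    """Flag the URL when its weighted score reaches the threshold."""
    return score(extract_features(url, blacklist)) >= THRESHOLD


def accuracy_precision(predicted, actual):
    """Compute accuracy and precision from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision
```

With these made-up weights, `is_malicious("http://192.168.0.1/login.php")` is flagged because the raw-IP host alone scores 0.6, while `is_malicious("https://example.com/")` is not. Accuracy is the share of all URLs classified correctly, and precision is the share of flagged URLs that are truly malicious, which are the two metrics the summary reports.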