Detecting malicious PDF document using support vector machine supervised learning algorithm

Bibliographic Details
Main Author: Dabiranzohouri, Miranda
Format: Thesis
Published: 2014
Description
Summary: Malicious PDF files remain a real threat in the cyber world. In practice, they continue to harm large numbers of computer users, even after several high-profile security incidents. Despite a series of security patches issued by Adobe and other vendors, many users still have vulnerable client software installed on their computers. Furthermore, the expressiveness of the PDF format enables attackers to evade detection with little effort. Apart from traditional antivirus products, which are always a step behind attackers, few methods are known that can be deployed to protect end-user systems. This thesis proposes a machine-learning-based method for detecting malicious PDF documents which, instead of analyzing JavaScript or any other content, makes use of essential differences in the structural properties of malicious and benign PDF files. A Support Vector Machine (SVM) is used to distinguish benign from malicious PDF files. The collected dataset consists of 2,190 instances, of which 404 are malicious and 1,786 are benign. The experimental results show that the SVM gives better results with a limited number of features than the MLP and BayesNet methods.
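
To make the idea of structural features concrete, the following is a minimal sketch of how a PDF's raw bytes might be turned into a numeric feature vector without inspecting any JavaScript content. The token list, the function name `structural_features`, and the sample document are all illustrative assumptions; the summary does not say which structural properties the thesis actually uses.

```python
import re

# Hypothetical feature set: these PDF tokens are an illustrative assumption,
# not the features used in the thesis.
STRUCTURAL_TOKENS = (
    b"obj", b"endobj", b"stream", b"endstream",
    b"/Page", b"/JS", b"/JavaScript", b"/OpenAction",
)


def structural_features(pdf_bytes: bytes) -> dict:
    """Count how often each structural token occurs in the raw PDF bytes."""
    features = {}
    for token in STRUCTURAL_TOKENS:
        # Reject matches that are part of a longer alphabetic name, so that
        # "obj" is not counted inside "endobj" and "/Page" not inside "/Pages".
        pattern = re.compile(
            rb"(?<![A-Za-z])" + re.escape(token) + rb"(?![A-Za-z])"
        )
        features[token.decode("ascii")] = len(pattern.findall(pdf_bytes))
    return features


# Tiny hand-written fragment (not a complete, valid PDF) used only for illustration.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 3 0 R >>\nendobj\n"
    b"3 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
)
print(structural_features(sample))
```

Vectors of this kind, one per file with a benign or malicious label, are the sort of input a supervised classifier such as an SVM could be trained on. Note that the dataset described above is imbalanced (404 of 2,190 instances, roughly 18%, are malicious), which any such training setup would need to take into account.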