Android mobile malware detection model based on permission features using machine learning approach

Bibliographic Details
Main Author: Sharfah Ratibah, Tuan Mat
Format: Thesis
Language: English
Published: 2022
Online Access: http://umpir.ump.edu.my/id/eprint/37673/1/ir.Android%20mobile%20malware%20detection%20model%20based%20on%20permission%20features%20using%20machine%20learning%20approach.pdf
Description
Summary: The use of Android mobile devices has grown rapidly, and Android has gained massive popularity in the mobile market. These devices have become some of the most valued possessions of people around the world. Android's popularity and its position as the primary mobile operating system have raised concerns over malware threats. Unscrupulous authors have developed malicious software such as root exploits, botnets, Trojan horses, and spyware and published it on Google Play for profit. Android malware can steal user credentials and abuse device resources. Different techniques have been adopted to detect and prevent the spread of Android malware, including anomaly-based, signature-based, and hybrid detection. Nevertheless, attackers have found novel ways to avoid detection by current technologies. This study proposes an Android malware detection model that uses a Bayesian classifier and a multilayer perceptron classifier with static analysis. The study focused on the permission features of Android mobile devices. Two datasets were retrieved from the Androzoo and Drebin databases. The first contains 10,000 samples and the second contains 96,074 samples. Several experiments were conducted to learn the behaviour of the permission features and to find the best accuracy for each approach. Chi-square and information gain algorithms were used for feature selection, with the aim of learning how accuracy responds to the number of features selected. Both datasets were then evaluated using machine learning and deep learning approaches to determine which gives the best malware detection accuracy. In validation, the machine learning approach obtained 85.4% accuracy on the 96,074 samples and 91.1% on the 10,000 samples. The deep learning approach obtained 98.02% accuracy on the 96,074 samples and 98% on the 10,000 samples. The best results for both datasets therefore came from the deep learning approach. In conclusion, deep learning was more accurate on both the smaller and the larger dataset, while machine learning detected malware more accurately on the smaller dataset than on the larger one.
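
The feature selection step named in the summary (chi-square and information gain over permission features) can be illustrated with a minimal sketch. This is not the thesis's implementation: the eight toy samples, the three permission names, and the helper functions `entropy`, `information_gain`, `chi_square`, and `rank_features` are all assumptions made for illustration, and each permission is assumed to be encoded as a binary feature (1 if the app requests it, 0 otherwise).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the expected entropy after splitting on a binary feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(feature, labels) if x == value]
        if subset:
            gain -= (len(subset) / n) * entropy(subset)
    return gain


def chi_square(feature, labels):
    """Pearson chi-square statistic for the 2x2 table of a binary feature against a binary label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for x in (0, 1):
        for y in (0, 1):
            row_total = sum(observed[(x, k)] for k in (0, 1))
            col_total = sum(observed[(k, y)] for k in (0, 1))
            expected = row_total * col_total / n
            if expected > 0:
                stat += (observed[(x, y)] - expected) ** 2 / expected
    return stat


def rank_features(samples, labels, names, score):
    """Return feature names ordered from highest to lowest score."""
    columns = zip(*samples)
    scored = [(score(list(col), labels), name) for col, name in zip(columns, names)]
    return [name for _, name in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical toy data: one row per app, one column per requested permission.
names = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
samples = [
    [1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 0],  # labelled malware
    [0, 1, 1], [0, 1, 0], [0, 1, 0], [0, 1, 1],  # labelled benign
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = malware, 0 = benign

ranking_ig = rank_features(samples, labels, names, information_gain)
ranking_chi = rank_features(samples, labels, names, chi_square)
print(ranking_ig, ranking_chi)
```

In this toy data only `SEND_SMS` separates the two classes, so both scores rank it first; `INTERNET` is requested by every app and `READ_CONTACTS` is evenly split, so neither carries information about the label. Keeping only the top-ranked features before training a classifier is one way to study how accuracy changes with the number of features.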