Increasing data anonymity using privacy techniques and advanced encryption standard

Data security and data privacy have been important areas of concern in recent years. Datasets often contain sensitive data fields, exposure of which may jeopardize the individuals associated with the data. As technology advances, attackers keep developing new methods to...

Bibliographic Details
Main Author: Thamer, Khalil Esmeel
Format: Thesis
Language: English
Published: 2020
Subjects:
Online Access: http://umpir.ump.edu.my/id/eprint/34420/1/Increasing%20data%20anonymity%20using%20privacy%20techniques.pdf
id my-ump-ir.34420
record_format uketd_dc
spelling my-ump-ir.34420 2022-06-16T02:34:11Z 2020-09 Thesis http://umpir.ump.edu.my/id/eprint/34420/ pdf en public masters
institution Universiti Malaysia Pahang Al-Sultan Abdullah
collection UMPSA Institutional Repository
language English
topic QA75 Electronic computers. Computer science
description Data security and data privacy have been important areas of concern in recent years. Datasets often contain sensitive data fields, exposure of which may jeopardize the individuals associated with the data. As technology advances, attackers keep developing new methods to gain access to sensitive information from banks, national and voter identification cards, etc. To address this issue, privacy techniques hinder the identification of a person by increasing the anonymity of the individual in the dataset, thereby protecting sensitive information. This thesis addresses the problem of protecting sensitive user data in a dataset from privacy and security violations and from malicious activity by attackers, using a technique that integrates privacy techniques with a security technique, the Advanced Encryption Standard (AES). The research objectives are: to design an integrated technique that combines privacy techniques with a security technique for protecting data in the dataset; to implement the integrated technique to protect sensitive user data in the dataset; and to validate the integrated technique by comparing accuracy results before and after applying the privacy techniques to the dataset, using data mining techniques. The research methodology consists of three phases. The first phase involves selecting a suitable dataset and investigating the techniques and frameworks related to security and privacy. The second phase involves implementing the integration of the privacy techniques with AES. The third phase involves evaluating the classification results obtained after running the processes of the proposed technique.
In this research, the security technique used is AES; the privacy techniques are differential privacy, k-Anonymity, and sample-uniqueness; and the data mining techniques, run under Weka, are Naive Bayes, J48, and Neural Network. Many tests were conducted in the Weka Experimenter Environment to classify the data and check its usefulness for analysis. Classification results before and after applying the privacy techniques were compared. Differential privacy applied to the gender column of the dataset gave the best accuracy, outperforming the k-Anonymity and sample-uniqueness techniques. However, k-Anonymity and sample-uniqueness applied to the age and cholesterol columns gave better accuracy than differential privacy. In addition, for classification accuracy in the Weka Experimenter Environment, Naive Bayes gave better results than Neural Network and J48. To protect the dataset from external attackers, AES was used to encrypt the dataset after it had passed through the privacy technique. To increase safety, the encrypted file and the general key used for the encryption were each split into five parts, and each subfile was attached to its subkey and stored on one of five servers. The test results demonstrate the success of integrating privacy techniques with the security technique to protect sensitive user data in the dataset. Future work can focus on improving the classification accuracy of the protection schemes by examining more privacy techniques, applying them to big data, and carrying out experiments on big data in the cloud.
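As an illustration of two of the privacy measures named in the abstract, the following minimal Python sketch computes the k-Anonymity level and the sample-uniqueness ratio of a small table. It is not taken from the thesis: the records, the choice of age and gender as quasi-identifiers, and the function names are all hypothetical, and sample uniqueness is assumed here to mean the fraction of records whose quasi-identifier combination occurs exactly once in the sample.

```python
from collections import Counter

# Hypothetical toy records; the thesis dataset itself is not reproduced here.
records = [
    {"age": "40-49", "gender": "F", "cholesterol": "high"},
    {"age": "40-49", "gender": "F", "cholesterol": "normal"},
    {"age": "50-59", "gender": "M", "cholesterol": "high"},
    {"age": "50-59", "gender": "M", "cholesterol": "high"},
    {"age": "60-69", "gender": "F", "cholesterol": "normal"},
]


def class_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def k_anonymity(rows, quasi_identifiers):
    """Return k: the size of the smallest equivalence class (0 for no rows)."""
    sizes = class_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def sample_uniqueness(rows, quasi_identifiers):
    """Return the fraction of rows that are unique on the quasi-identifiers."""
    if not rows:
        return 0.0
    sizes = class_sizes(rows, quasi_identifiers)
    unique_rows = sum(1 for n in sizes.values() if n == 1)
    return unique_rows / len(rows)


# Assumed quasi-identifiers for this example only.
qi = ("age", "gender")
k = k_anonymity(records, qi)
uniq = sample_uniqueness(records, qi)
print(f"k = {k}, sample uniqueness = {uniq:.2f}")
```

In this toy table the single record in the 60-69 age band is unique, so the table is only 1-anonymous; generalizing or suppressing that record would raise k, which is the kind of trade-off against classification accuracy that the abstract describes.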
format Thesis
qualification_level Master's degree
author Thamer, Khalil Esmeel
title Increasing data anonymity using privacy techniques and advanced encryption standard
granting_institution Universiti Malaysia Pahang
granting_department Faculty of Computing
publishDate 2020
_version_ 1783732190151966720