An efficient relational to column oriented database schema transformation technique

Bibliographic Details
Main Author: Zaidi, Norwini
Format: Thesis
Language: English
Published: 2019
Subjects:
Online Access: http://psasir.upm.edu.my/id/eprint/90672/1/FSKTM%202020%201%20IR.pdf
Description
Summary: NoSQL databases were introduced to meet the growing demands placed on database management systems and the need to manage huge amounts of unstructured data. Migrating data from relational databases to NoSQL databases has therefore become an important process in database management, owing to the limitations of relational databases. Schema transformation is an important step in data migration, and various techniques have been proposed to improve schema transformation and data migration from relational to NoSQL databases. The most common schema transformation technique for NoSQL databases is denormalization. However, denormalization introduces unnecessary data duplication in the NoSQL database, which increases storage size. Furthermore, NoSQL databases have limitations of their own: they are limited in table joining and cannot perform queries on multiple tables. Existing schema transformation techniques based on nested table merging describe the merging of only two related tables. These inefficient techniques force queries to run against multiple tables, which results in high query processing time. This research proposes a schema transformation technique for migrating data from a relational database to a column-oriented database. The technique has three main steps: denormalization with read pattern, nested and multiple nested table merging, and rowkey design. Together, these steps aim to reduce data redundancy and storage size and to produce efficient query performance. In this technique, the read pattern identifies the access key of the query. Nested and multiple nested table merging combines the tables that share the same access key into a nested form. On a column-oriented database, this merging allows a query to retrieve its data from a single table, which improves query performance. 
Meanwhile, the rowkey design determines the rowkey from the access keys identified by the read pattern technique. For the DELL DVD dataset, the experimental results showed that the proposed schema transformation technique reduced data redundancy by eight column families, which reduced the storage size by 13.83% and improved query performance time by 29.28%. For the Employees dataset, the proposed technique reduced data redundancy by five column families, which reduced the storage size by 15.67% and improved query performance time by 29.13%.
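
As an illustration only, the three steps summarized above can be sketched in Python. The table names, the access key customer_id, the rowkey format, and both helper functions are assumptions made for this sketch and are not taken from the thesis; a plain dictionary stands in for the column-oriented store.

```python
from collections import defaultdict


def merge_by_access_key(tables, access_key):
    """Nest rows from every table that carries access_key under one entry per key value."""
    merged = defaultdict(dict)
    for table_name, rows in tables.items():
        if not rows or access_key not in rows[0]:
            continue  # table does not share the access key, so it is not merged
        for row in rows:
            key_value = row[access_key]
            # Each source table becomes one nested group (comparable to a
            # column family); repeated rows are appended to a list.
            rest = {k: v for k, v in row.items() if k != access_key}
            merged[key_value].setdefault(table_name, []).append(rest)
    return dict(merged)


def make_rowkey(access_key, value):
    """Build a rowkey from the access key (hypothetical format, for illustration)."""
    return f"{access_key}:{value}"


# Hypothetical relational tables; "customer_id" is the access key that a
# read pattern is assumed to have identified.
tables = {
    "customers": [{"customer_id": 1, "name": "Ana"}],
    "orders": [
        {"customer_id": 1, "order_id": 10, "total": 25.0},
        {"customer_id": 1, "order_id": 11, "total": 40.0},
    ],
    "products": [{"product_id": 7, "title": "DVD"}],  # no shared key, left out
}

nested = merge_by_access_key(tables, "customer_id")
store = {make_rowkey("customer_id", k): v for k, v in nested.items()}
print(store)
```

In this sketch a query for one customer reads a single entry of store instead of joining customers and orders, which is the effect the summary attributes to nested table merging.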