Hybrid-based labeling scheme for mapping extensible markup language (XML) to relational database

eXtensible Markup Language (XML) is the de facto standard for data exchange over the World Wide Web in many application domains such as document repositories, digital libraries and business transactions. However, these application data are subject to frequent changes. In order to make XML into a ful...

Bibliographic Details
Main Author: Tengku Mohd Amin, Tengku Aisyah Asyikin
Format: Thesis
Published: 2020
id my-mmu-ep.12985
record_format uketd_dc
spelling my-mmu-ep.12985 2024-09-26T04:03:10Z Hybrid-based labeling scheme for mapping extensible markup language (XML) to relational database 2020-12 Tengku Mohd Amin, Tengku Aisyah Asyikin QA75-76.95 Calculating machines eXtensible Markup Language (XML) is the de facto standard for data exchange over the World Wide Web in many application domains, such as document repositories, digital libraries and business transactions. However, the data in these applications are subject to frequent changes. To make XML a full-featured data exchange format, it is essential to support not only queries but also dynamic updates (insert, update and delete operations) over XML content. On the other hand, some labeling schemes need to relabel the whole XML tree when the document is updated. As a result, the XML database size increases. A persistent, robust and durable labeling scheme that avoids relabeling is therefore highly desirable. The first part of this research concentrates on designing a robust and persistent labeling scheme that supports dynamic updates in XML databases. A relational database (RDB) is used as the repository because RDBs are still the most popular back-end storage in most organizations. Since XML and RDB use different formats, an efficient mapping technique is required. Accordingly, the second part of the research implements a mapping algorithm between XML and RDB. The contributions of the thesis can be summarized as follows. Firstly, a robust labeling scheme known as ORD-GAP is proposed. It is a range-based labeling scheme that assigns a gap between nodes to support future insertions. Secondly, a mapping scheme built upon the ORD-GAP labeling scheme is proposed to transform XML into RDB.
Finally, to demonstrate that ORD-GAP is robust enough for dynamic updates, three use cases are implemented for the evaluation, namely (i) left-most insertion, (ii) in-between insertion and (iii) right-most insertion. The ORD-GAP mapping scheme is adapted from ORDPath insertion and uses a model-mapping approach to store the XML document structure. It uses two tables to store the data from XML documents: an internal table and a text table. Experimental evaluations demonstrated that ORD-GAP outperformed some existing approaches, such as ORDPath and ME Labeling, in terms of data loading time, query retrieval time and database storage size. On average, ORD-GAP has the best storing and query retrieval times. It was observed that ORD-GAP takes longer to load data because it needs some time for initial calculations. Nevertheless, in most organizations data loading is usually executed only once, unlike query retrieval. 2020-12 Thesis https://shdl.mmu.edu.my/12985/ http://erep.mmu.edu.my/ masters Multimedia University Faculty of Computing and Informatics (FCI) EREP ID: 10290
institution Multimedia University
collection MMU Institutional Repository
topic QA75-76.95 Calculating machines
spellingShingle QA75-76.95 Calculating machines
Tengku Mohd Amin, Tengku Aisyah Asyikin
Hybrid-based labeling scheme for mapping extensible markup language (XML) to relational database
description eXtensible Markup Language (XML) is the de facto standard for data exchange over the World Wide Web in many application domains, such as document repositories, digital libraries and business transactions. However, the data in these applications are subject to frequent changes. To make XML a full-featured data exchange format, it is essential to support not only queries but also dynamic updates (insert, update and delete operations) over XML content. On the other hand, some labeling schemes need to relabel the whole XML tree when the document is updated. As a result, the XML database size increases. A persistent, robust and durable labeling scheme that avoids relabeling is therefore highly desirable. The first part of this research concentrates on designing a robust and persistent labeling scheme that supports dynamic updates in XML databases. A relational database (RDB) is used as the repository because RDBs are still the most popular back-end storage in most organizations. Since XML and RDB use different formats, an efficient mapping technique is required. Accordingly, the second part of the research implements a mapping algorithm between XML and RDB. The contributions of the thesis can be summarized as follows. Firstly, a robust labeling scheme known as ORD-GAP is proposed. It is a range-based labeling scheme that assigns a gap between nodes to support future insertions. Secondly, a mapping scheme built upon the ORD-GAP labeling scheme is proposed to transform XML into RDB. Finally, to demonstrate that ORD-GAP is robust enough for dynamic updates, three use cases are implemented for the evaluation, namely (i) left-most insertion, (ii) in-between insertion and (iii) right-most insertion. The ORD-GAP mapping scheme is adapted from ORDPath insertion and uses a model-mapping approach to store the XML document structure.
It uses two tables to store the data from XML documents: an internal table and a text table. Experimental evaluations demonstrated that ORD-GAP outperformed some existing approaches, such as ORDPath and ME Labeling, in terms of data loading time, query retrieval time and database storage size. On average, ORD-GAP has the best storing and query retrieval times. It was observed that ORD-GAP takes longer to load data because it needs some time for initial calculations. Nevertheless, in most organizations data loading is usually executed only once, unlike query retrieval.
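The gap idea and the three insertion cases named in the description can be illustrated with a minimal Python sketch. This is not the ORD-GAP algorithm from the thesis, whose details are not given in this record; the gap size `GAP`, the integer sibling labels, and the helper names `initial_labels` and `insert_label` are all assumptions made only for illustration.

```python
GAP = 10  # hypothetical gap size; the record does not state the value used


def initial_labels(n, gap=GAP):
    """Label n siblings gap, 2*gap, ..., leaving unused integers between them."""
    return [gap * (i + 1) for i in range(n)]


def insert_label(labels, pos, gap=GAP):
    """Pick a label for a new sibling inserted at index pos of a sorted list.

    pos == 0 is a left-most insertion, pos == len(labels) is a right-most
    insertion, and any other pos is an in-between insertion. Existing labels
    are never changed. Returns None when no unused integer is left in the gap.
    """
    if not labels:
        return gap
    if pos == len(labels):                   # right-most: extend past the last label
        return labels[-1] + gap
    upper = labels[pos]
    lower = labels[pos - 1] if pos > 0 else 0  # left-most: use the gap before the first label
    if upper - lower < 2:                    # gap exhausted, relabeling would be needed
        return None
    return (lower + upper) // 2              # midpoint keeps room on both sides


labels = initial_labels(3)        # three existing siblings
left = insert_label(labels, 0)    # left-most insertion
middle = insert_label(labels, 1)  # in-between insertion
right = insert_label(labels, 3)   # right-most insertion
```

In this sketch an insertion only consumes part of a gap, so existing nodes keep their labels until a gap runs out; how the thesis handles an exhausted gap is not stated in this record.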
format Thesis
qualification_level Master's degree
author Tengku Mohd Amin, Tengku Aisyah Asyikin
title Hybrid-based labeling scheme for mapping extensible markup language (XML) to relational database
title_sort hybrid-based labeling scheme for mapping extensible markup language (xml) to relational database
granting_institution Multimedia University
granting_department Faculty of Computing and Informatics (FCI)
publishDate 2020
_version_ 1811768020953989120