Dual indexing and mutual summation based keyword search method for XML databases

XML has become the most common means for publication, storage and exchange of data over the Internet. As a result, a huge amount of information is stored and represented in XML, and research on keyword search in XML documents is on the increase as it allows users to find information they are interes...

Bibliographic Details
Main Author: Sethuramalingam, Selvaganesan
Format: Thesis
Published: 2014
Subjects: QA76.75-76.765 Computer software
id my-mmu-ep.6268
record_format uketd_dc
spelling my-mmu-ep.6268 2016-01-12T11:06:26Z 2014-06 Thesis http://shdl.mmu.edu.my/6268/ http://library.mmu.edu.my/diglib/onlinedb/dig_lib.php phd doctoral Multimedia University Faculty of Computing and Informatics
institution Multimedia University
collection MMU Institutional Repository
topic QA76.75-76.765 Computer software
description XML has become the most common means for publication, storage and exchange of data over the Internet. As a result, a huge amount of information is stored and represented in XML, and research on keyword search in XML documents is increasing, because keyword search lets users find the information they are interested in without having to know the underlying database schema or a complex query language. In XML keyword search, accurately identifying the user's search intention and ranking the results in the presence of keyword ambiguities have been challenging problems. In this thesis, we propose an XML keyword search using a Dual indexing and Mutual summation based Algorithm (XDMA) to address the problems in XReal and other XML keyword search approaches. Our proposed approach builds dual indices, namely a tag information table and a data node information table, for the structural nodes and data nodes of an XML database, respectively. Moreover, we propose a keyword search technique that selects all possible T-typed nodes for a given query using two-level matching between the two indices. Using this search technique, the keywords in a given query can be identified and distinguished as tags or data while the query is being searched. Furthermore, a new keyword ambiguity, Ambiguity 4, is identified and addressed in this thesis: a keyword can exist as the name of a tag for node types having different data (text) values, and vice versa. Subsequently, a new concept called mutual summation is proposed for a pair of random variables. By incorporating the concept of dependence of the two indices and the concept of mutual summation, we define the Mutual Score (MScore) between selected tags and query keywords to find the desired node of type T.
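The dual-index idea in the description can be illustrated with a minimal Python sketch. It is not the thesis's XDMA implementation: the table layouts, the function names `build_indices` and `classify_keywords`, and the sample document are all assumptions made for illustration, and the MScore ranking step is omitted because the record does not give its formula. The sketch only shows how keeping one table for tag names and another for text values lets a query keyword be labelled as a tag, as data, or as both.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; the word "title" occurs both as a tag
# name and as a text value, so it is ambiguous between the two roles.
SAMPLE_XML = """
<bib>
  <book><title>XML search</title><author>Lee</author></book>
  <article><title>Keyword ranking</title><author>Title</author></article>
</bib>
"""

def build_indices(xml_text):
    """Build two lookup tables from one XML document (illustrative layout).

    tag_table:  lower-cased tag name -> set of root-to-node paths
    data_table: lower-cased text word -> set of paths of the enclosing element
    """
    root = ET.fromstring(xml_text)
    tag_table = defaultdict(set)
    data_table = defaultdict(set)

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        tag_table[node.tag.lower()].add(path)
        # Only the element's leading text is indexed; tail text is ignored.
        for word in (node.text or "").split():
            data_table[word.lower()].add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return tag_table, data_table

def classify_keywords(query, tag_table, data_table):
    """Label each query keyword by matching it against both tables."""
    roles = {}
    for keyword in query.lower().split():
        found = set()
        if keyword in tag_table:
            found.add("tag")
        if keyword in data_table:
            found.add("data")
        roles[keyword] = found
    return roles
```

Calling `classify_keywords("title lee", *build_indices(SAMPLE_XML))` labels `title` as both tag and data and `lee` as data only. A keyword that lands in both tables is the kind of case a ranking step would then have to resolve.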
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Sethuramalingam, Selvaganesan
title Dual indexing and mutual summation based keyword search method for XML databases
granting_institution Multimedia University
granting_department Faculty of Computing and Informatics
publishDate 2014