A comparative study of instance-based schema matching in relational database /
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: |
Kuala Lumpur :
Kulliyyah of Information and Communication Technology, International Islamic University Malaysia,
2017
|
Subjects: | |
Online Access: | http://studentrepo.iium.edu.my/handle/123456789/5655 |
Summary: | Schema matching is considered an indispensable process for database integration in many contemporary database systems. The aim of schema matching is to identify correspondences across schemas, which ultimately serves the data integration process. The main concern in data integration is to support merging decisions by establishing correspondences among the attributes of heterogeneous data sources. Numerous schema matching techniques that use database instances to detect correspondences between attributes have been proposed in the literature. However, no single technique provides an accurate and comprehensive match for different types of data. Some techniques treat numeric values as strings, which adversely affects the match and the quality of the results. Likewise, other techniques treat textual instances as numeric, which can reduce the accuracy of the match. This thesis therefore investigates the performance of two different instance-based schema matching techniques. The study explores the strengths and weaknesses of each technique over various types of data sets. It develops a syntactic instance-based schema matching technique that uses regular expressions (RegEx) with the WordNet database, and selects Google similarity as a semantic instance-based schema matching technique. Both methods are evaluated over three data types: (i) numeric, (ii) alphabetic, and (iii) mixed. Several analyses were performed on real and synthetic data sets to examine match accuracy with respect to precision (P), recall (R) and F-measure (F). |
---|---|
Physical Description: | xi, 104 leaves : illustrations ; 30 cm. |
Bibliography: | Includes bibliographical references (leaves 94-104). |
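
The summary reports match accuracy as precision (P), recall (R) and F-measure (F). As a rough illustration of how these standard measures are usually computed for a set of attribute correspondences, here is a minimal Python sketch. The function name `match_scores` and the attribute pairs are hypothetical and are not taken from the thesis; the thesis may compute or weight these measures differently.

```python
def match_scores(predicted, reference):
    """Return (precision, recall, f_measure) for two sets of attribute pairs.

    predicted: correspondences proposed by a matcher (hypothetical input)
    reference: correspondences accepted as correct (hypothetical input)
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)

    # Precision: share of proposed matches that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct matches that were proposed.
    recall = true_positives / len(reference) if reference else 0.0

    # F-measure: harmonic mean of precision and recall (balanced form).
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative data only: pairs of (source attribute, target attribute).
reference = {("cust_name", "client"), ("phone", "tel"), ("zip", "postcode")}
predicted = {("cust_name", "client"), ("phone", "tel"), ("zip", "city")}

p, r, f = match_scores(predicted, reference)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

In this made-up example two of the three proposed pairs are correct and two of the three reference pairs are found, so all three measures come out equal.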