Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes

Many retrieval models and techniques can be applied to retrieve theses that are most relevant to certain queries or concepts. It has been found that different retrieval methods often retrieve different sets of relevant documents. It is therefore anticipated that a particular retrieval method will us...

Bibliographic Details
Main Author: Wahlan, Mohammed Salem Farag
Format: Thesis
Language: English
Published: 2006
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf
id my-utm-ep.4067
record_format uketd_dc
spelling my-utm-ep.4067 2018-01-15T04:24:11Z 2006-03 Thesis http://eprints.utm.my/id/eprint/4067/ application/pdf en public masters
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
description Many retrieval models and techniques can be applied to retrieve the theses most relevant to a given query or concept. Different retrieval methods have been found to retrieve different sets of relevant documents, so a particular method can be expected to retrieve some relevant theses that other methods miss. This study therefore applies several methods to thesis retrieval, based on different thesis structures, similarity measures and weighting schemes. The theses were collected from the FSKSM postgraduate library and processed by digitizing, stop-word removal, stemming and index building; the results of these operations are stored in a database. The study uses 85 theses and 30 queries. Queries were compared with theses using five similarity measures and seven weighting schemes over different thesis structures. The results show that using the bibliography gives poorer results than using the title and abstract alone. For combinations of weighting schemes, the weighting schemes used with the Cosine and Tanimoto measures perform well individually but poorly in combination, whereas those used with the Forbes and Russell measures perform poorly individually but well in combination. For combinations of similarity measures, the best combination on the title structure was Cosine with the LTU weighting scheme together with Russell with the LOGG weighting scheme. On the abstract structure, the best combination was Cosine with the TFIDF weighting scheme together with Forbes with the ATFA weighting scheme, although it performed worse than the best title-structure combination. Overall, the best thesis structure is the title, and the best similarity measure is Cosine with the LTU weighting scheme.
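The similarity measures named in the description, and the idea of fusing two retrieval runs, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The record does not give the exact formulas, the LTU, LOGG or ATFA weighting schemes, or the fusion rule, so the sketch assumes the commonly cited textbook forms of the Cosine, Tanimoto, Russell (Russell-Rao) and Forbes coefficients, and it assumes a simple sum of min-max normalised scores (CombSUM) as the fusion rule. All function names, the toy query and the toy document vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine coefficient on weighted term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def tanimoto(a, b):
    """Tanimoto coefficient on weighted term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0

def _counts(a, b):
    # Binary view of the vectors: a term is "present" if its weight is > 0.
    both = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)
    in_a = sum(1 for x in a if x > 0)
    in_b = sum(1 for y in b if y > 0)
    return both, in_a, in_b, len(a)

def russell_rao(a, b):
    """Russell-Rao coefficient: shared terms over vocabulary size (assumed form)."""
    both, _, _, n = _counts(a, b)
    return both / n if n else 0.0

def forbes(a, b):
    """Forbes coefficient: n * shared / (terms in a * terms in b) (assumed form)."""
    both, in_a, in_b, n = _counts(a, b)
    return n * both / (in_a * in_b) if in_a and in_b else 0.0

def min_max(scores):
    # Rescale one run's scores to [0, 1] so runs on different scales can be added.
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def fuse(score_lists):
    # CombSUM (an assumption, not taken from the record): sum normalised scores per document.
    return [sum(col) for col in zip(*(min_max(s) for s in score_lists))]

def rank(scores):
    # Document indices ordered from highest to lowest score.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy data over a four-term vocabulary (hypothetical weights).
query = [1.0, 0.0, 2.0, 0.0]
docs = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
]

cosine_run = [cosine(query, d) for d in docs]
russell_run = [russell_rao(query, d) for d in docs]
fused = fuse([cosine_run, russell_run])
print(rank(fused))  # document 0 ranks first; document 1 shares no terms and ranks last
```

Normalising each run before summing keeps a measure with a larger numeric range, such as Forbes, from dominating the fused score; other fusion rules would serve equally well as an illustration.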
format Thesis
qualification_level Master's degree
author Wahlan, Mohammed Salem Farag
title Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes
granting_institution Universiti Teknologi Malaysia, Faculty of Computer Science and Information System
granting_department Faculty of Computer Science and Information System
publishDate 2006
url http://eprints.utm.my/id/eprint/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf
_version_ 1747814493207920640