Document ranking using information quality criteria in weblog search engine

Social media has revolutionized the Web industry. Weblog medium, fundamentally,is an innovation in personal publishing. It has also come to engender a new form of social interaction on the web. Because much firsthand information is recorded in blog posts, more and more people tend to search their w...

Full description

Saved in:
Bibliographic Details
Main Author: Azimzadeh, Fatemeh
Format: Thesis
Language:English
Published: 2013
Subjects:
Online Access:http://psasir.upm.edu.my/id/eprint/38937/1/FK%202013%204R.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Social media has revolutionized the Web industry. Weblog medium, fundamentally,is an innovation in personal publishing. It has also come to engender a new form of social interaction on the web. Because much firsthand information is recorded in blog posts, more and more people tend to search their wanted information on blog sites. A major problem is that a weblog includes nontraditional features of the Web pages such as Weblog post, links, tags, and comments. Thus, the use of traditional rank algorithms like PageRank and HITS in general search engines are not appropriate to evaluate the Weblog posts because such algorithms do not consider the blog specific features. On the other hand, information quality criteria are important factors for the users. From Weblogs, which have unfiltered information without expert peer review, users expect that search engines deliver quality information for their queries. There has been little framework which consider information quality criteria in the Weblog search engine. This thesis establishes an integrated framework which incorporates information quality criteria into the ranking function of search engine on Persian weblogs. The presented framework rank Weblogs and posts based on the selected information quality criteria. Then, the ranking scores are merged with relevancy in the search engine. A ranking method is developed for the Weblog search engine where the post is considered as the document retrieved. This thesis proposes two ranking functions in the search engine which are combined with the information quality criteria, and then compared with a PageRank based ranking function. The results reveal that combination of quality criteria with relevancy, without suitable weight for each one, does not lead to user’s satisfaction. Instead, applying proper weights to both information quality factors and relevancy intelligibly improve the results of the search engine and consequently lead to user satisfaction.