Wiki saga: an approach for the digitisation, processing and visualisation of historical documents

A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its fir...

Bibliographic Details
Main Author: Tan, Daniel Yong Wen
Format: Thesis
Language: English
Published: 2015
Subjects: T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf
id my-unimas-ir.10769
record_format uketd_dc
spelling my-unimas-ir.10769 2023-08-24T02:12:42Z Wiki saga: an approach for the digitisation, processing and visualisation of historical documents 2015 Tan, Daniel Yong Wen T Technology (General) A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through printed pages is an arduous task in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. This research proposes a pipeline process that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator to display events as reported. Due to the difficulties of the task, the current state-of-the-art approach makes use of human power as part of mass digitisation projects by Google. A prototype system, Wiki SaGa, visualises the digitised documents in conjunction with the generated timeline. Through Wiki SaGa, researchers who use the Sarawak Gazette can search for specific information on an event that happened in Sarawak during a certain timeframe by using the timeline display. By extracting named entities and displaying them within events in a timeline, researchers can have a summary of the event. By visualising events in a timeline, semantic patterns are recognised and related events can be identified. Through this research, Wiki SaGa, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents is also now available to researchers. Universiti Malaysia Sarawak, (UNIMAS) 2015 Thesis http://ir.unimas.my/id/eprint/10769/ http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf text en validuser masters Universiti Malaysia Sarawak, (UNIMAS) Faculty of Computer Science and Information Technology. Indexed by Scopus
institution Universiti Malaysia Sarawak
collection UNIMAS Institutional Repository
language English
topic T Technology (General)
spellingShingle T Technology (General)
Tan, Daniel Yong Wen
Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
description A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through printed pages is an arduous task in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. This research proposes a pipeline process that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator to display events as reported. Due to the difficulties of the task, the current state-of-the-art approach makes use of human power as part of mass digitisation projects by Google. A prototype system, Wiki SaGa, visualises the digitised documents in conjunction with the generated timeline. Through Wiki SaGa, researchers who use the Sarawak Gazette can search for specific information on an event that happened in Sarawak during a certain timeframe by using the timeline display. By extracting named entities and displaying them within events in a timeline, researchers can have a summary of the event. By visualising events in a timeline, semantic patterns are recognised and related events can be identified. Through this research, Wiki SaGa, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents is also now available to researchers.
format Thesis
qualification_level Master's degree
author Tan, Daniel Yong Wen
author_facet Tan, Daniel Yong Wen
author_sort Tan, Daniel Yong Wen
title Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_short Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_full Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_fullStr Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_full_unstemmed Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_sort wiki saga: an approach for the digitisation, processing and visualisation of historical documents
granting_institution Universiti Malaysia Sarawak, (UNIMAS)
granting_department Faculty of Computer Science and Information Technology.
publishDate 2015
url http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf
_version_ 1783728072807153664