Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its fir...
Main Author: | Tan, Daniel Yong Wen |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2015 |
Subjects: | T Technology (General) |
Online Access: | http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf |
id | my-unimas-ir.10769 |
---|---|
record_format | uketd_dc |
spelling | my-unimas-ir.10769 2023-08-24T02:12:42Z Wiki saga: an approach for the digitisation, processing and visualisation of historical documents 2015 Tan, Daniel Yong Wen T Technology (General) A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through its printed pages is an arduous task in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. This research proposes a pipeline process that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator to display events as reported. Due to the difficulties of the task, the current state-of-the-art approach makes use of human power as part of mass digitisation projects by Google. A prototype system, Wiki SaGa, visualises the digitised documents in conjunction with the generated timeline. Through Wiki Saga, researchers who use the Sarawak Gazette can search for specific information on an event that happened in Sarawak during a certain timeframe by using the timeline display. By extracting named entities and displaying them within events in a timeline, researchers can obtain a summary of the event. By visualising events in a timeline, semantic patterns are recognised and related events can be identified. Through this research, Wiki Saga, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents is also now available to researchers. Universiti Malaysia Sarawak, (UNIMAS) 2015 Thesis http://ir.unimas.my/id/eprint/10769/ http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf text en validuser masters Universiti Malaysia Sarawak, (UNIMAS) Faculty of Computer Science and Information Technology. Indexed by Scopus |
institution | Universiti Malaysia Sarawak |
collection | UNIMAS Institutional Repository |
language | English |
topic | T Technology (General) |
spellingShingle | T Technology (General) Tan, Daniel Yong Wen Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
description | A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through its printed pages is an arduous task in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. This research proposes a pipeline process that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator to display events as reported. Due to the difficulties of the task, the current state-of-the-art approach makes use of human power as part of mass digitisation projects by Google. A prototype system, Wiki SaGa, visualises the digitised documents in conjunction with the generated timeline. Through Wiki Saga, researchers who use the Sarawak Gazette can search for specific information on an event that happened in Sarawak during a certain timeframe by using the timeline display. By extracting named entities and displaying them within events in a timeline, researchers can obtain a summary of the event. By visualising events in a timeline, semantic patterns are recognised and related events can be identified. Through this research, Wiki Saga, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents is also now available to researchers. |
format | Thesis |
qualification_level | Master's degree |
author | Tan, Daniel Yong Wen |
author_facet | Tan, Daniel Yong Wen |
author_sort | Tan, Daniel Yong Wen |
title | Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
title_short | Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
title_full | Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
title_fullStr | Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
title_full_unstemmed | Wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
title_sort | wiki saga: an approach for the digitisation, processing and visualisation of historical documents |
granting_institution | Universiti Malaysia Sarawak, (UNIMAS) |
granting_department | Faculty of Computer Science and Information Technology. |
publishDate | 2015 |
url | http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf |
_version_ | 1783728072807153664 |