Evaluation of Visual Network Algorithms on Historical Documents

Bibliographic Details
Main Author: Khairunnisa, Binti Ibrahim
Format: Thesis
Language: English
Published: 2020
Online Access: http://ir.unimas.my/id/eprint/29965/1/Evaluation...pdf
Description
Summary: A visual network is a special type of graph representing real-life systems, in which the vertices carry attributes and the edges represent relationships between them. Network visualisation facilitates the comprehension of texts, especially historical documents, where important events, facts and relationships are recorded. This study proposes a generic framework for evaluating visual network algorithms in order to find the best network representation of a document. The framework evaluates both the graph layout algorithm and the graph clustering algorithm, since both contribute to a good network. It was used to evaluate three graph layout algorithms and three graph clustering algorithms on the historical SAGA dataset. The evaluation found that the FA2 algorithm combined with the MC algorithm produces the best network representation for SAGA. It also demonstrates that the scores given by different evaluation metrics can disagree with one another, because each metric is built on a different notion of what makes a good cluster. The framework was also applied to the Biotext and dBPedia datasets, and the findings imply that the performance of an algorithm, be it a layout or a clustering algorithm, depends on the structure of the document itself. Therefore, the algorithms must be evaluated afresh for each new document. The study also proposes a simple but reliable cluster evaluation metric called the NPL-C metric. The metric rates both the internal and the external structure of the clusters in a given network using the concepts of average path length and conductance.
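
The two quantities the summary names as the basis of the NPL-C metric can be illustrated with a minimal Python sketch. This record does not give the NPL-C formula, so the sketch does not attempt to reproduce it. It only computes the two ingredients under their common textbook definitions: conductance (edges leaving the cluster divided by the smaller of the two degree volumes) for the external structure, and average shortest-path length inside the cluster for the internal structure. The function names, the adjacency-dictionary representation and the toy graph are assumptions made for illustration, not taken from the thesis.

```python
from collections import deque


def conductance(adj, cluster):
    """Edges leaving the cluster divided by the smaller degree volume.

    Assumed standard definition; lower values mean a better separated cluster.
    """
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj if u not in cluster)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def average_path_length(adj, cluster):
    """Mean shortest-path length over reachable ordered pairs,
    using only edges whose endpoints both lie inside the cluster."""
    cluster = set(cluster)
    total = 0
    pairs = 0
    for source in cluster:
        # Breadth-first search restricted to the cluster.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in cluster and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy network: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cluster = {"a", "b", "c"}
print(conductance(adj, cluster))          # external structure of the cluster
print(average_path_length(adj, cluster))  # internal structure of the cluster
```

In this toy network the cluster {a, b, c} is a tightly knit triangle with a single outgoing edge, so both values are low. How the thesis combines the two quantities into one NPL-C score is not stated in this record.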