Collective interaction filtering with graph-based descriptors for crowd behaviour analysis

Bibliographic Details
Main Author: Wong, Pei Voon
Format: Thesis
Language: English
Published: 2018
Online Access: http://psasir.upm.edu.my/id/eprint/83244/1/FSKTM%202018%2085%20-%20ir.pdf
Description
Summary: Crowd behaviour analysis plays an important role in security for public areas such as railway stations, shopping centres, and airports, where large populations gather. A crowd behaviour analysis framework can be divided into low, mid and high levels. This research focuses on problems at the mid and high levels. Crowded scenes vary in density, structure and occlusion. At the mid-level, this makes it very challenging to divide detected feature points into clusters in order to build a dynamic group detector and to keep the grouping consistent between frames. At the high-level, it is also challenging to identify generic descriptors that describe the motion dynamics of pedestrians who walk in different directions with extremely diverse behaviours. Therefore, this research uses a crowd behaviour analysis framework with enhanced mid-level and high-level approaches to recognise the common properties across different crowded scenes. The recognised common properties are then used to identify group-level generic descriptors for crowd behaviour classification and crowd video retrieval. At the low-level, motion feature extraction is performed to extract trajectories from the video frames. The Kanade-Lucas-Tomasi feature point tracker is used to detect and track moving humans, and the resulting tracklets are grouped to form trajectories. At the mid-level, Collective Interaction Filtering is presented to identify groups by clustering trajectories. It is suitable for group detection in low-, medium- and high-density crowds. At the high-level, the result of Collective Interaction Filtering is used in group motion pattern mining to predict the generic descriptors of collectiveness, uniformity, stability, and conflict. The identified generic descriptors are represented by graph-based descriptors, which are applied to crowd behaviour analysis and crowd video retrieval. All experiments are carried out using the CUHK Crowd dataset.
The ground truth for group detection and crowd behaviour analysis was provided by related work. The group detection experiment is implemented using the clustering algorithm, and Normalized Mutual Information and Rand Index are used to measure the performance of Collective Interaction Filtering. The crowd behaviour analysis experiment is implemented using a non-linear Structural Support Vector Machine classifier with an RBF kernel, and leave-one-out evaluation is used to measure how well the proposed graph-based descriptors describe crowd behaviour. The crowd video retrieval experiment uses Euclidean distance and Chi-Square distance to measure the similarity between the generic descriptors of the query video and those of the videos in the retrieval set. Retrieval performance is measured by the average precision in the top k retrieved samples. Experimental results show that the crowd behaviour analysis framework achieves state-of-the-art performance on the CUHK Crowd dataset. Collective Interaction Filtering outperforms the related work, achieving 0.55 for Normalized Mutual Information and 0.83 for Rand Index. The proposed graph-based descriptors reach an average accuracy of 80% for crowd behaviour analysis, which compares favourably with previous works. The proposed crowd video retrieval approach based on graph-based descriptors obtained 49% average precision in the top 10 retrieved samples. The performance improvement shows the effectiveness of the graph-based descriptors for crowd video retrieval in different crowded scenes.
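
The evaluation measures named in the summary can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the toy descriptor vectors, and the particular Chi-Square form (half the sum of squared differences over sums, with a small `eps` guard) are all assumptions made for illustration, and the thesis may define these details differently.

```python
from itertools import combinations
import math


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two clusterings agree."""
    if len(labels_true) != len(labels_pred) or len(labels_true) < 2:
        raise ValueError("need two label lists of equal length >= 2")
    pairs = list(combinations(range(len(labels_true)), 2))
    # A pair agrees when both clusterings put it together, or both split it.
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chi_square(a, b, eps=1e-12):
    """Chi-Square distance (assumed form) for non-negative descriptors."""
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


def precision_at_k(query, query_label, gallery, k, dist):
    """Share of the k nearest gallery items that carry the query's label.

    gallery is a list of (descriptor, label) tuples (hypothetical layout).
    """
    ranked = sorted(gallery, key=lambda item: dist(query, item[0]))
    return sum(1 for _, label in ranked[:k] if label == query_label) / k


# Toy data: two behaviour classes "A" and "B" with 4-bin descriptors.
gallery = [
    ([0.4, 0.6, 0.0, 0.0], "A"),
    ([0.6, 0.4, 0.0, 0.0], "A"),
    ([0.0, 0.0, 0.5, 0.5], "B"),
    ([0.3, 0.3, 0.2, 0.2], "B"),
]
query = [0.5, 0.5, 0.0, 0.0]

p2_chi = precision_at_k(query, "A", gallery, 2, chi_square)
p3_euc = precision_at_k(query, "A", gallery, 3, euclidean)
ri_same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ri_diff = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Averaging `precision_at_k` over every query video would give an average top-k precision of the kind reported in the summary. The Rand Index depends only on which items are grouped together, so swapping the cluster labels, as in `ri_same`, does not change the score.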