Video annotation using convolutional neural network

Saved in:
Bibliographic Details
Main Author: Wan Abd. Kadir, Wan Zahiruddin
Format: Thesis
Language: English
Published: 2018
Subjects:
Online Access:http://eprints.utm.my/id/eprint/79083/1/WanZahiruddinWAbdKadirMFKE2018.pdf
Description
Summary: In this project, the problem addressed is human activity recognition (HAR) from video sequences. The focus is on annotating objects and actions in video using a Convolutional Neural Network (CNN) and mapping their temporal relationships with a fully connected layer and a softmax layer. The contribution is a deep learning fusion framework that effectively exploits spatial features from a CNN model (the Inception v3 model), combined with a fully connected layer and a softmax layer, to classify the actions in the dataset. The dataset used was UCF11, which contains 11 classes of human action. The project also extensively evaluates the strengths and weaknesses of this approach compared with previous work. Combining the Inception v3 features with a fully connected layer and a softmax layer classifies actions from the UCF11 dataset effectively, reaching up to 100% accuracy for certain human actions. The lowest accuracy is 27%, occurring for actions whose background and motion are similar to those of other actions. The evaluation results demonstrate that this method can be used to classify actions for video annotation.
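As a rough illustration of the classification head described in the summary, the sketch below applies a fully connected layer followed by a softmax to pooled CNN feature vectors. The 2048-dimensional feature size corresponds to Inception v3's global-average-pooled output, and the 11 output classes match UCF11; the random features and weights are placeholders, not the thesis's actual trained model.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify(features, W, b):
    # Fully connected layer (affine transform) followed by softmax,
    # mapping CNN features to per-class action probabilities.
    return softmax(features @ W + b)

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 2048))   # 4 frames' pooled Inception v3 features (placeholder)
W = rng.standard_normal((2048, 11)) * 0.01  # weights for the 11 UCF11 classes (untrained)
b = np.zeros(11)

probs = classify(features, W, b)
print(probs.shape)  # (4, 11): one probability distribution per frame
```

In a full pipeline, `features` would come from running video frames through Inception v3 with its classification top removed, and `W`/`b` would be learned by minimizing cross-entropy on the UCF11 training labels.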