Human Action Recognition With Temporal Dense Sampling Deep Neural Networks

Bibliographic Details
Main Author: Tan, Kok Seang
Format: Thesis
Published: 2019
Subjects:
id my-mmu-ep.7746
record_format uketd_dc
spelling my-mmu-ep.7746 2020-09-21T19:00:33Z Human Action Recognition With Temporal Dense Sampling Deep Neural Networks 2019-06 Tan, Kok Seang QA75.5-76.95 Electronic computers. Computer science In computer vision, Human Action Recognition (HAR) has long been an important area of study for human-computer interaction. With increasingly effective representation learning algorithms, particularly Convolutional Neural Network (ConvNet)-based architectures, HAR has advanced considerably over the past decades. However, HAR remains challenging due to the complex changes in visual appearance across a sequence of image frames, such as inconsistent gestures and human positions. To represent human action videos effectively, a sampling strategy, namely Temporal Dense Sampling (TDS), is introduced that incorporates temporal pooling into temporal segmentation in order to achieve dense sampling along the time axis. In this thesis, a deep ConvNet pretrained on a large-scale image recognition task, namely InceptionResNet-V2, is transferred to the proposed HAR framework. In this way, not only are the training resources reduced, but useful knowledge about the visual environment is also carried into the proposed frameworks. Subsequently, three representation learning models capable of modeling these spatio-temporal dynamics are combined with TDS to perform HAR: (1) the Long Short-Term Memory (LSTM) Network, (2) the Bidirectional Long Short-Term Memory (BiLSTM) Network, and (3) the 1-Dimensional (1D) ConvNet. In this thesis, these frameworks are named: (1) the Temporal Dense Sampling-LSTM Network (TDS-LSTMNet), (2) the Fine-Tuned Temporal Dense Sampling-BiLSTM Network (FTDS-BiLSTMNet), and (3) the Fine-Tuned Temporal Dense Sampling-1D ConvNet (FTDS-1DConvNet). 2019-06 Thesis http://shdl.mmu.edu.my/7746/ http://library.mmu.edu.my/library2/diglib/mmuetd/ masters Multimedia University Faculty of Information Science & Technology
institution Multimedia University
collection MMU Institutional Repository
topic QA75.5-76.95 Electronic computers
Computer science
spellingShingle QA75.5-76.95 Electronic computers
Computer science
Tan, Kok Seang
Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
description In computer vision, Human Action Recognition (HAR) has long been an important area of study for human-computer interaction. With increasingly effective representation learning algorithms, particularly Convolutional Neural Network (ConvNet)-based architectures, HAR has advanced considerably over the past decades. However, HAR remains challenging due to the complex changes in visual appearance across a sequence of image frames, such as inconsistent gestures and human positions. To represent human action videos effectively, a sampling strategy, namely Temporal Dense Sampling (TDS), is introduced that incorporates temporal pooling into temporal segmentation in order to achieve dense sampling along the time axis. In this thesis, a deep ConvNet pretrained on a large-scale image recognition task, namely InceptionResNet-V2, is transferred to the proposed HAR framework. In this way, not only are the training resources reduced, but useful knowledge about the visual environment is also carried into the proposed frameworks. Subsequently, three representation learning models capable of modeling these spatio-temporal dynamics are combined with TDS to perform HAR: (1) the Long Short-Term Memory (LSTM) Network, (2) the Bidirectional Long Short-Term Memory (BiLSTM) Network, and (3) the 1-Dimensional (1D) ConvNet. In this thesis, these frameworks are named: (1) the Temporal Dense Sampling-LSTM Network (TDS-LSTMNet), (2) the Fine-Tuned Temporal Dense Sampling-BiLSTM Network (FTDS-BiLSTMNet), and (3) the Fine-Tuned Temporal Dense Sampling-1D ConvNet (FTDS-1DConvNet).
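description_note As a rough, illustrative sketch of the Temporal Dense Sampling idea described above (not code taken from the thesis), the snippet below assumes frame-level features already extracted by a pretrained ConvNet such as InceptionResNet-V2 and uses max pooling as the temporal pooling operator; the function name, the segment count of 25, and the 1536-dimensional feature size are assumptions made for illustration.

import numpy as np

def temporal_dense_sampling(frame_features, num_segments=25):
    # Hypothetical sketch: split the frame-level feature sequence into equal
    # temporal segments and apply temporal (max) pooling within each segment,
    # producing a fixed-length sequence regardless of the video's frame count.
    num_frames = frame_features.shape[0]
    bounds = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    pooled = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        end = max(end, start + 1)  # guard against empty segments in short clips
        pooled.append(frame_features[start:end].max(axis=0))
    return np.stack(pooled)  # shape: (num_segments, feature_dim)

# Example: 300 frames, each assumed to be a 1536-D InceptionResNet-V2 feature.
features = np.random.rand(300, 1536).astype(np.float32)
clip = temporal_dense_sampling(features, num_segments=25)
print(clip.shape)  # (25, 1536)

The resulting fixed-length sequence of segment-level features would then be modeled by an LSTM, a BiLSTM, or a 1D ConvNet, corresponding to the TDS-LSTMNet, FTDS-BiLSTMNet, and FTDS-1DConvNet frameworks named above.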
format Thesis
qualification_level Master's degree
author Tan, Kok Seang
author_facet Tan, Kok Seang
author_sort Tan, Kok Seang
title Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
title_short Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
title_full Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
title_fullStr Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
title_full_unstemmed Human Action Recognition With Temporal Dense Sampling Deep Neural Networks
title_sort human action recognition with temporal dense sampling deep neural networks
granting_institution Multimedia University
granting_department Faculty of Information Science & Technology
publishDate 2019
_version_ 1747829672582840320