Human Action Recognition With Temporal Dense Sampling Deep Neural Networks

Bibliographic Details
Main Author:	Tan, Kok Seang
Format:	Thesis
Published:	2019
Subjects:	QA75.5-76.95 Electronic computers Computer science

Description
Summary:	In computer vision, Human Action Recognition (HAR) has long been an important area of study for human-computer interaction. With increasingly effective representation learning algorithms, specifically Convolutional Neural Network (ConvNet)-based architectures, HAR has advanced considerably over the past decades. However, HAR remains challenging because visual appearance changes in complicated ways over a sequence of image frames, for example through inconsistent gestures and human position. To represent human action video effectively, a sampling strategy called Temporal Dense Sampling (TDS) is introduced; it incorporates temporal pooling into temporal segmentation to achieve dense sampling on the time axis. In this thesis, a Deep ConvNet pretrained on a large-scale image recognition task, namely InceptionResNet-V2, is transferred to the proposed HAR framework. This not only reduces the training resources required but also feeds useful insight about the environment into the proposed frameworks. Subsequently, three representation learning models capable of modeling these spatio-temporal dynamics are proposed with TDS to perform HAR: (1) Long Short-Term Memory (LSTM) Network, (2) Bi-Directional Long Short-Term Memory (BiLSTM) Network, and (3) 1-Dimensional (1D) ConvNet. In this thesis, these frameworks are named: (1) Temporal Dense Sampling-LSTM Network (TDS-LSTMNet), (2) Fine-Tuned Temporal Dense Sampling-BiLSTM Network (FTDS-BiLSTMNet), and (3) Fine-Tuned Temporal Dense Sampling-1D ConvNet (FTDS-1DConvNet).
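
The summary describes TDS only as temporal pooling applied within temporal segments. As a rough illustration of that idea, and not the thesis's actual implementation, the sketch below splits a sequence of per-frame vectors into contiguous segments and average-pools each one. The function name `temporal_dense_sampling`, the `num_segments` parameter, the choice of average pooling, and the use of plain per-frame vectors are all assumptions; the record does not state them.

```python
from typing import List, Sequence


def temporal_dense_sampling(
    frames: Sequence[Sequence[float]], num_segments: int
) -> List[List[float]]:
    """Hypothetical sketch: segment the time axis, then pool inside each segment.

    `frames` holds one vector per video frame (pixels or features; an assumption).
    Average pooling is an assumed choice, the record does not name the operator.
    """
    n = len(frames)
    if num_segments <= 0 or n < num_segments:
        raise ValueError("need at least one frame per segment")

    pooled: List[List[float]] = []
    for s in range(num_segments):
        # Contiguous, near-equal segments that together cover every frame.
        start = s * n // num_segments
        end = (s + 1) * n // num_segments
        segment = frames[start:end]
        dim = len(segment[0])
        # Temporal average pooling: every frame in the segment contributes.
        pooled.append(
            [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]
        )
    return pooled


if __name__ == "__main__":
    video = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
    print(temporal_dense_sampling(video, num_segments=2))
```

Under these assumptions the output has a fixed length of `num_segments` regardless of video length, while every frame still influences the result. A pipeline along the lines of the summary would presumably pass such a sequence on to the pretrained ConvNet and a sequence model such as an LSTM, though the exact ordering is not given in this record.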
