Object classification using deep learning

Bibliographic Details
Main Author: Fong, Soon Fei
Format: Thesis
Language: English
Published: 2015
Online Access: http://eprints.utm.my/id/eprint/53826/1/FongSoonFeiMFKE2015.pdf
Description
Summary: Object recognition is the process of identifying a specific object in an image or video sequence, and it remains a challenge for computer vision systems. Many approaches have been proposed, ranging from traditional classifiers to deep neural networks. The objective of this thesis is to implement a deep convolutional neural network for object classification. Different architectures and parameter settings were tested in order to improve classification accuracy. The thesis proposes a very simple deep learning network for object classification that uses only basic data processing. The proposed deep convolutional neural network has a total of five hidden layers. Every convolution layer is followed by a subsampling layer that performs average pooling with a 2×2 kernel, which helps reduce the training time and computational complexity of the network. For comparison and better understanding, this work also shows how to fine-tune the hyper-parameters of the network to obtain higher classification accuracy. The network achieved an accuracy of 76.19% on the CIFAR-10 dataset. On more challenging image databases such as Pascal and ImageNet, this network might not be sufficient to handle the variability. However, it can serve as a valuable baseline for studying more advanced deep learning architectures for large-scale image classification tasks. The network could be further improved by adding validation data and dropout to prevent overfitting.
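
The summary describes a subsampling layer that applies 2×2 average pooling after each convolution. The following is a minimal sketch of that single operation, not the thesis's implementation. It assumes NumPy, a single-channel feature map with even height and width, and non-overlapping windows (stride 2); the function name `average_pool_2x2` is illustrative only.

```python
import numpy as np


def average_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """Downsample an (H, W) feature map with non-overlapping 2x2 average pooling.

    Assumption for this sketch: H and W are even and the stride equals the
    kernel size, so every input value belongs to exactly one window.
    """
    height, width = feature_map.shape
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    # View the map as a grid of 2x2 blocks: axes are
    # (block row, row within block, block column, column within block).
    blocks = feature_map.reshape(height // 2, 2, width // 2, 2)
    # Average over the two within-block axes to get one value per block.
    return blocks.mean(axis=(1, 3))


# Toy 4x4 feature map holding the values 0..15 in row-major order.
feature_map = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = average_pool_2x2(feature_map)
print(pooled.shape)  # (2, 2)
print(pooled)        # [[ 2.5  4.5] [10.5 12.5]]
```

Each pooling step halves both the height and the width, so the next layer processes a quarter as many values. This is consistent with the summary's point that subsampling reduces training time and computational cost.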