Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection

Object detection is challenging because of feature imbalance, limited contextual information, and class imbalance. Feature pyramids are used in modern detectors to learn multiscale representations. However, the current version of the feature pyramid failed to integrate useful semantic informat...

Bibliographic Details
Main Author: Aziz, Lubna
Format: Thesis
Language: English
Published: 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf
id my-utm-ep.101479
record_format uketd_dc
spelling my-utm-ep.101479 2023-06-21T10:10:15Z Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection 2022 Aziz, Lubna QA75 Electronic computers. Computer science 2022 Thesis http://eprints.utm.my/id/eprint/101479/ http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788 phd doctoral Universiti Teknologi Malaysia Faculty of Engineering - School of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Aziz, Lubna
Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
description Object detection is challenging because of feature imbalance, limited contextual information, and class imbalance. Feature pyramids are used in modern detectors to learn multiscale representations. However, the current version of the feature pyramid fails to integrate useful semantic information across different scales. In addition, many negative anchors are generated during training, resulting in extreme class imbalance. This study proposed a Multi-Level Refinement Enriched Feature Pyramid Network (MREFP-Net) to jointly handle feature-level scale imbalance and class imbalance in object detection. Instead of a complex design, a simple and effective multi-layered feature enrichment scheme was proposed that combines deep, intermediate, and shallow features to obtain the semantic and spatial information needed for small object detection. In addition, chained parallel pooling was proposed to capture rich background contextual information. A cascaded anchor refinement scheme was introduced to integrate useful multiscale contextual information into the prediction layers of the Single Shot MultiBox Detector and make multiscale detection more distinctive. The ultimate goal of the cascaded anchor refinement scheme was to counteract class imbalance by refining anchors and enriching contextual features to improve regression and classification. The performance of MREFP-Net was evaluated on two benchmark datasets, MS COCO and PASCAL VOC 07/12. For a 300 × 300 input on MS COCO test-dev, MREFP-Net-ResNet101 achieved a state-of-the-art detection accuracy (AP) of 36.6 with a single-scale inference strategy and ran in 39.2 ms on an RTX 2060 GPU. For a 512 × 512 input on MS COCO test-dev, MREFP-Net obtained an absolute gain of 2.5%. In particular, MREFP-Net-VGG was benchmarked with an 800 × 800 input on MS COCO test-dev and reached 49.2 AP with a multiscale inference strategy. On VOC07+12+COCO, MREFP-Net achieved 82.5% mAP for a 300 × 300 input and 84.6% mAP for a 512 × 512 input. Finally, feature visualization, object characteristic analysis, and false-positive error analysis were performed to highlight the effectiveness of enriched features for small object detection. This study demonstrated that the proposed MREFP-Net is capable of detecting small objects and learning sensitive features to deal with scale imbalance, class imbalance, and appearance complexity across object instances.
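The class imbalance that the description attributes to negative anchors can be illustrated with a minimal sketch. This is not the thesis's method: the anchor layout (one square anchor per grid cell), the 0.5 IoU threshold for a positive match, the single ground-truth box, and all function names are assumptions chosen only to show how few anchors typically overlap an object.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def grid_anchors(image_size, stride, box_size):
    """Square anchors of side box_size centred on a regular grid (assumed layout)."""
    half = box_size / 2.0
    anchors = []
    for cy in range(stride // 2, image_size, stride):
        for cx in range(stride // 2, image_size, stride):
            anchors.append((cx - half, cy - half, cx + half, cy + half))
    return anchors


def count_labels(anchors, gt_boxes, pos_iou=0.5):
    """Count anchors matched to any ground-truth box versus unmatched anchors."""
    positives = sum(
        1 for a in anchors if any(iou(a, g) >= pos_iou for g in gt_boxes)
    )
    return positives, len(anchors) - positives


# Hypothetical 300 x 300 image containing one small 32 x 32 object.
anchors = grid_anchors(image_size=300, stride=8, box_size=32)
gt_boxes = [(100.0, 100.0, 132.0, 132.0)]
positives, negatives = count_labels(anchors, gt_boxes)
print("positive anchors:", positives)
print("negative anchors:", negatives)
```

Even in this toy setting the negatives outnumber the positives by a wide margin, which is the kind of imbalance an anchor refinement stage would aim to reduce by filtering or adjusting anchors before final classification.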
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Aziz, Lubna
author_facet Aziz, Lubna
author_sort Aziz, Lubna
title Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_short Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_full Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_fullStr Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_full_unstemmed Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_sort multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
granting_institution Universiti Teknologi Malaysia
granting_department Faculty of Engineering - School of Computing
publishDate 2022
url http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf
_version_ 1776100707717349376