Offline arabic character recognition using genetic approach

Many optical character recognition (OCR) techniques and tools have been developed for a plurality of languages. A successful OCR system improves interactivity between humans and computers in many applications, such as digitising and recognising written content. With regard to Arabic OCR, the problem of...

Bibliographic Details
Main Author: Aljuaid, Hanan Abdulrahman
Format: Thesis
Language: English
Published: 2010
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/16567/7/HananAbdulrahmanAljuaidMFSKSM2010.pdf
id my-utm-ep.16567
record_format uketd_dc
spelling my-utm-ep.16567 2017-09-18T06:07:28Z Offline arabic character recognition using genetic approach 2010 Aljuaid, Hanan Abdulrahman QA75 Electronic computers. Computer science
2010 Thesis http://eprints.utm.my/id/eprint/16567/ http://eprints.utm.my/id/eprint/16567/7/HananAbdulrahmanAljuaidMFSKSM2010.pdf application/pdf en public masters Universiti Teknologi Malaysia, Faculty of Computer Science and Information System Faculty of Computer Science and Information System
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Aljuaid, Hanan Abdulrahman
Offline arabic character recognition using genetic approach
description Many optical character recognition (OCR) techniques and tools have been developed for a plurality of languages. A successful OCR system improves interactivity between humans and computers in many applications, such as digitising and recognising written content. With regard to Arabic OCR, handwriting recognition is challenging because Arabic letters are cursive and change shape depending on their position. OCR systems have reached nearly perfect recognition of printed Arabic text, but recognition of handwritten text is still in its infancy and needs to be greatly improved. Therefore, in this study, an approach to recognising Arabic characters based on genetic algorithms (GA) is proposed. The approach comprises two separate stages: feature extraction, and development of a GA for character recognition. In the feature extraction stage, six features are detected for each character and represented as a feature vector of six integers. The feature vectors are then used in the next stage, where three genetic operators, namely selection, crossover and mutation, are implemented to search for similar vectors with the best fitness value in order to recognise the character. The data used in this study were collected from different sources and stored in a database. They consist of 12,500 printed words in 50 paragraphs and 15,000 words written by 100 different writers, male and female, aged 5 to 60 years. Pre-processing operations include segmenting paragraphs into lines, lines into words and words into characters, detecting the skeleton, and determining the baseline and other horizontal zones. The experimental results show that the proposed method achieves a promising recognition accuracy of 90.46% for printed text and handwritten characters.
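The recognition stage summarised in the description above (a six-integer feature vector per character, matched with selection, crossover and mutation) can be illustrated with a minimal Python sketch. The abstract does not state which six features are used, how chromosomes are encoded, or how fitness is defined, so everything below is an assumption made for illustration only: the template values in TEMPLATES, the Manhattan-distance fitness, tournament selection, single-point crossover, and the function names are all hypothetical and are not taken from the thesis.

```python
import random

# Hypothetical template database: label -> six-integer feature vector.
# These values are invented for illustration; they are not from the thesis.
TEMPLATES = {
    "alif": (1, 0, 0, 0, 1, 2),
    "ba": (1, 1, 0, 1, 3, 1),
    "jim": (2, 1, 1, 1, 2, 3),
    "dal": (1, 0, 0, 2, 0, 0),
}


def distance(a, b):
    """Manhattan distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fitness(candidate, query):
    """Assumed fitness: higher is better, 0 means identical to the query."""
    return -distance(candidate, query)


def select(population, query, rng, k=3):
    """Tournament selection: the fittest of k randomly drawn individuals."""
    contenders = rng.sample(population, k)
    return max(contenders, key=lambda c: fitness(c, query))


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two six-gene chromosomes."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(candidate, rng, rate=0.1):
    """Replace each gene, with probability `rate`, by the same gene of a random template."""
    genes = list(candidate)
    templates = list(TEMPLATES.values())
    for i in range(len(genes)):
        if rng.random() < rate:
            genes[i] = rng.choice(templates)[i]
    return tuple(genes)


def recognise(query, generations=30, pop_size=12, seed=0):
    """Evolve candidates towards `query`, then return the label of the closest template."""
    rng = random.Random(seed)
    templates = list(TEMPLATES.values())
    # Seed the population with the stored templates (repeated to fill pop_size).
    population = [templates[i % len(templates)] for i in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        best = max(population, key=lambda c: fitness(c, query))
        children = [
            mutate(crossover(select(population, query, rng),
                             select(population, query, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [best] + children
    best = max(population, key=lambda c: fitness(c, query))
    return min(TEMPLATES, key=lambda label: distance(TEMPLATES[label], best))


if __name__ == "__main__":
    # A query that differs from the "ba" template in one feature.
    print(recognise((1, 1, 0, 1, 3, 2)))  # prints "ba"
```

In this sketch the population is seeded from the stored templates and the best individual is kept in every generation, so the final candidate is never further from the query than the closest template. A real system would derive the six features from the segmented, skeletonised character image rather than from hand-written tuples.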
format Thesis
qualification_level Master's degree
author Aljuaid, Hanan Abdulrahman
author_facet Aljuaid, Hanan Abdulrahman
author_sort Aljuaid, Hanan Abdulrahman
title Offline arabic character recognition using genetic approach
title_short Offline arabic character recognition using genetic approach
title_full Offline arabic character recognition using genetic approach
title_fullStr Offline arabic character recognition using genetic approach
title_full_unstemmed Offline arabic character recognition using genetic approach
title_sort offline arabic character recognition using genetic approach
granting_institution Universiti Teknologi Malaysia, Faculty of Computer Science and Information System
granting_department Faculty of Computer Science and Information System
publishDate 2010
url http://eprints.utm.my/id/eprint/16567/7/HananAbdulrahmanAljuaidMFSKSM2010.pdf
_version_ 1747815074409480192