Illumination removal and text segmentation for Al-Quran using binary representation

The segmentation of Al-Quran pages needs to be studied carefully, because Al-Quran is the book of Allah SWT and any incorrect segmentation will affect its holiness. A major difficulty is the presence of illumination around the text areas, as well as noisy black stripes. In...

Main Author: Nazeeh Jamil Bany Melhem
Format: Thesis
Language: English
Published: 2015
Online Access: http://eprints.utem.edu.my/id/eprint/15884/1/ILLUMINATION%20REMOVAL%20AND%20TEXT%20SEGMENTATION%20FOR%20AL-QURAN%20USING%20BINARY%20REPRESENTATION%20%2824%20pgs%29.pdf
http://eprints.utem.edu.my/id/eprint/15884/2/Illumination%20removal%20and%20text%20segmnetation%20for%20Al-Quran%20using%20binary%20representation.pdf
id my-utem-ep.15884
record_format uketd_dc
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
advisor Azmi, Mohd Sanusi
topic TA Engineering (General). Civil engineering (General)
description The segmentation of Al-Quran pages needs to be studied carefully, because Al-Quran is the book of Allah SWT and any incorrect segmentation will affect its holiness. A major difficulty is the presence of illumination around the text areas, as well as noisy black stripes. In this study, we propose a novel algorithm for detecting the illumination on an Al-Quran page. Our aim is to segment Al-Quran pages into pages without illumination, and then into text-line images, without any change to the content. First, we apply pre-processing, which includes binarization. Then, we detect the illumination on the Al-Quran pages; in this stage, we introduce the vertical and horizontal white percentages, which proved efficient for detecting the illumination. Finally, the new images are segmented into text lines. Experimental results on several Al-Quran pages in different Al-Quran styles demonstrate the effectiveness of the proposed technique.
format Thesis
qualification_name Master of Philosophy (M.Phil.)
qualification_level Master's degree
author Nazeeh Jamil Bany Melhem
author_facet Nazeeh Jamil Bany Melhem
author_sort Nazeeh Jamil Bany Melhem
title Illumination removal and text segmentation for Al-Quran using binary representation
title_short Illumination removal and text segmentation for Al-Quran using binary representation
title_full Illumination removal and text segmentation for Al-Quran using binary representation
title_fullStr Illumination removal and text segmentation for Al-Quran using binary representation
title_full_unstemmed Illumination removal and text segmentation for Al-Quran using binary representation
title_sort illumination removal and text segmentation for al-quran using binary representation
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty of Information and Communication Technology
publishDate 2015
url http://eprints.utem.edu.my/id/eprint/15884/1/ILLUMINATION%20REMOVAL%20AND%20TEXT%20SEGMENTATION%20FOR%20AL-QURAN%20USING%20BINARY%20REPRESENTATION%20%2824%20pgs%29.pdf
http://eprints.utem.edu.my/id/eprint/15884/2/Illumination%20removal%20and%20text%20segmnetation%20for%20Al-Quran%20using%20binary%20representation.pdf
_version_ 1747833880545591296
spelling my-utem-ep.15884 2022-04-19T10:19:29Z http://eprints.utem.edu.my/id/eprint/15884/ https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=96209
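
The description field of this record outlines a three-stage pipeline: binarization, illumination detection using vertical and horizontal white percentages, and text-line segmentation. The record gives no formulas or thresholds, so the following is only a minimal sketch of how such a pipeline might look, not the thesis's actual algorithm. The function names, the fixed grey-level threshold of 128, the 50% cut-off for a "dense" row or column, and the assumption that the illumination is the first dense band met when scanning inward from each page edge are all illustrative assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = white (background), 0 = black (ink)


def binarize(gray: List[List[int]], threshold: int = 128) -> Image:
    """Global thresholding (assumed; the thesis may use another method)."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def white_percentages(binary: Image) -> Tuple[List[float], List[float]]:
    """Percentage of white pixels in every row and in every column."""
    height, width = len(binary), len(binary[0])
    rows = [100.0 * sum(row) / width for row in binary]
    cols = [100.0 * sum(binary[y][x] for y in range(height)) / height
            for x in range(width)]
    return rows, cols


def _inner_edge(profile: List[float], frame_max_white: float) -> int:
    """Index just past the first dense band found in the first half of profile."""
    half = len(profile) // 2
    i = 0
    while i < half and profile[i] >= frame_max_white:
        i += 1  # skip any blank outer margin
    if i == half:
        return 0  # no dense band on this side: keep everything
    while i < half and profile[i] < frame_max_white:
        i += 1  # skip the dense band itself (assumed to be the illumination)
    return i


def remove_frame(binary: Image, frame_max_white: float = 50.0) -> Image:
    """Crop away the dense band nearest to each of the four page edges."""
    rows, cols = white_percentages(binary)
    top = _inner_edge(rows, frame_max_white)
    bottom = len(rows) - _inner_edge(rows[::-1], frame_max_white)
    left = _inner_edge(cols, frame_max_white)
    right = len(cols) - _inner_edge(cols[::-1], frame_max_white)
    return [row[left:right] for row in binary[top:bottom]]


def segment_lines(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges of consecutive rows that contain ink."""
    lines: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(px == 0 for px in row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(binary)))
    return lines
```

A caller would chain the steps as `segment_lines(remove_frame(binarize(gray)))`. The sketch only crops pixels and never alters those it keeps, which matches the stated aim of leaving the content unchanged. It would mis-crop a page whose first text line is itself denser than the cut-off, so the thresholds would need tuning on real pages.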