Principal component analysis dimensionality reduction for writer verification

Writer verification (WV) is the process of determining whether two handwritten document samples were written by the same writer. WV is also known as a one-to-one comparison process, because it specifically compares one writer against another. The process therefore needs a unique charact...

Bibliographic Details
Main Author: Ramlee, Rimashadira
Format: Thesis
Language: English
Published: 2015
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/16858/1/Principal%20Component%20Analysis%20Dimensionality%20Reduction%20For%20Writer%20Verification.pdf
http://eprints.utem.edu.my/id/eprint/16858/2/Principal%20component%20analysis%20dimensionality%20reduction%20for%20writer%20verification.pdf
id my-utem-ep.16858
record_format uketd_dc
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
advisor Muda, Azah Kamilah

topic H Social Sciences (General)
H Social Sciences (General)
spellingShingle H Social Sciences (General)
H Social Sciences (General)
Ramlee, Rimashadira
Principal component analysis dimensionality reduction for writer verification
description Writer verification (WV) is the process of determining whether two handwritten document samples were written by the same writer. WV is also known as a one-to-one comparison process, because it specifically compares one writer against another. The process therefore needs a characteristic unique to the writer in order to establish who wrote a handwritten document. Different people have different handwriting styles, and a person's style is usually unique. Most previous research in handwriting analysis has used such unique characteristics to represent the individuality of handwriting. Individuality of handwriting is accordingly the main issue of this study, since the WV process requires it. The previous WV framework consists of preprocessing, feature extraction and classification tasks, and it acquires the individuality of handwriting through the feature extraction process. In this study, the previous framework alone is not sufficient to produce the best verification results, because the individuality features it acquires are not effective enough at representing the uniqueness of the writer. This study therefore proposes a dimension reduction technique for acquiring the individual features of the handwritten data, improving the previous verification framework in order to enhance verification accuracy. The sample data were taken from the IAM online database, a benchmark for handwriting analysis research. Five writers with 3619 image instances were chosen for the experiment; 9 documents of handwriting samples were taken from each writer, and more than 50 words were randomly divided into training and testing datasets. Both datasets were processed with Principal Component Analysis (PCA), one of the dimension reduction techniques. PCA was applied after the feature extraction process, and the reduction produced a new, lower-dimensional subspace of the data. The data produced by PCA were then classified with a random forest in order to verify the writer of the handwritten document. Individuality is represented by transforming the various individual features into a smaller set of more important features, selected by the proposed technique, which are then used to verify the writer. Experimental results show that the proposed method improved the verification rate to 90.00 % and above overall, and that the reduction was successful on each data set. However, the improved framework still cannot verify the writer of the handwritten data with 100 % accuracy.
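The dimension reduction step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the record does not state the extracted features, the number of components kept, or the software used. The function names `fit_pca` and `project`, the synthetic four-feature data, and the choice of two components are all hypothetical. Power iteration with deflation stands in for the eigen-decomposition a numerical library would normally provide, and fitting on the training set and reusing the same projection for the test set is an assumption. The random forest classification step is not shown.

```python
import math
import random


def fit_pca(rows, k, iters=500, seed=0):
    """Estimate the top-k principal components of `rows` (list of feature lists).

    Returns (means, components, variances, total_variance).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered features.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_variance = sum(cov[i][i] for i in range(d))

    rng = random.Random(seed)
    components, variances = [], []
    for _ in range(k):
        # Random unit-length starting vector.
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        # Power iteration: repeated multiplication converges to the
        # eigenvector with the largest remaining eigenvalue.
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the variance along this component.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                  for i in range(d))
        components.append(v)
        variances.append(lam)
        # Deflation: remove the found component from the covariance matrix.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
    return means, components, variances, total_variance


def project(rows, means, components):
    """Map rows into the subspace spanned by `components`."""
    d = len(means)
    return [[sum((r[j] - means[j]) * c[j] for j in range(d))
             for c in components] for r in rows]


# Synthetic stand-in for extracted handwriting features (hypothetical data):
# four correlated features driven by two hidden factors plus small noise.
data_rng = random.Random(42)


def make_sample():
    a, b = data_rng.gauss(0.0, 3.0), data_rng.gauss(0.0, 1.0)
    return [a + data_rng.gauss(0.0, 0.1),
            2.0 * a + data_rng.gauss(0.0, 0.1),
            b + data_rng.gauss(0.0, 0.1),
            a - b + data_rng.gauss(0.0, 0.1)]


train = [make_sample() for _ in range(200)]
test = [make_sample() for _ in range(50)]

# Fit on the training set only, then apply the same projection to both sets.
means, components, variances, total_variance = fit_pca(train, k=2)
train_low = project(train, means, components)
test_low = project(test, means, components)
explained = sum(variances) / total_variance  # share of variance retained
```

In this sketch `train_low` and `test_low` hold two values per sample instead of four, and `explained` reports how much of the training variance the retained components cover. A classifier such as a random forest would then be trained on `train_low` and evaluated on `test_low`.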
format Thesis
qualification_name Master of Philosophy (M.Phil.)
qualification_level Master's degree
author Ramlee, Rimashadira
author_facet Ramlee, Rimashadira
author_sort Ramlee, Rimashadira
title Principal component analysis dimensionality reduction for writer verification
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty Of Information And Communication Technology
publishDate 2015
url http://eprints.utem.edu.my/id/eprint/16858/1/Principal%20Component%20Analysis%20Dimensionality%20Reduction%20For%20Writer%20Verification.pdf
http://eprints.utem.edu.my/id/eprint/16858/2/Principal%20component%20analysis%20dimensionality%20reduction%20for%20writer%20verification.pdf
_version_ 1747833901738360832
spelling my-utem-ep.16858 2022-05-17T15:42:22Z Principal component analysis dimensionality reduction for writer verification 2015 Ramlee, Rimashadira H Social Sciences (General) HV Social pathology. Social and public welfare 2015 Thesis http://eprints.utem.edu.my/id/eprint/16858/ http://eprints.utem.edu.my/id/eprint/16858/1/Principal%20Component%20Analysis%20Dimensionality%20Reduction%20For%20Writer%20Verification.pdf text en public http://eprints.utem.edu.my/id/eprint/16858/2/Principal%20component%20analysis%20dimensionality%20reduction%20for%20writer%20verification.pdf text en validuser https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=96165 mphil masters Universiti Teknikal Malaysia Melaka Faculty Of Information And Communication Technology Muda, Azah Kamilah