A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features

An Automated Essay Scoring (AES) system uses a trained computational model to assign an essay a grade as close as possible to the one a human rater would give. The purpose of this study is to examine the performance of different machine learning methods in predicting Malaysian University English Test (MUET) essay grades based on syntactic and semantic features, and to generalize frameworks accordingly.


Bibliographic Details
Main Author: Chun Then, Lim
Format: Thesis
Language:English
Published: 2023
Subjects:
Online Access:http://ir.unimas.my/id/eprint/42024/5/Final%20Submission%20of%20Thesis%20Form%20%28Lim%20Chun%20Then%29.pdf
http://ir.unimas.my/id/eprint/42024/6/LIM%20Chun%20Then_Master_24pages.pdf
http://ir.unimas.my/id/eprint/42024/9/Lim%20CT.pdf
id my-unimas-ir.42024
record_format uketd_dc
institution Universiti Malaysia Sarawak
collection UNIMAS Institutional Repository
language English
topic QA76 Computer software
spellingShingle QA76 Computer software
Chun Then, Lim
A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
description An Automated Essay Scoring (AES) system uses a trained computational model to assign an essay a grade as close as possible to the one a human rater would give. The purpose of this study is to examine the performance of different machine learning methods in predicting Malaysian University English Test (MUET) essay grades based on syntactic and semantic features, and to generalize frameworks accordingly. Based on the results, we found that the syntactic features of an essay have a stronger effect on essay grades than its semantic features. We also found that the differences between machine learning and deep learning algorithms were not obvious, and that neither class of algorithm performed excellently, as the quadratic weighted Kappa (QWK) scores were below 0.75. Instead of using any publicly available essay dataset, five MUET essay datasets were collected locally for this study, and all of them were found to suffer from imbalanced grade distributions. The QWK score is therefore preferred over accuracy as the standard evaluation metric for AES, because it provides more valuable information when dealing with imbalanced datasets. To address the imbalanced grade distribution, a resampling method called the Synthetic Minority Oversampling Technique (SMOTE) was applied to the datasets to study its impact on the performance of the AES framework; however, SMOTE was found to degrade both predictive accuracy and QWK scores. In addition, this study developed an e-learning platform called UNIMAS DBRater, which is currently used by UNIMAS pre-university English classes, and other local educational institutions have expressed interest in joining the platform.
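The abstract's preference for quadratic weighted Kappa (QWK) over plain accuracy on imbalanced grade distributions can be illustrated with a minimal, self-contained sketch. The integer grade scale and toy labels below are hypothetical, not taken from the thesis:

```python
def quadratic_weighted_kappa(y_true, y_pred, min_grade, max_grade):
    """Quadratic weighted Cohen's kappa for integer grades in
    [min_grade, max_grade]; 1.0 = perfect agreement, 0.0 = chance level."""
    n = max_grade - min_grade + 1
    # Observed confusion matrix: rows = human grade, columns = model grade
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_grade][p - min_grade] += 1
    total = len(y_true)
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = (i - j) ** 2 / (n - 1) ** 2                 # quadratic disagreement weight
            expected = hist_true[i] * hist_pred[j] / total  # chance-level count
            num += w * observed[i][j]
            den += w * expected
    return 1.0 - num / den

# A degenerate model that always predicts the majority grade scores
# 80% accuracy on this imbalanced toy set, yet a QWK of exactly 0.0:
y_true = [3, 3, 3, 3, 3, 3, 3, 3, 1, 5]
y_pred = [3] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.8
qwk = quadratic_weighted_kappa(y_true, y_pred, min_grade=1, max_grade=5)  # 0.0
```

This is why QWK "provides more valuable information" on imbalanced data: a majority-grade guesser looks strong under accuracy but earns no credit beyond chance under QWK.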
format Thesis
qualification_level Master's degree
author Chun Then, Lim
author_facet Chun Then, Lim
author_sort Chun Then, Lim
title A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
title_short A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
title_full A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
title_fullStr A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
title_full_unstemmed A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features
title_sort study of automated essay scoring frameworks on evaluating malaysian university english test essays based on syntactic and semantic features
granting_institution Universiti Malaysia Sarawak
granting_department Faculty of Computer Science & Information Technology
publishDate 2023
url http://ir.unimas.my/id/eprint/42024/5/Final%20Submission%20of%20Thesis%20Form%20%28Lim%20Chun%20Then%29.pdf
http://ir.unimas.my/id/eprint/42024/6/LIM%20Chun%20Then_Master_24pages.pdf
http://ir.unimas.my/id/eprint/42024/9/Lim%20CT.pdf
_version_ 1783728533974024192
spelling my-unimas-ir.42024 2023-09-08T02:16:50Z A Study of Automated Essay Scoring Frameworks on Evaluating Malaysian University English Test Essays Based on Syntactic and Semantic Features 2023-06-21 Chun Then, Lim QA76 Computer software An Automated Essay Scoring (AES) system uses a trained computational model to assign an essay a grade as close as possible to the one a human rater would give. The purpose of this study is to examine the performance of different machine learning methods in predicting Malaysian University English Test (MUET) essay grades based on syntactic and semantic features, and to generalize frameworks accordingly. Based on the results, we found that the syntactic features of an essay have a stronger effect on essay grades than its semantic features. We also found that the differences between machine learning and deep learning algorithms were not obvious, and that neither class of algorithm performed excellently, as the quadratic weighted Kappa (QWK) scores were below 0.75. Instead of using any publicly available essay dataset, five MUET essay datasets were collected locally for this study, and all of them were found to suffer from imbalanced grade distributions. The QWK score is therefore preferred over accuracy as the standard evaluation metric for AES, because it provides more valuable information when dealing with imbalanced datasets. To address the imbalanced grade distribution, a resampling method called the Synthetic Minority Oversampling Technique (SMOTE) was applied to the datasets to study its impact on the performance of the AES framework; however, SMOTE was found to degrade both predictive accuracy and QWK scores.
In addition, this study developed an e-learning platform called UNIMAS DBRater, which is currently used by UNIMAS pre-university English classes, and other local educational institutions have expressed interest in joining the platform. Universiti Malaysia Sarawak 2023-06 Thesis http://ir.unimas.my/id/eprint/42024/ http://ir.unimas.my/id/eprint/42024/5/Final%20Submission%20of%20Thesis%20Form%20%28Lim%20Chun%20Then%29.pdf text en staffonly http://ir.unimas.my/id/eprint/42024/6/LIM%20Chun%20Then_Master_24pages.pdf text en public http://ir.unimas.my/id/eprint/42024/9/Lim%20CT.pdf text en validuser masters Universiti Malaysia Sarawak Faculty of Computer Science & Information Technology Universiti Malaysia Sarawak (UNIMAS), Prototype Research Grant Scheme [#F04/PRGS/1801/2019, 2019]
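The SMOTE resampling mentioned in the abstract generates synthetic minority-class examples by interpolating between a minority point and one of its k nearest minority neighbours (Chawla et al., 2002). A minimal sketch with toy feature vectors, not the thesis's actual essay features:

```python
import math
import random

def smote(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points: pick a random minority point, pick one of
    its k nearest minority neighbours, and interpolate a random fraction
    of the way between them (Chawla et al., 2002)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x within the minority class, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Oversample a toy 2-D minority class of three points with four synthetic points
new_points = smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], k=2, n_new=4, seed=0)
```

Because each synthetic point is a convex combination of two real minority points, SMOTE can crowd the feature space with near-duplicates, which is one plausible contributor to the degraded accuracy and QWK the study reports; SMOTE is known to behave poorly on high-dimensional data such as text features (Blagus & Lusa, 2013).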