Prediction of students' performance in e-learning environment using random forest

Bibliographic Details
Main Author: Yusuf, Abubakar
Format: Thesis
Language: English
Published: 2017
Online Access: http://eprints.utm.my/id/eprint/92210/1/AbubakarYusufMSC2017.pdf
Description
Summary: The drive to advance e-learning technology has caused educational data to become very large and to grow rapidly. The data are generated daily as students interact with e-learning environments, especially learning management systems. They contain hidden information about students' participation in various e-learning activities which, once revealed, can be associated with the students' performance. Predicting students' performance from their use of e-learning systems is a major concern in educational institutions and has become very important for education management in understanding why so many students perform poorly or even fail in their studies. However, prediction is difficult because of the diverse factors or characteristics that influence performance. This dissertation aims to predict students' performance using the students' interaction in the e-learning environment, their assessment marks and their prerequisite knowledge as prediction features. The Random Forest algorithm, an ensemble of decision trees, was used for prediction, and the comparative analysis shows that it outperforms the popular decision tree and K-Nearest Neighbor algorithms. However, Naive Bayes outperformed Random Forest. In addition to performance prediction, Random Forest was used to identify the significant attributes that influence students' performance, and these were validated by a statistical test using Pearson correlation. The research therefore revealed that lab tasks, assignments, midterm and prerequisite knowledge are significant indicators of students' performance.
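
The summary mentions validating the attributes flagged by Random Forest with a Pearson correlation test. The sketch below only illustrates that kind of check and is not the thesis's own code: the function names `pearson_r` and `rank_features`, the feature names, and every number in the toy data are assumptions made up for the example. It uses only the Python standard library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort features by the absolute value of their correlation with target."""
    scored = [(name, pearson_r(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data (NOT taken from the thesis): six students.
final_score = [55, 62, 70, 48, 81, 90]
features = {
    "midterm": [50, 60, 72, 45, 80, 88],
    "forum_posts": [3, 9, 2, 7, 4, 5],
}

ranking = rank_features(features, final_score)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A feature whose correlation with the final score is strong in absolute value would be consistent with a high Random Forest importance, while a weak one would not. Pearson correlation captures only linear association, so it complements a tree-based importance ranking and does not replace it.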