Machine Learning Approach for Handling Imbalanced Students’ Performance Data
Journal:
GRENZE International Journal of Engineering and Technology
Authors:
E. Sujatha, S. Divya, R. G. Sakthivelan, Gopirajan PV
Volume:
10
Issue:
2
Grenze ID:
01.GIJET.10.2.163
Pages:
3805-3811
Abstract
Class imbalance in student performance data from educational institutions is a critical
issue for any machine learning prediction model. It reduces the effectiveness of classifiers
and challenges sampling methods, particularly when the data has a large number of features.
The proposed model was designed to predict student performance, and its results were compared
with those of popular comparable models. A Kaggle dataset with 386 rows and 33 columns of
student data was used in this study. In addition, the Portugal dataset, containing 394
students and 19 attributes, was considered for experimental analysis. The experiments show
that the Random Forest classifier yields the highest accuracy, 84.15%, in handling imbalanced
datasets. Among the sampling methods, SVM-SMOTE achieves the highest accuracy, 94.76%, in
predicting student performance with various features.
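
The oversampling idea behind the SMOTE family can be illustrated with a minimal sketch. It is not the authors' pipeline: it implements plain SMOTE-style interpolation in standard-library Python, whereas SVM-SMOTE additionally uses an SVM to choose which minority samples to interpolate from. The function name `smote_oversample`, the toy "pass"/"fail" data, and the parameter values are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def smote_oversample(X, y, minority_label, k=3, seed=0):
    """Plain SMOTE-style sketch: top up the minority class to the majority count.

    Each synthetic point is placed at a random position on the line segment
    between a minority sample and one of its k nearest minority neighbours.
    (SVM-SMOTE would restrict the base samples using an SVM; that step is
    deliberately omitted here.)
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    counts = Counter(y)
    n_new = max(counts.values()) - counts[minority_label]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])

    return list(X) + synthetic, list(y) + [minority_label] * n_new


# Hypothetical toy data: six "pass" students, two "fail" students, two features each.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2],
     [2.0, 2.0], [2.2, 2.1]]
y = ["pass"] * 6 + ["fail"] * 2

X_bal, y_bal = smote_oversample(X, y, minority_label="fail")
print(Counter(y_bal))
```

In practice the resampling should be applied only to the training split, so that synthetic samples never leak into the data used to measure accuracy.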