MONITORING OF THE EDUCATION QUALITY AND IMPLEMENTING OF INDIVIDUAL LEARNING: DEMONSTRATION OF APPROACHES AND EDUCATIONAL DATA MINING ALGORITHMS

  • Yass Khudheir Salal South Ural State University
  • S. M. Abdullaev South Ural State University
Keywords: Individual and collective student performance forecasts, educational data mining, classification and quantification, imbalanced datasets, heterogeneous ensembles, deterministic and probabilistic forecast

Abstract

A quality monitoring system for traditional and distance education requires machine learning
classification and quantification techniques that predict individual and collective student
performance. This article shows, theoretically and experimentally, that the most promising
approach to solving both forecasting tasks simultaneously is to build heterogeneous ensembles
consisting of an odd number of different base classifiers, such as decision trees, simple neural
networks, naive Bayes classifiers, and others. By training and testing 11 different binary
classifiers on six different samples of educational data, we show that the individual deterministic
forecast of such ensembles is more accurate than the forecasts of both individual base classifiers
and homogeneous ensembles created by bagging and boosting. The advantage of heterogeneous
ensembles is decisive when the sample is imbalanced, which is characteristic of educational data.
In these cases, only forecasts whose accuracy exceeds the relative frequency of the dominant
class in the sample can be considered useful. The main advantage of the heterogeneous ensemble
is the ability to transform the deterministic forecast into a probabilistic one: instead of
assigning an object to a particular class, the ensemble gives the probability that the object
belongs to each class. On this basis, we propose a new method of binary quantification in which
the individual probabilities of belonging to each class are summed separately, and the resulting
total probabilities are interpreted as the relative frequencies of the classes in the sample.
Experiments show that this ensemble binary quantification is significantly superior to the
traditional "classify and count" method.
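
The ideas summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each base classifier casts a hard 0/1 vote, that the ensemble probability of class 1 is the share of classifiers voting for it, and that the summed probabilities are divided by the number of objects to obtain a relative frequency. The function names and the toy votes are hypothetical and hand-made.

```python
from typing import List, Sequence


def vote_probabilities(votes: Sequence[Sequence[int]]) -> List[float]:
    """Per object: share of base classifiers voting for class 1 (assumed rule)."""
    return [sum(row) / len(row) for row in votes]


def majority_labels(probs: Sequence[float]) -> List[int]:
    """Deterministic forecast: class 1 if more than half of the votes say so."""
    return [1 if p > 0.5 else 0 for p in probs]


def classify_and_count(labels: Sequence[int]) -> float:
    """Traditional quantification: share of objects labelled as class 1."""
    return sum(labels) / len(labels)


def probabilistic_quantification(probs: Sequence[float]) -> float:
    """Sum the class-1 probabilities and read the total as a relative frequency."""
    return sum(probs) / len(probs)


def accuracy(labels: Sequence[int], y_true: Sequence[int]) -> float:
    """Share of objects whose forecast label matches the true label."""
    return sum(int(a == b) for a, b in zip(labels, y_true)) / len(y_true)


def is_useful(acc: float, y_true: Sequence[int]) -> bool:
    """A forecast is useful only if it beats the dominant-class frequency."""
    share = sum(y_true) / len(y_true)
    return acc > max(share, 1.0 - share)


# Hypothetical votes: 6 students (rows), 5 base classifiers (columns).
# An odd number of voters rules out ties in the majority vote.
votes = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
y_true = [1, 1, 1, 0, 1, 0]  # hypothetical true outcomes (1 = pass)

probs = vote_probabilities(votes)
labels = majority_labels(probs)
cc = classify_and_count(labels)
pq = probabilistic_quantification(probs)
useful = is_useful(accuracy(labels, y_true), y_true)

print("classify and count:", cc)
print("probabilistic quantification:", round(pq, 3))
print("true class-1 frequency:", round(sum(y_true) / len(y_true), 3))
print("forecast beats the dominant-class baseline:", useful)
```

With these made-up votes, the summed-probability estimate happens to lie closer to the true class frequency than the "classify and count" estimate. A six-object toy sample only shows the mechanics and is not evidence for the experimental claim.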

Published
2020-10-11
Section
SECTION II. INFORMATION PROCESSING ALGORITHMS