Academic Paper

Student Retention Using Educational Data Mining and Predictive Analytics: A Systematic Literature Review
Document Type
Periodical
Source
IEEE Access. 10:72480-72503, 2022
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Education
Data mining
Systematics
Predictive models
Prediction algorithms
Bibliographies
Soft sensors
Educational data mining
learning analytics
machine learning
predictive models
student retention
Language
ISSN
2169-3536
Abstract
Student retention is an essential metric in education, indicated by retention rates, which are accumulated as students re-enroll from one academic year to the next. Institutions can achieve high retention rates by providing appropriate support and teaching methods, among other practices, to prevent students from deferring their studies. To address this pressing challenge faced by educational institutions, the underlying factors and the methodological aspects of building robust predictive models are reviewed and scrutinized. Educational Data Mining (EDM) and Learning Analytics (LA) have been widely adopted for discovering knowledge from educational data sources, improving teaching practice, and identifying at-risk students. Various predictive techniques are applied in LA, such as Machine Learning (ML), Statistical Analysis, and Deep Learning (DL). To examine these techniques in depth, academic publications were reviewed to highlight their potential for resolving student retention issues in education. Additionally, the paper presents a taxonomy of ML approaches and a comprehensive review of the success factors and the features that are not indicative of student performance in three different learning environments: Traditional Learning, Blended Learning, and Online Learning. The survey reveals that supervised ML and DL techniques are broadly applied to student retention. However, the application of ensemble techniques and unsupervised clustering techniques that support heterogeneous and homogeneous groups of students is generally lacking. Moreover, static and traditional features are commonly used in student performance prediction, ignoring vital factors such as educator-related, cognitive, and personal data. Furthermore, the paper highlights open challenges for future research directions.