TY - JOUR
T1 - Predicting students’ academic performance by using educational big data and learning analytics
T2 - evaluation of classification methods and learning logs
AU - Huang, Anna Y. Q.
AU - Lu, Owen H. T.
AU - Huang, Jeff C. H.
AU - Yin, C. J.
AU - Yang, Stephen J. H.
N1 - Publisher Copyright: © 2019 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2020/2/17
Y1 - 2020/2/17
N2 - To enhance the learning experience, many educators have applied learning analytics in the classroom. The major principle of learning analytics is to identify at-risk students and provide timely intervention based on the results of student behavior analysis. However, when researchers have applied machine learning to train risk identification models, the factors affecting model performance have been overlooked. This study collected seven datasets from three universities located in Taiwan and Japan and reports the performance metrics of risk identification models obtained after feeding the data into eight classification methods. U1, U2, and U3 denote the three universities, which have three, two, and two dataset cases (learning logs), respectively. According to the results of this study, the factors influencing the predictive performance of classification methods are the number of significant features, the number of categories of significant features, and the Spearman correlation coefficient values. In U1 dataset case 1.3 and U2 dataset case 2.2, the numbers of significant features, the numbers of categories of significant features, and the Spearman correlation coefficient values for significant features were all relatively high, which is the main reason why classification on these datasets achieved high predictive ability.
AB - To enhance the learning experience, many educators have applied learning analytics in the classroom. The major principle of learning analytics is to identify at-risk students and provide timely intervention based on the results of student behavior analysis. However, when researchers have applied machine learning to train risk identification models, the factors affecting model performance have been overlooked. This study collected seven datasets from three universities located in Taiwan and Japan and reports the performance metrics of risk identification models obtained after feeding the data into eight classification methods. U1, U2, and U3 denote the three universities, which have three, two, and two dataset cases (learning logs), respectively. According to the results of this study, the factors influencing the predictive performance of classification methods are the number of significant features, the number of categories of significant features, and the Spearman correlation coefficient values. In U1 dataset case 1.3 and U2 dataset case 2.2, the numbers of significant features, the numbers of categories of significant features, and the Spearman correlation coefficient values for significant features were all relatively high, which is the main reason why classification on these datasets achieved high predictive ability.
KW - Educational big data
KW - academic performance
KW - classification methods
KW - learning analytics
KW - learning logs
UR - http://www.scopus.com/inward/record.url?scp=85068507726&partnerID=8YFLogxK
U2 - 10.1080/10494820.2019.1636086
DO - 10.1080/10494820.2019.1636086
M3 - Journal article
AN - SCOPUS:85068507726
SN - 1049-4820
VL - 28
SP - 206
EP - 230
JO - Interactive Learning Environments
JF - Interactive Learning Environments
IS - 2
ER -