Predicting students’ academic performance using multiple linear regression and principal component analysis

Stephen J.H. Yang, Owen H.T. Lu, Anna Yu Qing Huang, Jeff Cheng Hsu Huang, Hiroaki Ogata, Albert J.Q. Lin

Research output: Contribution to journal › Article › peer-review

62 Scopus citations


With the rise of big data analytics, learning analytics has become a major trend for improving the quality of education. Learning analytics is a methodology for helping students to succeed in the classroom; the principle is to predict students’ academic performance at an early stage and thus provide them with timely assistance. Accordingly, this study used multiple linear regression (MLR), a popular method of predicting students’ academic performance, to establish a prediction model. Moreover, we combined MLR with principal component analysis (PCA) to improve the predictive accuracy of the model. Traditional MLR has certain drawbacks; specifically, the coefficient of determination (R²) and mean square error (MSE) measures and the quantile-quantile plot (Q-Q plot) technique cannot evaluate the predictive performance and accuracy of MLR. Therefore, we propose predictive MSE (pMSE) and predictive mean absolute percentage correction (pMAPC) measures for determining the predictive performance and accuracy of the regression model, respectively. Analysis results revealed that the proposed model for predicting students’ academic performance could obtain optimal pMSE and pMAPC values by using six components obtained from PCA.
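The pipeline described in the abstract (PCA for dimensionality reduction, then MLR on the retained components, scored on held-out data) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not give formulas for pMSE or pMAPC, so the code assumes pMSE is the mean squared error on unseen data and pMAPC is one minus the mean absolute percentage error on unseen data. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def pca_fit(X_train, n_components):
    """Learn standardization parameters and principal axes from training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    Z = (X_train - mean) / std
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T


def pca_transform(X, mean, std, components):
    """Project data onto the principal axes learned from the training set."""
    return ((X - mean) / std) @ components


def mlr_fit(Z, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def mlr_predict(Z, coef):
    return coef[0] + Z @ coef[1:]


def pmse(y_true, y_pred):
    """Assumed definition: mean squared error on held-out data."""
    return float(np.mean((y_true - y_pred) ** 2))


def pmapc(y_true, y_pred):
    """Assumed definition: 1 - mean absolute percentage error on held-out data.

    Requires every value in y_true to be non-zero.
    """
    return float(1.0 - np.mean(np.abs((y_pred - y_true) / y_true)))


def evaluate(X_train, y_train, X_test, y_test, n_components):
    """Fit PCA + MLR on the training split and score it on the test split."""
    mean, std, comps = pca_fit(X_train, n_components)
    coef = mlr_fit(pca_transform(X_train, mean, std, comps), y_train)
    y_pred = mlr_predict(pca_transform(X_test, mean, std, comps), coef)
    return pmse(y_test, y_pred), pmapc(y_test, y_pred)


if __name__ == "__main__":
    # Synthetic stand-in for learning-activity features and final scores.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = 70 + X @ rng.normal(size=10) + rng.normal(scale=2.0, size=200)
    X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]
    for k in range(1, 11):
        mse_k, mapc_k = evaluate(X_tr, y_tr, X_te, y_te, k)
        print(f"components={k:2d}  pMSE={mse_k:8.3f}  pMAPC={mapc_k:.4f}")
```

Sweeping the number of retained components and comparing held-out scores, as in the loop above, is one plausible way to arrive at a choice such as the six components reported in the abstract; the study's actual selection procedure may differ.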

Original language: English
Pages (from-to): 170-176
Number of pages: 7
Journal: Journal of Information Processing
State: Published - Jan 2018


  • Learning analytics
  • Multiple linear regression
  • Principal component analysis

