Learning Analytics for Predicting Student Performance in Online Learning Environments
DOI: https://doi.org/10.70148/rise.v3i4.11
Keywords: Learning Analytics, Online Learning Environments, Student Engagement, Student Performance Prediction
Abstract
The rapid growth of online learning has led Learning Management Systems (LMS) to accumulate large volumes of student interaction data. However, educational institutions often lack systematic ways to use these data to identify students at risk of academic failure. This paper addresses that gap by building and testing models that predict students' academic performance through learning analytics. Using a quantitative research design, we analyzed the interaction logs, assessment data, and recorded engagement of 384 university students enrolled in a semester-long online course delivered through Moodle. Key behavioral variables, including number of logins, timeliness of assignment submissions, discussion forum activity, and video lecture viewing, were extracted and used to train and compare four machine learning models: Logistic Regression, Decision Tree, Random Forest, and Support Vector Machine. Model performance was measured with accuracy, precision, recall, F1-score, and ROC-AUC. Random Forest achieved the highest predictive accuracy (87.0%), with an ROC-AUC of 0.91, and assignment submission patterns and regular login frequency were the strongest predictors of final academic achievement. These findings highlight the potential of learning analytics to support data-driven early warning systems that enable timely pedagogical interventions. The paper contributes to the educational data mining literature with empirical evidence that behavioral indicators derived from conventional LMS logs are strongly predictive of student outcomes, and it offers practical implications for instructors, instructional designers, and institutional policymakers seeking to improve student learning and tailor support in online learning settings.
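The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the label coding (1 = at-risk student, 0 = not at-risk), the 0.5 decision threshold, and the eight toy records are all assumptions made for illustration, and no figure from the study is reproduced here. The sketch uses only the Python standard library and computes ROC-AUC as the share of (positive, negative) pairs that the model ranks correctly.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels.

    Assumption: label 1 marks the positive class (e.g. an at-risk student).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores for eight students.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank students. This is one reason the two kinds of measure are often reported together.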
Copyright (c) 2025 Sayed Mahbub Hasan Amiri (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.








