Data analysis for a private school network

Building an adaptive learning model

Fast facts about our client

Location: Lebanon

Customer: Sabis

Industry: Education, eLearning

Years of collaboration: 2018 – present

Fast facts

  • 80 million students were analyzed
  • The database used to build the models exceeds 100 terabytes

Challenge

To raise student performance and to predict exam results from attendance, in-class performance, and homework completion, the client decided to apply data science. This approach was feasible because the client already keeps its information on the educational process and on students in digital form. To simulate the educational process and find behavioral patterns in the performance data, the information from all the data sources had to be consolidated and analyzed with standard data analysis techniques.

Implementation

Our client initiated the project to find out

  • whether it is possible to predict the results of the annual evaluation of students;

  • how to identify weak students and how to increase their chances of passing the final test;

  • whether doing homework helps raise student performance.

To answer these questions, our team of data analysts downloaded several tens of terabytes of the client’s archival data from different sources in the database. They studied it using data analysis techniques and found clear patterns in the training system.

After that, we focused on developing an adaptive testing model for students. The core idea is to choose the difficulty of the next question depending on whether the student answered the previous question correctly.
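The idea of adjusting difficulty after each answer can be illustrated with a minimal sketch. This is not the model built for the client: the one-step-up, one-step-down rule, the 1 to 5 difficulty scale, the starting level, and the function names `next_difficulty` and `run_session` are all assumptions made for illustration.

```python
def next_difficulty(current, was_correct, lowest=1, highest=5):
    """Pick the difficulty of the next question.

    Assumed rule (illustrative only): move one level up after a correct
    answer and one level down after a wrong one, staying within the scale.
    """
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_session(answers, start=3):
    """Return the difficulty level presented for each question.

    `answers` is a list of booleans: True if the student answered the
    corresponding question correctly. `start` is a hypothetical default.
    """
    levels = []
    level = start
    for was_correct in answers:
        levels.append(level)                      # question asked at this level
        level = next_difficulty(level, was_correct)
    return levels


# Two correct answers raise the level, a wrong one lowers it again.
print(run_session([True, True, False, True]))
```

A production system would more likely estimate each student's ability statistically rather than step through fixed levels, but the feedback loop between the last answer and the next question is the same.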

To build the model, we used mathematical simulation. Having collected data dumps from all the schools, we tried different models. One part of the data was used to train each model and another part to check it. Once a model had been chosen and we had confirmed that it worked properly, we no longer depended on the terabytes of raw data. We then started testing the model. We precomputed values for each student and, during testing, applied a level placement algorithm based on qualitative analysis of the students’ answers.
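Splitting the data into a training part and a checking part can be sketched as follows. The 80/20 ratio, the fixed random seed, and the helper name `holdout_split` are assumptions for illustration; the source does not state how the split was actually made.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them into (train, check) lists.

    `test_fraction` and `seed` are illustrative defaults, not values
    taken from the project.
    """
    rng = random.Random(seed)          # seeded for a reproducible split
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_check = round(len(shuffled) * test_fraction)
    return shuffled[n_check:], shuffled[:n_check]


# Hypothetical example: 100 record identifiers standing in for student data.
train, check = holdout_split(range(100))
print(len(train), len(check))
```

With per-student data it is often safer to split by student or by school, so that the same student does not appear in both parts.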

Technology stack

Languages

Python, R

ML Libraries

TensorFlow, XGBoost

Visualization

Shiny, Power BI, Tableau
