What’s behind the correct evaluation of Machine Learning-based sequencers?
The iTalk2Learn consortium has recently been discussing how to properly evaluate Machine Learning-based sequencers. Such an evaluation requires large amounts of data, large-scale experiments, and the consideration of several different evaluation measures. These constraints make building an ad-hoc Intelligent Tutoring System (ITS) infeasible and call for early integration into an already existing ITS, which possesses a large pool of tasks to be sequenced.
However, such systems were not designed to be combined with Machine Learning methods and require several adjustments. As a consequence, more than half of the components based on recommender technology are never evaluated in an online experiment.
iTalk2Learn at AAAI-2015
For the 29th AAAI Conference on Artificial Intelligence (AAAI 2015), and thanks to a collaboration with Whizz Education and the University of Hildesheim, we submitted a paper entitled “Integration and Evaluation of a Matrix Factorization Sequencer in Large Commercial Intelligent Tutoring Systems” to the special track on Integrated Systems. The scope of the call was to present the integration of Artificial Intelligence modules into larger systems and to show the challenges involved.
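For readers curious about the technique named in the title, here is a minimal sketch of the idea behind a matrix factorization sequencer: students and tasks are embedded as latent vectors whose dot product predicts a performance score, and the sequencer then ranks the remaining tasks by that prediction. The toy data, factor dimensionality, and hyperparameters below are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal matrix factorization sketch (illustrative only): predict a
# student's performance on tasks from sparse observed scores in [0, 1].
import numpy as np

rng = np.random.default_rng(0)

# Toy observations: (student_id, task_id, score) -- assumed data.
observations = [(0, 0, 0.9), (0, 1, 0.4), (1, 1, 0.8),
                (1, 2, 0.3), (2, 0, 0.7), (2, 2, 0.6)]
n_students, n_tasks, n_factors = 3, 3, 2

# Latent factor matrices for students (S) and tasks (T).
S = rng.normal(scale=0.1, size=(n_students, n_factors))
T = rng.normal(scale=0.1, size=(n_tasks, n_factors))

lr, reg = 0.05, 0.02  # learning rate and L2 regularization strength
for epoch in range(200):
    for s, t, score in observations:
        err = score - S[s] @ T[t]               # prediction error
        s_old = S[s].copy()
        S[s] += lr * (err * T[t] - reg * S[s])  # stochastic gradient
        T[t] += lr * (err * s_old - reg * T[t]) # descent updates

def next_task(student, seen, target=0.7):
    """Rank unseen tasks by predicted score and pick the one whose
    prediction is closest to a target success rate."""
    preds = {t: float(S[student] @ T[t])
             for t in range(n_tasks) if t not in seen}
    return min(preds, key=lambda t: abs(preds[t] - target))

print(next_task(student=0, seen={0}))  # suggest a next task for student 0
```

Picking the task whose predicted score is closest to a target value, rather than simply the highest, reflects the common intuition that the next task should be neither too easy nor too hard; the policy actually used in the paper may well differ.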
During my presentation at the conference in Austin (Texas, USA), I had the chance to discuss with the audience how little effort has been dedicated to developing machine-learning-driven components that can be easily integrated and evaluated online. This is, in my opinion, a major obstacle to innovation.
Moreover, the evaluation requires considering different success indicators, which cover the different interests of the partners involved and go beyond simple model error measures. For this reason, we collected the results of perceived-experience questionnaires, measured learning gains, and performed an exploratory data analysis.
During the poster sessions, I also had the chance to talk about iTalk2Learn’s achievements with many fellow AAAI attendees working in the field of user modeling. It was interesting to see how the same challenges recur across different applications.
One very interesting discussion concerned the utility of user modeling components, all of which first require data about the user, the so-called cold-start problem. This problem is also common in Intelligent Tutoring Systems, where initial data about a student are needed before a personalized prediction can be computed. The discussion resulted in one of our latest submissions, which we hope to be able to share with you soon!
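For the technically inclined, the issue is easy to see in the factorization sketch above: a brand-new student has no observed interactions, so no latent vector can be fitted for them. A simple fallback, sketched below purely as an illustrative assumption and not as the approach from our submission, is to predict from per-task averages until enough data about the student arrive.

```python
# Cold-start fallback sketch (an illustrative assumption, not the
# approach from our submission): a personalized model cannot score a
# student it has never seen, so fall back to per-task average scores.
from collections import defaultdict

# Toy observed scores: (student_id, task_id, score) -- assumed data.
observations = [(0, 0, 0.9), (0, 1, 0.4), (1, 1, 0.8), (1, 2, 0.3)]

sums, counts = defaultdict(float), defaultdict(int)
for _, task, score in observations:
    sums[task] += score
    counts[task] += 1
task_means = {t: sums[t] / counts[t] for t in sums}
seen_students = {s for s, _, _ in observations}

def predict(student, task, personalized):
    """Use the personalized model when the student has history;
    otherwise return the task's mean observed score."""
    if student in seen_students:
        return personalized(student, task)
    return task_means.get(task, 0.5)  # global default for unseen tasks

# Student 99 has no history, so the fallback returns task 1's mean (0.6).
print(predict(99, 1, personalized=lambda s, t: 0.0))
```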