Early Prediction of Children’s Task Completion in a Tablet Tutor using Visual Features

Published in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021): Student Abstract, 2021

Recommended citation: Bikram Boote*, Mansi Agarwal*, and Jack Mostow. "Early Prediction of Children’s Task Completion in a Tablet Tutor using Visual Features." The Thirty-Fifth AAAI Conference on Artificial Intelligence: Student Abstract. AAAI 2021.

Abstract

Intelligent tutoring systems could benefit from human teachers’ ability to monitor students’ affective states by watching them, thereby detecting early warning signs of disengagement in time to prevent it. Toward that goal, this paper describes a method that uses input from a tablet tutor’s user-facing camera to predict whether the student will complete the current activity or disengage from it. Training a disengagement predictor is useful not only in itself but also for identifying visual indicators of negative affective states, even when they don’t lead to non-completion of the task. Unlike prior work that relied on tutor-specific features, the method relies solely on visual features and so could potentially apply to other tutors. We present a deep learning method that makes such predictions using a Long Short-Term Memory (LSTM) model trained with a target replication loss function. We train and test the model on screen-capture videos of children in Tanzania using a tablet tutor to learn basic Swahili literacy and numeracy. With balanced class sizes, we achieve a prediction accuracy of 73.3% when 40% of the activity still remains.
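
The abstract names a target replication loss but does not give its form. As a rough illustration, one common formulation from the sequence-classification literature copies the final label (completed or not) to every time step and blends the average per-step loss with the loss at the last step. The sketch below assumes that formulation with binary cross-entropy. The function names, the weight `alpha`, and the sample probabilities are hypothetical and are not taken from the paper.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def target_replication_loss(step_probs, label, alpha=0.5):
    """Blend the mean per-step loss with the final-step loss.

    step_probs: predicted completion probability at each time step,
                e.g. one LSTM output per video frame window (assumed).
    label:      1 if the activity was completed, 0 otherwise.
    alpha:      hypothetical weight on the replicated (per-step) term.
    """
    if not step_probs:
        raise ValueError("step_probs must contain at least one prediction")
    per_step = [binary_cross_entropy(p, label) for p in step_probs]
    mean_over_steps = sum(per_step) / len(per_step)
    final_step = per_step[-1]
    return alpha * mean_over_steps + (1.0 - alpha) * final_step


# Hypothetical predictions that grow more confident over the activity.
example_loss = target_replication_loss([0.5, 0.6, 0.8, 0.9], label=1, alpha=0.5)
```

Penalizing every intermediate prediction, and not only the last one, pushes the model to commit to the right answer early, which matches the goal of predicting completion while part of the activity still remains.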