Engagement Recognition in Online Learning Based on an Improved Video Vision Transformer.

Zijian Guo, Zhuoyi Zhou, Jiahui Pan, Yan Liang

IJCNN (2023)

Abstract
Online learning has gained wide attention and adoption due to its flexibility and convenience. However, because teachers and students are separated in time and space, teachers cannot easily perceive students' level of engagement, which reduces teaching effectiveness. Automatic detection of student engagement is an effective way to address this problem: it gives teachers timely feedback and allows them to adjust the teaching schedule. In this paper, the Transformer is applied to engagement recognition for the first time, and a novel network based on an improved video vision transformer (ViViT) is proposed to detect student engagement. A new encoder, the Transformer Encoder with Low Complexity (TELC), is proposed. It adopts unit force operated attention (UFO-attention) to replace the nonlinear softmax of the original self-attention in standard ViViT, and Patch Merger to fuse the input patches, allowing the network to significantly reduce computational complexity while improving performance. The proposed method is evaluated on the Dataset for Affective States in E-learning Environments (DAiSEE) and achieves an accuracy of 63.91% on the four-level classification task, surpassing state-of-the-art methods. The experimental results demonstrate the effectiveness of our method, which is well suited to practical online-learning applications.
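The paper's exact TELC layer, shapes, and hyper-parameters are not given in this abstract, so the NumPy sketch below only illustrates the two ingredients it names, under stated assumptions: a Patch Merger that fuses N input tokens into M learned tokens via a softmax over the token axis, and a simplified UFO-style attention in which the softmax of standard self-attention is replaced by L2 cross-normalization, making the token-mixing step linear in the number of tokens. Function names, shapes, and the `l2norm` helper are illustrative, not the authors' implementation.

```python
import numpy as np

def l2norm(x, axis=-1, eps=1e-6):
    """L2-normalize along an axis (stand-in for UFO-ViT's XNorm)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def patch_merger(x, w):
    """Fuse n input tokens into m learned tokens.

    x: (n_tokens, d) input patches; w: (m_out, d) learnable queries.
    Returns (m_out, d): each output token is a softmax-weighted
    mixture of the input tokens.
    """
    scores = w @ x.T                                    # (m_out, n_tokens)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)             # softmax over tokens
    return attn @ x                                     # (m_out, d)

def ufo_style_attention(x, wq, wk, wv):
    """Simplified UFO-attention: softmax replaced by L2 normalization.

    Computing K^T V first gives a (d, d) matrix, so cost is linear
    in the number of tokens rather than quadratic.
    """
    q, k, v = x @ wq, x @ wk, x @ wv                    # each (n_tokens, d)
    kv = k.T @ v                                        # (d, d)
    return l2norm(q, axis=-1) @ l2norm(kv, axis=0)      # (n_tokens, d)
```

The softmax in `patch_merger` runs over the input-token axis, so each of the M output tokens is a convex combination of all N inputs; every subsequent encoder block then operates on only M tokens, which is where the complexity reduction comes from.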
Keywords
automatic detection, DAiSEE, dataset for affective states in e-learning environments, engagement recognition, four-level classification task, improved video vision transformer, online learning, original self-attention, patch merger, standard ViViT, student engagement, teaching schedule, TELC, timely feedback, transformer encoder with low complexity, UFO-attention, unit force operated attention, ViViT