Augmenting assessment with AI coding of online student discourse: A question of reliability

Computers and Education: Artificial Intelligence (2024)

Abstract
Many generative Artificial Intelligence (AI) tools are currently being integrated into the educational technology landscape available to instructors. Our paper examines the potential and challenges of using Large Language Models (LLMs) to code student-generated content in online discussions against intended learning outcomes, and how instructors could use this coding to assess the intended and enacted learning design. If instructors were to rely on LLMs as a means of assessment, the ability of these models to code the data reliably and accurately is crucial. Employing a diverse set of LLMs from the GPT family and several prompting techniques on an asynchronous online discussion dataset from a blended-learning bachelor-level course, our research examines the reliability of AI-supported coding in educational research. Findings reveal that while AI-supported coding is efficient, achieving substantial or even moderate agreement with human coding for nuanced, context-dependent codes is challenging. Moreover, the high cost, token limits, and the advanced skills needed to write API scripts may limit the usability of AI-driven coding. Finally, implementation would require class-specific parameterization and may not be feasible at scale. Our study underscores the importance of transparency in AI coding methodologies and the need for a hybrid approach that integrates human judgment to ensure data accuracy and interpretability. In addition, it contributes to the knowledge base on the reliability of LLMs for coding real, small datasets with the complex coding schemes common in instructors' practice and explores the potential and challenges of using these models for assessment purposes.
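Although the abstract does not reproduce the authors' scripts, the workflow it describes, prompting a GPT-family model through the API to assign codes to discussion posts and then checking agreement with human coders, can be sketched as below. This is a minimal illustration only: the model name, coding scheme, prompt wording, and sample posts are assumptions for demonstration, not the paper's materials.

```python
# Minimal sketch of API-based AI coding plus a reliability check against human
# coding. The code labels, prompt, model, and example posts are hypothetical.
from openai import OpenAI
from sklearn.metrics import cohen_kappa_score

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical coding scheme tied to intended learning outcomes.
CODES = ["conceptual_understanding", "application", "off_topic"]

def code_post(post_text: str) -> str:
    """Ask the model to label one student post with exactly one code."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; the study compares several GPT-family models
        temperature=0,        # deterministic output makes reliability comparisons cleaner
        messages=[
            {"role": "system",
             "content": ("You are coding student discussion posts against intended "
                         "learning outcomes. Reply with exactly one label from: "
                         + ", ".join(CODES))},
            {"role": "user", "content": post_text},
        ],
    )
    return response.choices[0].message.content.strip()

# Toy comparison of AI labels with human labels for the same posts.
posts = ["I think photosynthesis stores energy in glucose.", "lol same"]
human_labels = ["conceptual_understanding", "off_topic"]
ai_labels = [code_post(p) for p in posts]
print("Cohen's kappa:", cohen_kappa_score(human_labels, ai_labels))
```

Constraining the reply to a fixed label set and computing an agreement statistic such as Cohen's kappa is one common way to quantify AI-human coding reliability; the paper's own prompting techniques and agreement measures may differ.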
Keywords
Artificial intelligence, Data coding, ChatGPT, Large language models, Learning analytics, AI-driven assessment