Comparing Human Text Classification Performance and Explainability with Large Language and Machine Learning Models Using Eye-Tracking

Jeevithashree Divya Venkatesh, Aparajita Jaiswal, Gaurav Nanda

Crossref (2024)

Abstract
To understand the alignment between the reasoning of humans and artificial intelligence (AI) models, this empirical study compared human text classification performance and explainability against those of a traditional machine learning (ML) model and a large language model (LLM). A domain-specific noisy textual dataset of injury narratives had to be classified into six cause-of-injury codes. While the ML model was trained on pre-labelled injury narratives, the LLM and humans did not receive any specialized training. The explainability of the different approaches was compared using the words they focused on during classification. These words were identified using eye-tracking for humans, the explainable AI approach LIME for the ML model, and prompts for the LLM. The ML model's classification performance was better than that of the LLM and humans, both overall and particularly for complicated, hard-to-classify narratives. The top-3 words used by the ML model and LLM for classification agreed with the humans' words to a greater extent than later-ranked words did.
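The top-k word agreement described above can be sketched as a simple rank-overlap measure. The word lists and the overlap metric below are illustrative assumptions, not the study's data or exact methodology:

```python
# Hedged sketch: per-rank agreement between the words humans fixated on
# (via eye-tracking) and the words a model weighted most, in the spirit of
# the paper's top-3 comparison. Word lists are invented for illustration.

def rank_agreement(human_words, model_words, k):
    """Fraction of the model's top-k words that also appear in the humans' top-k."""
    human_top = set(human_words[:k])
    return sum(w in human_top for w in model_words[:k]) / k

# Hypothetical ranked word lists for one injury narrative
human = ["ladder", "fell", "floor", "injury", "reported"]
model = ["fell", "ladder", "slipped", "floor", "work"]

top3 = rank_agreement(human, model, 3)  # agreement among the top-3 words
top5 = rank_agreement(human, model, 5)  # agreement when later words are included
print(top3, top5)
```

With these toy lists, agreement is higher at the top-3 cutoff than at top-5, mirroring the abstract's observation that agreement with humans was stronger for the earliest-ranked words.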