Characterizing Practices, Limitations, and Opportunities Related to Text Information Extraction Workflows: A Human-in-the-loop Perspective

ACM Conference on Human Factors in Computing Systems (CHI), 2022

Abstract
Information extraction (IE) approaches often play a pivotal role in text analysis and require significant human intervention. Therefore, a deeper understanding of existing IE practices and related challenges from a human-in-the-loop perspective is warranted. In this work, we conducted semi-structured interviews in an industrial environment and analyzed the reported IE approaches and limitations. We observed that data science workers often follow an iterative task model consisting of information foraging and sensemaking loops across all the phases of an IE workflow. The task model is generalizable and captures diverse goals across these phases (e.g., data preparation, modeling, evaluation). We found several limitations in both foraging (e.g., data exploration) and sensemaking (e.g., qualitative debugging) loops stemming from a lack of adherence to existing cognitive engineering principles. Moreover, we identified that due to the iterative nature of an IE workflow, the requirement of provenance is often implied but rarely supported by existing systems. Based on these findings, we discuss design implications for supporting IE workflows and future research directions.
Keywords
Information extraction, Data science workflows, Human-AI collaboration