Commonsense Temporal Action Knowledge (CoTAK) Dataset

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023)

Abstract
This paper presents CoTAK (COmmonsense Temporal Action Knowledge), a publicly available, large-scale dataset of short action-describing sentences manually annotated with temporal commonsense knowledge. The dataset consists of over 300K instructional sentences extracted from WikiHow, each annotated with commonsense temporal labels that capture implicitly understood information about the described action, including approximately how long the action takes to perform and approximately how long its effects last. For short-duration actions labeled as taking seconds or minutes, which are relevant to automated task planning (e.g., in robotics applications), the dataset also provides scalar values that label more precisely how long the actions take to perform. Experimental results demonstrate that state-of-the-art machine learning techniques, such as fine-tuning large language models, can predict commonsense temporal knowledge from the dataset with up to 80% accuracy, indicating the utility of the resource and its applicability to generating commonsense temporal knowledge for various applications.
Keywords
dataset, temporal commonsense knowledge, text classification