Automatic Annotation of Grammaticality in Child-Caregiver Conversations
Crossref (2024)
Abstract
The acquisition of grammar has been a central question to adjudicate between
theories of language acquisition. In order to conduct faster, more
reproducible, and larger-scale corpus studies on grammaticality in
child-caregiver conversations, tools for automatic annotation can offer an
effective alternative to tedious manual annotation. We propose a coding scheme
for context-dependent grammaticality in child-caregiver conversations and
annotate more than 4,000 utterances from a large corpus of transcribed
conversations. Based on these annotations, we train and evaluate a range of NLP
models. Our results show that fine-tuned Transformer-based models perform best,
achieving human inter-annotator agreement levels. As a first application and
sanity check of this tool, we use the trained models to annotate a corpus
almost two orders of magnitude larger than the manually annotated data and
verify that children's grammaticality shows a steady increase with age. This
work contributes to the growing literature on applying state-of-the-art NLP
methods to help study child language acquisition at scale.