The Kolipsi Corpus Family: Resources for Learner Corpus Research in Italian and German

IJCoL (2023)

Abstract
This article describes the Kolipsi Corpus Family (KCF), a collection of eight related resources for learner corpus research in German and Italian. The KCF supports the study of second language (L2) acquisition of Italian and German in upper secondary schools. It subsumes four L2 corpora with comparable corpus design (with respect to data collection, writing tasks, additional metadata, annotation and processing), portraying two homogeneous learner groups and their learner varieties. The corpora are representative of language learners in the multilingual Italian province of South Tyrol, where both languages are taught daily. The L2 corpora were collected at two different points in time, in 2007 (Kolipsi-1) and 2014 (Kolipsi-2), and all texts were labeled with CEFR levels to allow comparisons of proficiency levels across time. L2 German texts were collected in schools with Italian as the main language of instruction, whereas L2 Italian texts were collected in schools with German as the main language of instruction. Additional resources within the KCF allow researchers to compare the students’ language competences in their L2 with the language competences in their first language (L1) in a different task (Kolipsi-Matura) and with similarly aged L1 writers performing the same task (Kolipsi-1-L1). All texts are freely available to the scientific community. Access to the data is granted via an ANNIS search interface and via the Eurac Research CLARIN Repository, from which corpus data can be downloaded in various formats.