The University of Pittsburgh English Language Institute Corpus (PELIC)

International Journal of Learner Corpus Research (2022)

Abstract: This report introduces the University of Pittsburgh English Language Institute Corpus (PELIC; Juffs et al., 2020), a publicly available 4.2-million-word learner corpus of written texts. Collected over seven years in the University of Pittsburgh’s Intensive English Program, these texts were produced by more than 1,100 students with diverse linguistic backgrounds and proficiency levels. Unlike most learner corpora, which are cross-sectional, PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting. This potential is illustrated in an overview of the research conducted to date with these data. The report also describes PELIC’s creation and contents, including how the texts have been managed to facilitate natural language processing. Overall, the corpus contributes to the field of learner corpus research by adding to the pool of freely and publicly available learner corpora, supplemented by a set of Python tools and tutorials for accessing these data.
Keywords: ESL, IEP, longitudinal development, multi-L1 corpus, PELIC