UD Gheg Pear Stories: An annotated treebank of Gheg Albanian as spoken in Switzerland

Christian Ebert, Adrian Kuqi, Paul Widmer, Barbara Sonnenhauser

Research Square (2022)

Abstract: This paper presents the Gheg Albanian Pear Stories treebank, to be released in November 2022, which is the first resource for Gheg in the Universal Dependencies (UD) treebank collection (Nivre et al. 2020). It also provides a special combination of spoken modality and heritage language, both of which are underrepresented in UD and in corpus resources in general. We provide a short description of the grammatical features of Gheg and of how they map to categories in the UD annotation scheme, in contrast with the Standard Albanian resources of Kote et al. (2019) and Toska, Nivre, and Zeman (2020). Special attention is given to the challenges arising from the spoken modality and the multilingual context, such as disfluency, repair, and code-switching.
Keywords: Gheg Albanian, annotated treebank, Switzerland