The Problem Setting

SpringerBriefs in Computer Science (2021)

Abstract
We first introduce the domain of this book: low-resource social media text. The domain encompasses some of the most widely used languages in the world and a wide variety of tasks and applications. We explore the socio-technical conditions that give rise to such text and how they influence expression online. Examples and statistics are provided from various social media data sets and recent research. We then cover attempts to bridge the resource gap between world languages like English and low-resource languages, with special attention to the data acquisition strategies employed by researchers. This chapter will help NLP practitioners understand why it is important to analyze the low-resource components of corpora from various societies, how ignoring those components can skew results, and how to go about addressing them. It also offers a broad set of examples and statistics to reinforce the importance of low-resource social media text mining.
Keywords
Low-resource NLP, Multilinguality, L1 language, L2 language, Language preference, Language and geography, Bilingualism