Global Open Resources and Information for Language and Linguistic Analysis (GORILLA).

LREC 2016 - Tenth International Conference on Language Resources and Evaluation (2016)

Abstract
The Global Open Resources and Information for Language and Linguistic Analysis (GORILLA) infrastructure was created as a resource that bridges disciplines such as documentary, theoretical, and corpus linguistics; speech and language technologies; and digital language archiving services. GORILLA is designed as an interface between digital language archive services and language data producers, and it addresses various problems of common digital language archive infrastructures. At the same time, it serves the speech and language technology communities by providing a platform to create and share speech and language data from low-resourced and endangered languages. It hosts an initial collection of language models for speech and natural language processing (NLP), as well as technologies and software tools for corpus creation and annotation. GORILLA is designed to address the Transcription Bottleneck in language documentation and, at the same time, to provide solutions to the general Language Resource Bottleneck in speech and language technologies. It does so by facilitating cooperation between documentary and theoretical linguistics on the one hand and speech and language technology research and development on the other, in particular for low-resourced and endangered languages.
Keywords
Speech and Language Corpora, NLP, Low-resourced Languages, Transcription Bottleneck