Parallelizing the CKY and Earley Parsing Algorithms

Silas Boyd-Wickizer, Austin Clements, Neha Narula

semanticscholar(2012)

Abstract
Context-free parsing algorithms are among the oldest and best-understood aspects of natural language processing. Efforts to reduce the time complexity of these algorithms have produced two particularly popular algorithms: the Cocke-Kasami-Younger (CKY) bottom-up parsing algorithm [5, 9] and the Earley top-down parsing algorithm [2, 3]. Despite these efforts, parsing remains time-consuming because typical natural language grammars are very large and human language tends to produce highly ambiguous sentences with many possible parses, even when the sentences seem straightforward. While ambiguity is the bane of parsing performance for these algorithms, it is also a perfect opportunity to take advantage of recent developments in multicore hardware. As of this writing, many general-purpose 16-processor machines exist and the number of processors is rapidly increasing. To take advantage of such hardware, however, algorithms must be redesigned to divide work into largely independent parts. We examine both the CKY algorithm and the Earley algorithm in the context of modern multi-processor hardware and modify both algorithms to exploit the parallelism available on such machines. We demonstrate that CKY is highly amenable to parallelization because it consists of many largely independent operations, allowing us to avoid expensive synchronization, create large amounts of fine-grained parallelism, and organize operations to take advantage of high-speed on-processor cache memory. We also show how to extract parallelism from the Earley algorithm, though it suffers from much harder synchronization requirements.
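To make the claim about independent operations in CKY concrete, the following is a minimal sketch and not the authors' implementation. It assumes a grammar in Chomsky normal form and relies on the standard CKY property that a chart cell for a span depends only on cells for strictly shorter spans, so all cells of the same span length can be filled concurrently with a single barrier between lengths. The toy grammar, the names `BINARY`, `LEXICAL`, `fill_cell`, and `cky_parallel`, and the use of a thread pool are all illustrative assumptions. Python threads will not give a real speedup here because of the global interpreter lock; the sketch only shows the dependency structure.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy grammar in Chomsky normal form (not from the paper).
BINARY = {  # (B, C) -> set of A such that A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICAL = {  # word -> set of A such that A -> word
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "saw": {"V"},
}


def fill_cell(chart, i, j):
    """Return the nonterminals that span words[i:j].

    Only reads cells for strictly shorter spans, which are already final,
    so calls for different cells of the same length need no locking.
    """
    found = set()
    for k in range(i + 1, j):  # every split point of the span
        for b in chart[i][k]:
            for c in chart[k][j]:
                found |= BINARY.get((b, c), set())
    return found


def cky_parallel(words, workers=4):
    """Return the set of nonterminals that derive the whole sentence."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(word, ()))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for length in range(2, n + 1):
            spans = [(i, i + length) for i in range(n - length + 1)]
            # All cells of this length are computed concurrently.
            results = pool.map(lambda span: fill_cell(chart, *span), spans)
            # Consuming the results acts as the barrier between lengths;
            # writes happen here, in the main thread only.
            for (i, j), cell in zip(spans, results):
                chart[i][j] = cell
    return chart[0][n]


if __name__ == "__main__":
    print(cky_parallel("the dog saw the cat".split()))
```

With this toy grammar, `cky_parallel("the dog saw the cat".split())` returns `{"S"}`. Workers never write to the chart, which is one simple way to avoid synchronization inside a diagonal; a real implementation would also have to address load balance and cache behavior, which this sketch ignores.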