Early Steps Toward Web-Scale Information Extraction With Lodie

AI Magazine(2015)

Abstract
Information extraction (IE) is the technique for transforming unstructured textual data into a structured representation that can be understood by machines. The exponential growth of the web generates an exceptional quantity of data for which automatic knowledge capture is essential. This work describes the methodology for web-scale information extraction in the Linked Open Data Information Extraction (LODIE) project and highlights results from the early experiments carried out in the initial phase of the project. LODIE aims to develop information-extraction techniques able to scale at web level and adapt to user information needs. The core idea behind LODIE is the use of linked open data, a very large-scale information resource, as a ground-breaking solution for IE, one that provides invaluable annotated data on a growing number of domains. This article has two objectives: first, to describe the LODIE project as a whole and depict its general challenges and directions; and second, to describe some initial steps taken toward the general solution, focusing on a specific IE subtask, wrapper induction.