Holistic query evaluation over information extraction pipelines
Hosted Content(2017)
摘要
AbstractWe introduce holistic in-database query processing over information extraction pipelines. This requires considering the joint conditional distribution over generic Conditional Random Fields that uses factor graphs to encode extraction tasks. Our approach introduces Canopy Factor Graphs, a novel probabilistic model for effectively capturing the joint conditional distribution given a canopy clustering of the data, and special query operators for retrieving resolution information. Since inference on such models is intractable, we introduce an approximate technique for query processing and optimizations that cut across the integrated tasks for reducing the required processing time. Effectiveness and scalability are verified through an extensive experimental evaluation using real and synthetic data.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络