ENFrame: A Framework for Processing Probabilistic Data.

ACM Trans. Database Syst. (2016)

Abstract
This article introduces ENFrame, a framework for processing probabilistic data. Using ENFrame, users can write programs in a fragment of Python with constructs such as loops, list comprehension, aggregate operations on lists, and calls to external database engines. Programs are then interpreted probabilistically by ENFrame. We exemplify ENFrame on three clustering algorithms (k-means, k-medoids, and Markov clustering) and one classification algorithm (k-nearest-neighbour). A key component of ENFrame is an event language to succinctly encode correlations, trace the computation of user programs, and allow for computation of discrete probability distributions for program variables. We propose a family of sequential and concurrent, exact, and approximate algorithms for computing the probability of interconnected events. Experiments with k-medoids clustering and k-nearest-neighbour show orders-of-magnitude improvements of exact processing using ENFrame over naïve processing in each possible world, of approximate over exact, and of concurrent over sequential processing.
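To make the baseline concrete, here is a minimal sketch of the naïve "process each possible world" approach that the abstract compares against. It is not ENFrame's API or event language. The toy data, the assumption that points exist independently, and the names `POINTS`, `nearest_label`, and `label_distribution` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy data: (value, label, probability that the point exists).
# Assumption: points exist independently of one another.
POINTS = [
    (1.0, "a", 0.5),
    (2.0, "b", 0.5),
]


def nearest_label(world, query):
    """Deterministic program: label of the point closest to `query`.

    Returns None when the world contains no points.
    """
    if not world:
        return None
    _, label = min(world, key=lambda point: abs(point[0] - query))
    return label


def label_distribution(points, query):
    """Enumerate all 2^n possible worlds and accumulate outcome probabilities."""
    dist = defaultdict(float)
    for mask in product([False, True], repeat=len(points)):
        prob = 1.0
        world = []
        for present, (value, label, p) in zip(mask, points):
            # Independence assumption: a world's probability is a product.
            prob *= p if present else 1.0 - p
            if present:
                world.append((value, label))
        # Run the ordinary, deterministic program in this world.
        dist[nearest_label(world, query)] += prob
    return dict(dist)
```

Calling `label_distribution(POINTS, 0.0)` assigns probability 0.5 to label "a", 0.25 to label "b", and 0.25 to the empty outcome `None`. The enumeration is exponential in the number of uncertain points, which is the cost the exact, approximate, and concurrent algorithms described in the abstract aim to avoid.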
Keywords
Algorithms, Systems, Data processing, Probabilistic inference, Data mining