Scalable Bayesian Nonparametric Clustering and Classification

Journal of Computational and Graphical Statistics (2020)

Abstract
We develop a scalable multistep Monte Carlo algorithm for inference under a large class of nonparametric Bayesian models for clustering and classification. Each step is "embarrassingly parallel" and can be implemented using the same Markov chain Monte Carlo sampler. The simplicity and generality of our approach make inference for a wide range of Bayesian nonparametric mixture models applicable to large datasets. Specifically, we apply the approach to inference under a product partition model with regression on covariates. We show results for inference with two motivating datasets: a large set of electronic health records and a bank telemarketing dataset. We find interesting clusters and competitive classification performance relative to other widely used competing classifiers. Supplementary materials for this article are available online.
Keywords
Electronic health records, Nonconjugate models, Parallel computing, Product partition models