Knowledge-Guided Bayesian Support Vector Machine for High-Dimensional Data with Application to Analysis of Genomics Data.

2018 IEEE International Conference on Big Data (Big Data), 2018

Abstract
The support vector machine (SVM) is a popular classification method for the analysis of a wide range of data, including big data. Many SVM methods with feature selection have been developed under frequentist regularization or Bayesian shrinkage frameworks. In recent years, the importance of incorporating a priori biological knowledge into the statistical analysis of genomic data has also been recognized; one example is gene pathway information, which stems from the gene regulatory network. In this article, we propose a new Bayesian SVM approach in which feature selection is guided by knowledge of the graphical structure among predictors. The proposed method uses a spike-and-slab prior for feature selection, combined with an Ising prior that encourages group-wise selection of predictors adjacent to each other on the known graph. A Gibbs sampling algorithm is used for Bayesian inference. The performance of our method is evaluated and compared with existing SVM methods in terms of prediction and feature selection in extensive simulation settings. In addition, our method is illustrated in the analysis of genomic data from a cancer study, demonstrating its advantage in generating biologically meaningful results and identifying potentially important features.
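To make the role of the Ising prior concrete, the following is a minimal sketch of how such a prior can favor selecting predictors that are adjacent on a known graph. The abstract does not give the paper's exact parametrization, so the form used here, log p(gamma) = a * sum_j gamma_j + b * sum over edges (j,k) of gamma_j * gamma_k up to a constant, is an assumption based on a common formulation. The function names and the values of `a` and `b` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ising_log_prior(gamma, adjacency, a=-2.0, b=1.0):
    """Unnormalized log-probability of a binary inclusion vector.

    Assumed form (not taken from the paper):
        a * sum_j gamma_j + b * sum_{(j,k) in E} gamma_j * gamma_k
    a < 0 encourages sparsity; b > 0 rewards selecting graph neighbors together.
    """
    gamma = np.asarray(gamma, dtype=float)
    # Use the upper triangle so each undirected edge is counted once.
    edge_term = gamma @ np.triu(adjacency, k=1) @ gamma
    return a * gamma.sum() + b * edge_term


def prior_inclusion_prob(j, gamma, adjacency, a=-2.0, b=1.0):
    """Prior-only conditional P(gamma_j = 1 | other indicators).

    A full Gibbs update would also include the likelihood contribution;
    this sketch shows only the part contributed by the Ising prior.
    """
    gamma = np.asarray(gamma, dtype=float)
    # adjacency[j, j] is 0, so gamma[j] itself does not enter the sum.
    log_odds = a + b * (adjacency[j] @ gamma)
    return 1.0 / (1.0 + np.exp(-log_odds))


# Toy chain graph 0 - 1 - 2 - 3 standing in for a pathway graph.
adjacency = np.zeros((4, 4))
for j in range(3):
    adjacency[j, j + 1] = adjacency[j + 1, j] = 1.0

adjacent_pair = ising_log_prior([1, 1, 0, 0], adjacency)   # two neighbors selected
separated_pair = ising_log_prior([1, 0, 0, 1], adjacency)  # two non-neighbors selected

p_with_neighbor = prior_inclusion_prob(1, [1, 0, 0, 0], adjacency)
p_without_neighbor = prior_inclusion_prob(1, [0, 0, 0, 0], adjacency)
```

Under these assumed settings, selecting two adjacent predictors receives a higher prior log-probability than selecting two unconnected ones, and a predictor's prior inclusion probability rises when one of its neighbors is already selected. This is the group-wise behavior the abstract attributes to the Ising prior.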
Keywords
Bayesian support vector machine, Ising prior, spike-and-slab prior, knowledge-guided, pathway graph information