Machine learning for flow cytometry data analysis

Machine learning for flow cytometry data analysis(2011)

引用 23|浏览4
暂无评分
摘要
This thesis concerns the problem of automatic flow cytometry data analysis. Flow cytometry is a technique for rapid cell analysis and widely used in many biomedical and clinical laboratories. Quantitative measurements from a flow cytometer provide rich information about various physical and chemical characteristics of a large number of cells. In clinical applications, flow cytometry data is visualized on a sequence of two-dimensional scatter plots and analyzed through a manual process called "gating". This conventional analysis process requires a large amount of time and labor and is highly subjective and inefficient. In this thesis, we present novel machine learning methods for flow cytometry data analysis to address these issues. We first begin by a method for generating a high dimensional flow cytometry dataset from multiple low dimensional datasets. We present an imputation algorithm based on clustering and show that it improves upon a simple nearest neighbor based approach that often induces spurious clusters in the imputed data. This technique enables the analysis of multi-dimensional flow cytometry data beyond the fundamental measurement limits of instruments. We then present two machine learning methods for automatic gating problems. Gating is a process of identifying interesting subsets of cell populations. Pathologists make clinical decisions by inspecting the results from gating. Unfortunately, this process is performed manually in most clinical settings and poses many challenges in high-throughput analysis. The first approach is an unsupervised learning technique based on multivariate mixture models. Since measurements from a flow cytometer are often censored and truncated, standard model-fitting algorithms can cause biases and lead to poor gating results. We propose novel algorithms for fitting multivariate Gaussian mixture models to data that is truncated, censored, or truncated and censored. Our second approach is a transfer learning technique combined with the low-density separation principle. Unlike conventional unsupervised learning approaches, this method can leverage existing datasets previously gated by domain experts to automatically gate a new flow cytometry data. Moreover, the proposed algorithm can adaptively account for biological variations in multiple datasets. We demonstrate these techniques on clinical flow cytometry data and evaluate their effectiveness.
更多
查看译文
关键词
flow cytometer,clinical flow cytometry data,flow cytometry data analysis,flow cytometry,automatic flow cytometry data,imputed data,new flow cytometry data,flow cytometry data,multi-dimensional flow cytometry data,high dimensional flow cytometry
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要