Differential correlation across subpopulations of single cells in subtypes of acute myeloid leukemia

biorxiv(2022)

Abstract
Mass cytometers can record 40-50 parameters per single cell for millions of cells in a sample, and in particular, for leukemic cells. Many methods have been developed to cluster phenotypically similar cells within cytometry data, but there are fewer methods to visualize activity and interactions of pairs of proteins across these populations. We have developed a workflow for analyzing correlations associated with predefined populations. By clustering blood samples from acute myeloid leukemia (AML) patients and normal controls using an established algorithm, we obtained a minimum spanning tree of clusters of single cells. Using surface marker expression, we identified clusters on the tree that belonged to phenotypes of interest. Next, we computed correlations between pairs of proteins in each cluster. We developed a novel, coherent, probability-based statistic to test differences between vectors of correlation coefficients. By comparing all combinations of the normal controls under the statistic, we created an empirical distribution that provided a conservative measure of differential correlation. Using this empirically derived distribution to define significance, we compared pooled samples from AML subtypes and normal controls to detect differential correlations. Given the structure present within this cytometry data set, we found it natural to consider correlations in this manner rather than aggregating all data and computing a single correlation. Our results have the advantage that we can localize the statistical measure to determine contributions from particular phenotypic populations. Differentially correlated pairs of proteins can be further explored by considering a population's distribution of correlation coefficients or by biaxially plotting protein expression within individual cells in a given population. Our approach leads to a better understanding of the nonlinear relationships that exist in the cytometry data.
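The workflow described in the abstract (per-cluster correlations, a statistic on vectors of correlation coefficients, and an empirical null built from normal controls) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not define the probability-based statistic, so a mean absolute Fisher z difference stands in for it, and the null is built from pairs of controls only. All function names here are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Assumes neither sequence is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_correlations(cells, labels, i, j):
    """Correlate markers i and j separately within each cluster.
    cells: one tuple of marker values per cell; labels: cluster id per cell.
    Returns {cluster_id: r}, i.e. a vector of correlation coefficients."""
    groups = {}
    for row, lab in zip(cells, labels):
        groups.setdefault(lab, []).append(row)
    return {
        lab: pearson([r[i] for r in rows], [r[j] for r in rows])
        for lab, rows in sorted(groups.items())
    }


def fisher_z(r, eps=1e-12):
    """Fisher z-transform, clipped so that r = +/-1 stays finite."""
    return math.atanh(max(min(r, 1 - eps), -1 + eps))


def stand_in_statistic(corr_a, corr_b):
    """ASSUMPTION: placeholder for the paper's statistic.
    Mean absolute Fisher z difference over clusters present in both samples."""
    shared = sorted(set(corr_a) & set(corr_b))
    diffs = [abs(fisher_z(corr_a[c]) - fisher_z(corr_b[c])) for c in shared]
    return sum(diffs) / len(diffs)


def empirical_null(control_corrs):
    """Statistic for every pair of control samples (pairs are an assumption)."""
    return sorted(stand_in_statistic(a, b) for a, b in combinations(control_corrs, 2))


def empirical_p(observed, null):
    """Share of null values at least as large as observed (add-one corrected)."""
    return (1 + sum(v >= observed for v in null)) / (1 + len(null))
```

Keeping one coefficient per cluster, rather than one pooled coefficient, is what allows a difference to be traced back to a particular phenotypic population.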
Author summary
We introduce a novel method for analyzing the abundance of single-cell data collected by high-throughput technologies. Due to the high dimensionality of such datasets, there is a need for methods to identify significant interactions between genes or proteins. In particular, we are interested in statistical differences between correlations of proteins within populations of cells determined by traditional immunophenotyping techniques. In this paper, we demonstrate the utility of this new framework on blood samples from individuals with different subtypes of leukemia, compared with healthy controls. The motivation for this application is that such differences can illuminate potential drivers of the disease. We illustrate with an example how intracellular events can be detected by our statistic. Finally, this method is flexible in that it can also be applied in a variety of contexts where vectors of correlation coefficients need to be compared.
Competing Interest Statement
The authors have declared no competing interest.
Keywords
acute myeloid leukemia,single cells,subpopulations