UVAE: Integration of Heterogeneous Unpaired Data with Imbalanced Classes

bioRxiv (2023)

Abstract
We introduce the Unbiasing Variational Autoencoder (UVAE), a novel computational framework developed for the integration of unpaired biomedical data streams, with a particular focus on clinical flow cytometry. UVAE addresses the challenges of batch effect correction and data alignment by training a semi-supervised model on partially labeled datasets. This approach enables the simultaneous normalisation and integration of diverse data within a shared latent space. The framework is implemented in Python with a descriptive interface for the specification and incorporation of multiple, partially overlapping data series. UVAE employs a probabilistic model for batch effect normalisation, with a generative capacity for unbiased data reconstruction and inference from heterogeneous samples. Its training process balances class contents at several stages, so that imbalanced classes are represented accurately in statistical analyses. The model converges through a stable, non-adversarial training mechanism, complemented by automated selection of hyper-parameters via Bayesian optimisation. We quantitatively validate the performance of UVAE's constituent components and subsequently apply it to the real problem of integrating heterogeneous clinical flow cytometry data collected from COVID-19 patients. We show that the alignment process enhances the statistical signal of cell types associated with severity and enables clustering of subpopulations without the impediment of batch effects. Finally, we demonstrate that homogeneous data generated by UVAE can be used to improve the performance of longitudinal regression for predicting peak disease severity from temporal patient samples.

Competing Interest Statement
The authors have declared no competing interest.
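
The abstract mentions balancing class contents during training but does not say how this is done. As a rough illustration of the general idea only, the sketch below resamples a labeled set so that every class contributes the same number of items. The function name `balanced_resample`, the `n_per_class` parameter, and the toy cell-type labels are all hypothetical; this is not UVAE's interface or its actual balancing procedure, and it assumes every sample carries a label.

```python
import random
from collections import Counter, defaultdict


def balanced_resample(labels, n_per_class, seed=0):
    """Return sample indices with exactly n_per_class draws per class.

    Hypothetical helper, not part of UVAE. Draws are made with
    replacement, so rare classes are oversampled and common classes
    are undersampled.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw the same number of indices from every class.
    picked = []
    for label in sorted(by_class):
        picked.extend(rng.choices(by_class[label], k=n_per_class))

    rng.shuffle(picked)
    return picked


# Toy, made-up labels with a strong class imbalance (90 / 9 / 1).
labels = ["T cell"] * 90 + ["B cell"] * 9 + ["NK cell"] * 1
indices = balanced_resample(labels, n_per_class=50)
counts = Counter(labels[i] for i in indices)
```

After resampling, `counts` holds 50 draws for each of the three classes, so a downstream statistic or training batch is no longer dominated by the most abundant class. Sampling with replacement is one simple choice; it repeats rare items rather than creating new ones.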