Tissue of origin detection for cancer tumor using low-depth cfDNA samples through combination of tumor-specific methylation atlas and genome-wide methylation density in graph convolutional neural networks

bioRxiv (2023)

Abstract
Background: Cell-free DNA (cfDNA)-based assays hold great potential for detecting early cancer signals, yet determining the tissue of origin (TOO) of those signals remains challenging. Here, we investigated the contribution of a methylation atlas to TOO detection in low-depth cfDNA samples.

Methods: We constructed a tumor-specific methylation atlas (TSMA) using whole-genome bisulfite sequencing (WGBS) data from five types of tumor tissue (breast, colorectal, gastric, liver and lung cancer) and paired white blood cells (WBC). The TSMA was used with a non-negative least squares (NNLS) matrix factorization deconvolution algorithm to estimate the abundance of each tumor tissue type in a WGBS sample. The TSMA worked well with tumor tissue but struggled with cfDNA samples because of the overwhelming amount of WBC-derived DNA. To construct a model for TOO, we adopted a multi-modal strategy, using as inputs the deconvolution scores from the TSMA combined with other cfDNA features.

Results: Our final model comprised a graph convolutional neural network using deconvolution scores and genome-wide methylation density features, and achieved an accuracy of 69% in a held-out validation dataset of 239 low-depth cfDNA samples.

Conclusions: We have demonstrated that our TSMA, in combination with other cfDNA features, can improve TOO detection in low-depth cfDNA samples.

Competing Interest Statement: THN, NHN, HG, LST, MDP receive compensation and have an equity interest in Gene Solutions. NNTD, THT, VTCN, THHN, GTHN are employees of Gene Solutions. The authors ensure that this does not alter the accuracy or integrity of the manuscript. The study was funded by Gene Solutions. The sponsor has no role in the analysis of the data and the preparation of the manuscript.
* cfDNA: cell-free DNA
* TOO: tissue of origin
* TSMA: tumor-specific methylation atlas
* WBC: white blood cells
* NNLS: non-negative least squares matrix factorization
* GCNN: graph convolutional neural network
* GWMD: genome-wide methylation density
* TMD: targeted region methylation density
* GWFP: genome-wide fragmentation profile
* EM: end-motif
* CNA: copy number aberration
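
The NNLS deconvolution step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the atlas values here are synthetic random numbers, the function name `deconvolve` and the 200-region atlas size are hypothetical, and `scipy.optimize.nnls` is assumed as a stand-in solver. The idea is to model a sample's per-region methylation as a non-negative mixture of the atlas columns (five tumor types plus WBC) and read the normalized coefficients as tissue fractions.

```python
import numpy as np
from scipy.optimize import nnls

# Tissue types named in the abstract: five tumor types plus white blood cells.
TISSUES = ["breast", "colorectal", "gastric", "liver", "lung", "WBC"]


def deconvolve(atlas: np.ndarray, sample: np.ndarray) -> np.ndarray:
    """Estimate tissue fractions for one sample (hypothetical helper).

    atlas  : (n_regions, n_tissues) reference methylation level per region and tissue
    sample : (n_regions,) observed methylation level per region
    """
    # Solve min ||atlas @ x - sample||_2 subject to x >= 0.
    coef, _residual_norm = nnls(atlas, sample)
    total = coef.sum()
    # Normalize so the coefficients can be read as fractions.
    return coef / total if total > 0 else coef


# Synthetic atlas (assumption): 200 regions with random methylation levels.
rng = np.random.default_rng(0)
atlas = rng.uniform(0.0, 1.0, size=(200, len(TISSUES)))

# A cfDNA-like mixture dominated by WBC-derived DNA, with a small colorectal signal.
true_frac = np.array([0.0, 0.05, 0.0, 0.0, 0.0, 0.95])
sample = atlas @ true_frac

est = deconvolve(atlas, sample)
for name, frac in zip(TISSUES, est):
    print(f"{name:>10s}: {frac:.3f}")
```

In this noise-free toy case the fractions are recovered closely. With real low-depth cfDNA the per-region methylation estimates are noisy and the tumor fraction is small relative to WBC, which is consistent with the abstract's observation that the atlas alone struggled on cfDNA and was combined with other features.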