Interpretable Online Network Dictionary Learning for Inferring Long-Range Chromatin Interactions
arXiv (2023)
Abstract
Dictionary learning (DL) is commonly used in computational biology to tackle
ubiquitous clustering problems due to its conceptual simplicity and relatively
low computational complexity. However, DL algorithms produce results that lack
interpretability and are not optimized for large-scale graph-structured data.
We propose a novel DL algorithm called online convex network dictionary
learning (online cvxNDL) that can handle extremely large datasets and enables
the interpretation of dictionary elements, which serve as cluster
representatives, through convex combinations of real measurements. Moreover,
the algorithm can be applied to network-structured data via specialized
subnetwork sampling techniques.
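As a minimal illustration of what "convex combinations of real measurements" means, the sketch below builds one dictionary element from a few observed samples. The function name `convex_combination`, the toy vectors, and the weights are hypothetical and chosen only for illustration; this is not the online cvxNDL update itself, only the constraint that makes its dictionary elements interpretable.

```python
def convex_combination(samples, weights, tol=1e-9):
    """Return sum_i weights[i] * samples[i], checking that the weights
    are nonnegative and sum to one (i.e., lie on the probability simplex)."""
    if len(samples) != len(weights):
        raise ValueError("need exactly one weight per sample")
    if any(w < -tol for w in weights):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    dim = len(samples[0])
    return [sum(w * s[j] for w, s in zip(weights, samples)) for j in range(dim)]


# Hypothetical toy data: three real measurements (e.g., flattened subnetwork
# adjacency patterns) acting as representatives of one cluster.
representatives = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
weights = [0.5, 0.25, 0.25]

# The dictionary element lies inside the convex hull of the representatives,
# so it can be read directly as a weighted mix of observed samples.
element = convex_combination(representatives, weights)
```

Because the weights are constrained to the simplex, each entry of `element` stays within the range spanned by the real measurements, which is what allows a dictionary element to be traced back to the data that define it.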
To demonstrate the utility of our approach, we apply online cvxNDL to 3D-genome
RNAPII ChIA-Drop data to identify important long-range interaction patterns.
ChIA-Drop probes higher-order interactions and produces hypergraphs whose
nodes represent genomic fragments and whose hyperedges represent observed physical
contacts. Our hypergraph model analysis creates an interpretable dictionary of
long-range interaction patterns that accurately represent global chromatin
physical contact maps. Using dictionary information, one can also associate the
contact maps with RNA transcripts and infer cellular functions.
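To make the link between hypergraphs and contact maps concrete, the sketch below converts a small set of hyperedges into a pairwise contact matrix by clique expansion, where every pair of fragments sharing a hyperedge receives one contact count. Clique expansion is an assumption made here for illustration and may differ from the preprocessing actually used; the function name `clique_expand` and the toy hyperedges are hypothetical.

```python
from itertools import combinations


def clique_expand(hyperedges, num_nodes):
    """Build a symmetric contact-count matrix from hyperedges.

    Each hyperedge is an iterable of node indices (genomic fragments);
    every unordered pair of distinct nodes in it adds one contact.
    """
    contact = [[0] * num_nodes for _ in range(num_nodes)]
    for edge in hyperedges:
        for u, v in combinations(sorted(set(edge)), 2):
            contact[u][v] += 1
            contact[v][u] += 1
    return contact


# Hypothetical toy hypergraph on four genomic fragments (nodes 0..3).
hyperedges = [(0, 1, 2), (1, 2), (2, 3)]
contact_map = clique_expand(hyperedges, num_nodes=4)
```

In this toy example, fragments 1 and 2 co-occur in two hyperedges, so their entry in `contact_map` is 2, while all other co-occurring pairs have count 1.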
Our results offer two key insights. First, we demonstrate that online cvxNDL
retains the accuracy of classical DL methods while simultaneously ensuring
unique interpretability and scalability. Second, we identify distinct
collections of proximal and distal interaction patterns involving chromatin
elements shared by related processes across different chromosomes, as well as
patterns unique to specific chromosomes. To associate the dictionary elements
with biological properties of the corresponding chromatin regions, we employ
Gene Ontology enrichment analysis and perform RNA coexpression studies.