CParty: Conditional partition function for density-2 RNA pseudoknots

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract

RNA molecules fold into biologically important functional structures. Efficient dynamic programming algorithms for RNA (secondary) structure prediction restrict the search space to evade the NP-hardness of general pseudoknot prediction. While such prediction algorithms can be extended to provide a stochastic view of RNA ensembles, they are either limited to pseudoknot-free structures or extremely complex. To overcome this dilemma, we follow the hierarchical folding hypothesis, i.e., the biophysically well-motivated assumption that non-crossing structures fold relatively fast, prior to the formation of pseudoknot interactions. Thus, given a non-crossing structure G, we efficiently compute the conditional partition function (CPF) over a subset of pseudoknotted structures, namely the density-2 structures G ∪ G′ for non-crossing G′ disjoint from G. Notably, this enables sampling from the hierarchical distribution P(G′ | G). As our main contribution, we devise the algorithm CParty, which transfers the dynamic programming scheme of HFold (which minimizes the free energy of pseudoknots in a realistic model) to a partition function variant by de-ambiguating, for the first time, its decomposition of density-2 structures. Compared to the only other available pseudoknot partition function algorithm, which covers simple pseudoknots, our method covers a much larger structure class; at the same time, it is significantly more efficient, reducing both the time and the space complexity by a quadratic factor. In summary, we provide a highly efficient, cubic-time algorithm for the stochastic analysis of pseudoknotted RNAs, which enables novel applications. For example, we discuss how the CPF for a pseudoknotted therapeutic target in SARS-CoV-2 provides insights into the kinetic paths of RNA structure formation.
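To make the notion of a conditional partition function concrete, the following minimal sketch computes Boltzmann weights over a small, explicitly enumerated set of candidate structures G′ for a fixed G, and then samples from P(G′ | G). It is not the CParty algorithm, which avoids explicit enumeration through dynamic programming. The function name `conditional_distribution`, the candidate labels, and the toy free energies are all hypothetical and chosen only for illustration.

```python
import math
import random

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def conditional_distribution(energies, temperature_k=310.15):
    """Return (Z, P) for a fixed G.

    `energies` maps a label for each candidate G' to the free energy
    (kcal/mol) of the joint structure G ∪ G'. Z is the sum of Boltzmann
    weights exp(-E / RT); P maps each label to weight / Z.
    """
    rt = GAS_CONSTANT * temperature_k
    # Shift by the minimum energy for numerical stability.
    e_min = min(energies.values())
    weights = {g: math.exp(-(e - e_min) / rt) for g, e in energies.items()}
    z_shifted = sum(weights.values())
    probs = {g: w / z_shifted for g, w in weights.items()}
    z = z_shifted * math.exp(-e_min / rt)
    return z, probs


# Hypothetical toy energies for three candidate G' given some fixed G.
toy_energies = {"empty": -10.0, "pk_a": -12.5, "pk_b": -11.0}
z, probs = conditional_distribution(toy_energies)

# Draw five samples from the conditional distribution P(G' | G).
rng = random.Random(0)
samples = rng.choices(list(probs), weights=list(probs.values()), k=5)
```

In this toy setting the candidate with the lowest free energy receives the largest probability, and the probabilities sum to one by construction. Explicit enumeration is feasible only for tiny candidate sets, which is why a dynamic programming scheme is needed in practice.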
2012 ACM Subject Classification: Applied computing → Computational biology; Theory of computation → Dynamic programming
Digital Object Identifier: 10.4230/LIPIcs.WABI.2023.23
Funding: Hosna Jabbari: NSERC DG, Microsoft AI for Health
Keywords: RNA, conditional partition function
