RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics

BIOINFORMATICS(2022)

引用 2|浏览16
暂无评分
摘要
Motivation: Density Peaks is a widely spread clustering algorithm that has been previously applied to Molecular Dynamics (MD) simulations. Its conception of cluster centers as elements displaying both a high density of neighbors and a large distance to other elements of high density, particularly fits the nature of a geometrical converged MD simulation. Despite its theoretical convenience, implementations of Density Peaks carry a quadratic memory complexity that only permits the analysis of relatively short trajectories. Results: Here, we describe DP+, an exact novel implementation of Density Peaks that drastically reduces the RAM consumption in comparison to the scarcely available alternatives designed for MD. Based on DP+, we developed RCDPeaks, a refined variant of the original Density Peaks algorithm. Through the use of DP+, RCDPeaks was able to cluster a one-million frames trajectory using less than 4.5 GB of RAM, a task that would have taken more than 2 TB and about 3x more time with the fastest and less memory-hunger alternative currently available. Other key features of RCDPeaks include the automatic selection of parameters, the screening of center candidates and the geometrical refining of returned clusters. Availability and implementation: The source code and documentation of RCDPeaks are free and publicly available on GitHub (https://github.com/LQCT/RCDPeaks.git). Contact: roy_gonzalez@fq.uh.cu or daniel.platero@fq.uh.cu Supplementary information: Supplementary data are available at Bioinformatics online.
更多
查看译文
关键词
molecular,clustering,dynamics,density,memory-efficient
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要