Enhanced sampling of robust molecular datasets with uncertainty-based collective variables
CoRR (2024)
Abstract
Generating a data set that is representative of the accessible configuration
space of a molecular system is crucial for the robustness of machine-learned
interatomic potentials (MLIPs). However, the complexity of molecular systems,
characterized by intricate potential energy surfaces (PESs) with numerous local
minima and energy barriers, presents a significant challenge. Traditional
methods of data generation, such as random sampling or exhaustive exploration,
are either intractable or prone to missing rare but highly informative
configurations. In this study, we propose a method that leverages uncertainty
as the collective variable (CV) to guide the acquisition of chemically relevant
data points, focusing on regions of the configuration space where ML model
predictions are most uncertain. This approach employs a Gaussian Mixture
Model-based uncertainty metric from a single model as the CV for biased
molecular dynamics simulations. The effectiveness of our approach in overcoming
energy barriers and exploring unseen energy minima, thereby enhancing the data
set in an active learning framework, is demonstrated on the alanine dipeptide
benchmark system.
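
The abstract does not spell out how the Gaussian Mixture Model-based uncertainty is computed, so the following is only a minimal sketch under stated assumptions: the uncertainty of a configuration is taken to be the negative log-likelihood of its feature vector under a diagonal-covariance mixture that has already been fitted to the training data. The function name `gmm_nll`, the two-dimensional feature space, and all mixture parameters are hypothetical and chosen for illustration only.

```python
import math


def gmm_nll(x, weights, means, variances):
    """Negative log-likelihood of point x under a diagonal-covariance
    Gaussian mixture (assumed stand-in for the uncertainty metric)."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log of (mixture weight * diagonal Gaussian density) for one component.
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


# Hypothetical mixture describing two already-sampled regions of feature space.
weights = [0.5, 0.5]
means = [[-1.0, 0.0], [1.0, 0.0]]
variances = [[0.1, 0.1], [0.1, 0.1]]

seen = gmm_nll([-1.0, 0.0], weights, means, variances)    # inside the training data
unseen = gmm_nll([0.0, 3.0], weights, means, variances)   # far from the training data
print(f"uncertainty (seen):   {seen:.3f}")
print(f"uncertainty (unseen): {unseen:.3f}")
```

Under these assumptions, a configuration far from every mixture component receives a larger value than one close to the training data. A biased simulation that uses this quantity as its CV would therefore be pushed toward regions the model has not yet seen, which is the behavior the abstract describes.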