Tanimoto Random Features for Scalable Molecular Machine Learning

NeurIPS (2023)

Abstract
The Tanimoto coefficient is commonly used to measure the similarity between molecules represented as discrete fingerprints, either as a distance metric or a positive definite kernel. While many kernel methods can be accelerated using random feature approximations, at present there is a lack of such approximations for the Tanimoto kernel. In this paper we propose two kinds of novel random features to allow this kernel to scale to large datasets, and in the process discover a novel extension of the kernel to real vectors. We theoretically characterize these random features, and provide error bounds on the spectral norm of the Gram matrix. Experimentally, we show that the random features proposed in this work are effective at approximating the Tanimoto coefficient in real-world datasets and that the kernels explored in this work are useful for molecular property prediction and optimization tasks.
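For readers unfamiliar with the quantity, the following is a minimal sketch of the standard Tanimoto coefficient on binary fingerprints: the number of bits set in both vectors divided by the number set in either. It is not the random-feature construction proposed in the paper. The function names `tanimoto` and `gram_matrix` and the all-zero convention are assumptions made for illustration.

```python
def tanimoto(x, y):
    """Tanimoto coefficient between two equal-length 0/1 fingerprints."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints (intersection size).
    both = sum(1 for a, b in zip(x, y) if a and b)
    # Bits set in at least one fingerprint (union size).
    either = sum(1 for a, b in zip(x, y) if a or b)
    # Assumed convention: two all-zero fingerprints count as identical.
    return 1.0 if either == 0 else both / either


def gram_matrix(fingerprints):
    """Exact kernel (Gram) matrix: one Tanimoto evaluation per pair."""
    return [[tanimoto(a, b) for b in fingerprints] for a in fingerprints]


if __name__ == "__main__":
    fps = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 1, 0]]
    for row in gram_matrix(fps):
        print(row)
```

The exact Gram matrix for n molecules needs on the order of n² pairwise evaluations, which is the cost that random feature approximations are meant to reduce.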
Keywords
machine learning, molecular, features, scalable