Data driven analysis of aromatase inhibitors through machine learning, database mining and library generation

CHEMICAL PHYSICS (2024)

Abstract
Designing novel drugs with data-driven and virtual screening approaches is an active research topic in the pharmaceutical industry. Machine learning (ML) and data mining have recently emerged as useful tools for finding potent compounds and predicting their biological activities. In this study, data were collected from academic research articles to train ML models. Molecular descriptors were used to train more than forty ML models. The two best models (Decision Tree regressor and Extra Tree regressor) were selected on the basis of statistical parameters, and their hyperparameters were optimized to identify compounds with high pIC50 values as aromatase inhibitors. A database of more than 5000 compounds was extracted from PubChem, and the best ML model was used to predict their aromatase inhibition values. The top three compounds from the database were selected as references, and new compounds were designed from them using the library enumeration methodology. The two best ML models were able to accurately predict the aromatase inhibition values of the compounds in our database. In conclusion, our study shows that data-driven and virtual screening approaches based on machine learning and data mining can be used to design novel drug candidates, specifically aromatase inhibitors. The results have the potential to contribute to pharmaceutical research and development.
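The ranking and enumeration steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard definition pIC50 = -log10(IC50 in mol/L), and every compound ID, predicted value, scaffold string and substituent in it is hypothetical. The helper names (`pic50_from_ic50_nm`, `top_k`, `enumerate_library`) are introduced here for illustration only, and the enumeration is plain string substitution with no chemical validity check.

```python
import math
from itertools import product
from typing import Dict, List


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    return 9.0 - math.log10(ic50_nm)


def top_k(predictions: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k compound IDs with the highest predicted pIC50."""
    return sorted(predictions, key=predictions.get, reverse=True)[:k]


def enumerate_library(scaffold: str, r_groups: Dict[str, List[str]]) -> List[str]:
    """Fill each placeholder in the scaffold with every combination of substituents."""
    names = list(r_groups)
    choices = [r_groups[name] for name in names]
    return [scaffold.format(**dict(zip(names, combo))) for combo in product(*choices)]


# Hypothetical model predictions (compound ID -> predicted pIC50).
predicted = {"cmpd_A": 7.2, "cmpd_B": 8.1, "cmpd_C": 6.5, "cmpd_D": 7.9}
references = top_k(predicted, k=3)

# Hypothetical SMILES-like scaffold with two attachment points.
library = enumerate_library(
    "c1cc({R1})ccc1{R2}",
    {"R1": ["F", "Cl"], "R2": ["C", "OC", "N"]},
)

if __name__ == "__main__":
    print(references)
    print(len(library), "enumerated structures")
```

In practice the enumerated strings would be validated with a cheminformatics toolkit such as RDKit and then scored by the trained regressor. Both steps are omitted here to keep the sketch dependency-free.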
Keywords
Aromatase inhibitors, Drug design, Library enumeration, Data mining, Machine learning