Prediction Of Blood-Brain Barrier Permeability Of Compounds By Fusing Resampling Strategies And Extreme Gradient Boosting

IEEE Access (2021)

Abstract
Computer-aided drug design is an efficient way to support the development of disease-related drugs. However, drugs designed against binding targets often perform well in cell and animal models but fail in humans. One main reason for this failure is that the human body has natural barriers, such as the blood-brain barrier, that block exogenous macromolecules. Efficient and accurate prediction of which drug molecules can pass the blood-brain barrier is therefore necessary when developing treatments for diseases of brain tissue. In this study, 7658 molecular structure features were extracted computationally from the SMILES strings of 2354 drug molecules. By integrating three machine learning feature selection algorithms, 33 chemical structure features with significant discriminative power were screened out and used to construct multiple discriminant models. After a comprehensive comparison, the XGBoost model was selected as the final prediction model. After data preprocessing and parameter optimization, the model achieved 95% accuracy on the training set. To verify the model's stability, we introduced an external data set, on which the model reached 96% accuracy. This study applies new resampling methods together with machine learning algorithms, and adjusts how the resampling methods are applied, to obtain new chemical features for constructing machine learning predictors. These features may contribute to drug development that integrates biological analysis with machine learning algorithms.
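The abstract does not say which resampling methods were used, so the following is only a minimal sketch of the general idea behind resampling an imbalanced data set. It uses plain random oversampling as a stand-in, not the authors' method. The function name `random_oversample`, the toy descriptor values, and the class labels are all hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (illustrative stand-in only)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        # Rows that belong to the current class in the original data
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


# Hypothetical toy data: 6 permeable (1) and 2 non-permeable (0) compounds,
# each described by two made-up numeric descriptors.
X = [[0.1, 1.2], [0.3, 0.9], [0.2, 1.1], [0.4, 1.0],
     [0.5, 0.8], [0.6, 1.3], [2.1, 3.0], [2.4, 2.8]]
y = [1, 1, 1, 1, 1, 1, 0, 0]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In a real pipeline, resampling of this kind would normally be applied to the training split only, so that duplicated rows cannot leak into the data used for evaluation. The balanced rows would then be passed to the chosen classifier.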
Keywords
Drugs, Feature extraction, Computational modeling, Compounds, Support vector machines, Data models, Predictive models, Blood-brain barrier, data imbalanced, machine learning, eXtreme Gradient Boosting (XGBoost), computational biology, resample methods