A Strategic Approach to Machine Learning for Material Science: How to Tackle Real-World Challenges and Avoid Pitfalls

Chemistry of Materials (2022)

Abstract
The exponential growth and success of machine learning (ML) have led to its application in all scientific domains, including material science. Advances in experimental techniques have increased the volume of material science data, encouraging material scientists to investigate data-driven solutions to scientific problems. While the resources available for getting started with ML are ever increasing, there is little literature on navigating the space of decisions that must be made to implement a robust and trustworthy ML solution. Without such resources, researchers are left wading through articles and papers to determine the best approach for their problem, and they sometimes fall prey to pitfalls in real-world scenarios. This paper aims to act as a guide for researchers who want to strategically approach an ML solution to their problem through the use of domain knowledge and systematic evaluation of the major aspects of an ML pipeline. We focus on four aspects of the ML pipeline: (1) problem formulation, (2) data curation, (3) feature representation and model selection, and (4) model generalizability and real-world performance. In each case, we discuss the space of decisions, provide examples from the scientific literature, and illustrate how different choices can affect the outcome through a case study of predicting the compressive strength of uniaxially pressed samples of the molecular solid 2,4,6-triamino-1,3,5-trinitrobenzene (TATB). Using a similar approach of critical thinking, along with rigorous evaluation and diagnostics, researchers can be assured of the reliability of predictions from their ML models.