An Industrial Approach to Using Artificial Intelligence and Natural Language Processing for Accelerated Document Preparation in Drug Development

Journal of Pharmaceutical Innovation (2020)

Abstract
Purpose: Because scientific regulatory reporting demands exceptionally high standards of accuracy and data integrity, any tool that aims to streamline this process must be at least as efficient at gathering data as a team of scientists, without incurring higher cost in time or resources. For this reason, an artificial intelligence-based tool with parallel search, document creation, and data integrity review capabilities is being investigated as a potential solution. This paper describes a proof-of-concept project to develop an AI-based tool to rapidly assemble an end-of-phase 2 (EOP2) briefing document for a potential medicine. We have called the tool an Intelligent Machine for Document Preparation, or IMDP.

Methods: A training corpus of approximately 65,000 PDF documents derived from electronic lab notebooks and technical reports related to five molecules (including Merestinib) was ingested, and prior EOP2 documents from the remaining four molecules were used to generate training questions and answers. Then, an annotation-light natural language processing algorithm analyzed a set of structured and unstructured data regarding Merestinib. A simple user interface was created that allows scientists to query the system in natural language, and table builder, image/plot finder, and free-text addition features were added to allow for advanced search without dependence on keywords.

Results: Three significant innovations were designed in to improve overall performance compared with our benchmark solution without sacrificing usability. First, the AI-based IMDP was built to improve accuracy and accelerate document creation with a remarkably low amount of training. Second, image search capability was added to enrich the knowledge base. Third, the IMDP was integrated with the existing process rather than adding a step to the workflow. Finally, accuracy and total document creation time were compared with those of the existing (benchmark) tool. Our experiments show that the AI-based technology reached 89% accuracy, surpassing the internal benchmark of 54%, and retrieved the right information 3.6 times faster.

Conclusions: The main contribution of this study is to show the value of artificial intelligence-based tools in accelerating all major stages of regulatory report creation while allowing a team of scientists to collaborate seamlessly.
Keywords
Artificial intelligence, Natural language processing, Pharmaceutical development, Unstructured data search, Image search, Documentation preparation, Image analysis