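Automated Extraction of Structured Information from Traditional Chinese Medicine Clinical Randomized Controlled Trial Literature and Evaluation of Information Quality
China Medical Herald (2023)
Beijing University of Chinese Medicine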
Abstract
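Objective: To improve the utilization of data in the literature on clinical randomized controlled trials (RCTs) of traditional Chinese medicine (TCM), this study automatically extracted the structured information present in the included articles and evaluated the extracted information. Methods: TCM clinical RCT articles on six conditions (diabetes, rheumatoid arthritis, obesity, knee osteoarthritis, infantile diarrhea, and colorectal cancer) published between January 1986 and December 2020 were retrieved from CNKI, the Wanfang database, and VIP and then sorted; 5,506 articles were randomly included. Optical character recognition (OCR) was applied to articles in Portable Document Format (PDF) to convert them to plain text, and regular expressions were used to extract information from the text. The extracted information was evaluated on two measures: extraction rate and accuracy. Results: For the nine fields "materials", "methods", "total number of trial participants", "trial participant age", "number of trial participant cases", "treatment duration in days", "exclusion criteria", "inclusion criteria", and "funding", the extraction rates were 96.60%, 93.30%, 92.60%, 42.23%, 28.29%, 80.20%, 62.60%, 46.00%, and 21.10%, respectively, and the accuracies were 97.9%, 98.9%, 89.7%, 100.0%, 100.0%, 94.5%, 97.3%, 89.0%, and 94.7%, respectively. Conclusion: The structured information in TCM clinical RCT articles can be identified and checked for completeness automatically, and the extracted structured information can provide data support for building a TCM clinical RCT network system. On this basis, a proposal for structured writing of TCM clinical RCT articles is put forward.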
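The regular-expression step and the two evaluation measures can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not publish the study's patterns, so the three expressions, the field names, the sample sentences, and the helper names `extract_fields`, `extraction_rate`, and `accuracy` are hypothetical. The sketch also assumes that the text has already been produced by OCR, that the extraction rate is the share of articles in which a field was found, and that accuracy is the share of extracted values matching a manually checked value.

```python
import re

# Hypothetical patterns for three of the nine fields; the study's real
# expressions are not given in the abstract.
PATTERNS = {
    "total_participants": re.compile(r"(?:纳入|选取)[^。]*?(\d+)\s*例"),
    "treatment_days": re.compile(r"疗程\s*(?:为|共)?\s*(\d+)\s*(?:天|d)"),
    "funding": re.compile(r"基金项目[::]\s*([^。\n]+)"),
}


def extract_fields(text):
    """Return the first match for each field, or None when nothing matches."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


def extraction_rate(records, field):
    """Assumed definition: share of articles in which the field was found."""
    hits = sum(1 for record in records if record.get(field) is not None)
    return hits / len(records)


def accuracy(records, gold, field):
    """Assumed definition: share of extracted values equal to the checked value."""
    pairs = [
        (record[field], checked[field])
        for record, checked in zip(records, gold)
        if record.get(field) is not None
    ]
    if not pairs:
        return 0.0
    return sum(1 for found, truth in pairs if found == truth) / len(pairs)


# Two made-up sentences standing in for OCR output.
texts = [
    "选取2018年1月至2019年12月收治的糖尿病患者120例,疗程为28天。基金项目:示例基金。",
    "纳入患者60例,治疗4周。",
]
records = [extract_fields(text) for text in texts]

# Made-up manual check; the second value deliberately disagrees.
gold = [{"total_participants": "120"}, {"total_participants": "62"}]

print(records[0])
print(extraction_rate(records, "funding"))
print(accuracy(records, gold, "total_participants"))
```

Under these assumptions a field such as "funding" can combine a low extraction rate with a high accuracy, because articles with no match are excluded from the accuracy denominator.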
Key words
Traditional Chinese medicine, Randomized controlled trial, Optical character recognition, Structured, Scientific writing