ePhenotyping for Abdominal Aortic Aneurysm in the Electronic Medical Records and Genomics (eMERGE) Network: Algorithm Development and Konstanz Information Miner Workflow

International Journal of Biomedical Data Mining (2015)

Abstract
Background and objective: We designed an algorithm to identify abdominal aortic aneurysm cases and controls from electronic health records, to be shared and executed within the “electronic Medical Records and Genomics” (eMERGE) Network. Materials and methods: Structured Query Language (SQL) was used to script the algorithm, which uses “Current Procedural Terminology” and “International Classification of Diseases” codes together with demographic and encounter data to classify individuals as case, control, or excluded. The algorithm was validated using blinded manual chart review at three eMERGE Network sites and one non-eMERGE Network site. Validation comprised evaluation of an equal number of predicted cases and controls selected at random from the algorithm predictions. After validation at the three eMERGE Network sites, the remaining eMERGE Network sites performed verification only. Finally, the algorithm was implemented as a workflow in the Konstanz Information Miner, which represented the logic graphically while retaining intermediate data for inspection at each node. The algorithm was configured to be independent of specific access to data and was exportable (without data) to other sites. Results: The algorithm demonstrated positive predictive values (PPV) of 92.8% (CI: 86.8-96.7) for cases and 100% (CI: 97.0-100) for controls. It also performed well outside the eMERGE Network. Implementing the transportable, executable algorithm as a Konstanz Information Miner workflow required much less effort than implementing it from pseudocode, and ensured that the logic was as intended. Discussion and conclusion: This ePhenotyping algorithm identifies abdominal aortic aneurysm cases and controls from the electronic health record with the high case and control PPV necessary for research purposes, can be disseminated easily, and can be applied to high-throughput genetic and other studies.
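The PPV figures in the abstract come from comparing algorithm predictions against manual chart review. The following is a minimal Python sketch of how a PPV and a confidence interval can be computed from such a review. The abstract does not state the chart-review counts or the interval method, so the function name `ppv_with_wilson_ci`, the Wilson score interval, and the counts in the usage lines are all assumptions made for illustration and are not taken from the paper.

```python
import math


def ppv_with_wilson_ci(confirmed, predicted, z=1.96):
    """Return (ppv, lower, upper) for a chart-review sample.

    confirmed: predicted cases (or controls) confirmed by chart review
    predicted: total predicted cases (or controls) that were reviewed
    z: normal quantile; 1.96 gives an approximate 95% interval

    The Wilson score interval is an assumption here; the paper may
    have used a different interval method.
    """
    if predicted <= 0 or not 0 <= confirmed <= predicted:
        raise ValueError("need 0 <= confirmed <= predicted and predicted > 0")
    p = confirmed / predicted
    denom = 1 + z * z / predicted
    center = (p + z * z / (2 * predicted)) / denom
    half = z * math.sqrt(
        p * (1 - p) / predicted + z * z / (4 * predicted * predicted)
    ) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, NOT the study's data.
ppv, lower, upper = ppv_with_wilson_ci(confirmed=90, predicted=100)
print(f"PPV = {ppv:.1%}, 95% CI: {lower:.1%} to {upper:.1%}")
```

The Wilson interval is used in this sketch because it stays within 0 to 100% even when every reviewed prediction is confirmed, which is the situation reported for controls.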
Keywords
artificial intelligence,case control study,omics,data mining,computational biology,bioinformatics,biomedical,medical informatics,computer science,protein sequencing,machine learning