Petagraph: A large-scale unifying knowledge graph framework for integrating biomolecular and biomedical data

Benjamin J. Stear,Taha Mohseni Ahooyi, Shubha Vasisht, Alan Simmons, Katherine Beigel,Tiffany J Callahan,Jonathan C. Silverstein,Deanne M Taylor

biorxiv(2023)

引用 0|浏览30
暂无评分
摘要
The use of biomedical knowledge graphs (BMKG) for knowledge representation and data integration has increased drastically in the past several years due to the size, diversity, and complexity of biomedical datasets and databases. Data extraction from a single dataset or database is usually not particularly challenging. However, if a scientific question must rely on analysis across multiple databases or datasets, it can often take many hours to correctly and reproducibly extract and integrate data towards effective analysis. To overcome this issue, we created Petagraph, a large-scale BMKG that integrates biomolecular data into a schema incorporating the Unified Medical Language System (UMLS). Petagraph is instantiated on the Neo4j graph platform, and to date, has fifteen integrated biomolecular datasets. The majority of the data consists of entities or relationships related to genes, animal models, human phenotypes, drugs, and chemicals. Quantitative data sets containing values from gene expression analyses, chromatin organization, and genetic analyses have also been included. By incorporating models of biomolecular data types, the datasets can be traversed with hundreds of ontologies and controlled vocabularies native to the UMLS, effectively bringing the data to the ontologies. Petagraph allows users to analyze relationships between complex multi-omics data quickly and efficiently. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
unifying knowledge petagraph framework,biomedical data,petagraph framework,large-scale
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要