
Enabling Scientific Reproducibility Through FAIR Data Management: an Ontology-Driven Deep Learning Approach in the NeuroBridge Project.

AMIA Annual Symposium Proceedings (2023)

Pennsylvania State University

Abstract
Scientific reproducibility that effectively leverages existing study data is critical to the advancement of research in many disciplines including neuroscience, which uses imaging and electrophysiology modalities as primary endpoints or key dependencies in studies. We are developing an integrated search platform called NeuroBridge to enable researchers to search for relevant study datasets that can be used to test a hypothesis or replicate a published finding without having to perform a difficult search from scratch, including contacting individual study authors and locating the site to download the data. In this paper, we describe the development of a metadata ontology based on the World Wide Web Consortium (W3C) PROV specifications to create a corpus of semantically annotated published papers. This annotated corpus was used in a deep learning model to support automated identification of candidate datasets related to neurocognitive assessment of subjects with drug abuse or schizophrenia using neuroimaging. We built on our previous work in the Provenance for Clinical and Health Research (ProvCaRe) project to model metadata information in the NeuroBridge ontology and used this ontology to annotate 51 articles using a Web-based tool called Inception. The Bidirectional Encoder Representations from Transformers (BERT) neural network model, which was trained using the annotated corpus, was used to classify and rank papers relevant to five research hypotheses, and the results were evaluated independently by three users for accuracy and recall. Our combined use of the NeuroBridge ontology together with the deep learning model outperforms the existing PubMed Central (PMC) search engine and manifests considerable trainability and transparency compared with typical free-text search. An initial version of the NeuroBridge portal is available at: https://neurobridges.org/.
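The abstract mentions that ranked papers were judged by three users for accuracy and recall, without giving the metric definitions. The following is only a minimal sketch of how such an evaluation could be computed, not the authors' procedure. It assumes that a paper counts as relevant when a majority of raters marks it so, and it uses precision at k as a stand-in for "accuracy". The function names, paper IDs, and judgments are all hypothetical.

```python
from collections import Counter


def majority_relevant(judgments):
    """Return the paper IDs that a strict majority of raters marked relevant.

    `judgments` is a list with one set of paper IDs per rater
    (hypothetical input format, not taken from the paper).
    """
    votes = Counter()
    for rater_set in judgments:
        votes.update(rater_set)  # each rater contributes one vote per paper
    needed = len(judgments) // 2 + 1
    return {pid for pid, n in votes.items() if n >= needed}


def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall of the top-k ranked papers against `relevant`."""
    top_k = ranked[:k]
    hits = sum(1 for pid in top_k if pid in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking for one research hypothesis and three raters' judgments.
ranked_papers = ["p1", "p2", "p3", "p4", "p5"]
rater_judgments = [{"p1", "p3"}, {"p1", "p4"}, {"p1", "p3", "p5"}]

relevant_set = majority_relevant(rater_judgments)
precision, recall = precision_recall_at_k(ranked_papers, relevant_set, k=3)
print(sorted(relevant_set), round(precision, 3), recall)
```

Running the same computation on the ranking returned by a baseline search engine for the same query would give a like-for-like comparison, under the same assumptions.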
Key words
Scientific Workflows, Computational Research, Neuronal Morphology, Data Provenance, Deep Learning

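Key points: This paper proposes an ontology-driven deep learning approach that aims to improve the reproducibility of scientific research through FAIR data management, particularly in neuroscience, using the NeuroBridge platform to enable efficient retrieval and use of study data.

Methods: The study builds a metadata ontology based on the W3C PROV specifications, uses it to semantically annotate published papers, and then trains a BERT neural network model to automatically identify datasets related to neurocognitive assessment.

Experiments: The study used the Inception tool to annotate 51 articles with the ontology and trained the BERT model on the annotated corpus. Three independent users evaluated the papers that the model classified and ranked for accuracy and recall. The results indicate that combining the NeuroBridge ontology with the deep learning model outperforms the existing PubMed Central search engine and offers good trainability and transparency. The dataset name is not explicitly stated.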