Understanding life sciences data curation practices via user research

F1000Research(2019)

Abstract
Background: Manual curation is a cornerstone of public biological data resources. However, it is a time-consuming process that urgently needs supportive technical solutions in the face of rapid data growth. Supporting scalable curation is part of the mission of the Elixir Data Platform. Thus far, we have established infrastructure capable of ingesting and aggregating text-mined outputs from multiple providers and making these available via an API. This public API is used by Europe PMC to display specific entities and relationships on full-text articles (via the SciLite application).

Methods: To ensure that the future development of this infrastructure meets the needs of curators, we carried out a user research project to identify and understand common workflow patterns and practices via an observational study. Building on these outcomes, we then devised a curator community survey to understand more specifically which entity types, sections of a paper and tools are the top priorities to address.

Results: The main challenges faced by curators included the following: a) Ways to prioritise and identify relevant papers for curation are needed, as the volume of literature is large; b) Finding specific information can prove difficult; quick ways of filtering articles based on specific entities, such as experimental methods, species and other important entities, such as genes, cell lines and tissue samples, are required; and c) Transferring information from the search/annotation tools to the various curation workflows was also challenging.

Conclusions: This study lays the foundation for identifying actionable items to orient the current infrastructure towards meeting the needs of the curation community, through improved text-mined annotation quality and coverage, other engineering solutions, and the reuse of text-mined annotations and other metadata in Europe PMC for article triage. Furthermore, this study presents an opportunity to explore customisation of triage/ranking systems to suit different curation contexts.