A Dataset for Pharmacovigilance in German, French, and Japanese: Annotating Adverse Drug Reactions across Languages
arxiv(2024)
摘要
User-generated data sources have gained significance in uncovering Adverse
Drug Reactions (ADRs), with an increasing number of discussions occurring in
the digital world. However, the existing clinical corpora predominantly revolve
around scientific articles in English. This work presents a multilingual corpus
of texts concerning ADRs gathered from diverse sources, including patient fora,
social media, and clinical reports in German, French, and Japanese. Our corpus
contains annotations covering 12 entity types, four attribute types, and 13
relation types. It contributes to the development of real-world multilingual
language models for healthcare. We provide statistics to highlight certain
challenges associated with the corpus and conduct preliminary experiments
resulting in strong baselines for extracting entities and relations between
these entities, both within and across languages.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要