Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology
CoRR (2024)
Abstract
Recent advances in AI combine large language models (LLMs) with vision
encoders, offering unprecedented technical capabilities that can be leveraged
for a wide range of healthcare applications. In radiology,
vision-language models (VLMs) perform well on
tasks such as generating radiology findings based on a patient's medical image,
or answering visual questions (e.g., 'Where are the nodules in this chest
X-ray?'). However, the clinical utility of potential applications of these
capabilities is currently underexplored. We engaged in an iterative,
multidisciplinary design process to envision clinically relevant VLM
interactions, and co-designed four VLM use concepts: Draft Report Generation,
Augmented Report Review, Visual Search and Querying, and Patient Imaging
History Highlights. We studied these concepts with 13 radiologists and
clinicians who assessed the VLM concepts as valuable, yet articulated many
design considerations. Reflecting on our findings, we discuss implications for
integrating VLM capabilities in radiology, and for healthcare AI more
generally.