Medical visual question answering

Zhibin Liao, Anton van den Hengel, Johan W. Verjans

Intelligence-Based Cardiology and Cardiac Surgery (2024)

Abstract

Visual question answering (VQA) enables computers to answer new, open-ended questions in real time given input imaging. It is a particularly challenging task within machine learning (ML), not least because it requires answering previously unseen questions about previously unseen images. In this sense, it addresses one of the core difficulties in designing ML systems to assist medical professionals: it is impossible to predict the problems they will need to solve tomorrow. Traditional ML assumes that the problem to be solved is predictable years in advance, which allows time to collect and label appropriate training data and to train a suitable model. VQA, in contrast, allows the question to be specified live, at test time. It might thus enable a clinician to ask questions about the literature based on the specifics of the patient in front of them. It might enable a radiologist to ask whether a specified feature of a recent scan is comparable to anything in the database for patients of a particular demographic, or a researcher to use the latest public health records to seek immediate insight into an emerging global health challenge. In each case, there is an opportunity to help a medical professional achieve a better outcome in real time by using ML to answer a question that was not foreseeable.

Keywords

medical visual question
