Improving Radiology Report Generation with D2-Net: When Diffusion Meets Discriminator

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Abstract
Radiology report generation (RRG) aims to automatically describe observations and provide insight into a patient’s condition from radiology images, which can greatly reduce physicians' workload while preserving the quality of medical care. Existing works use a Transformer decoder to generate reports word by word. However, unlike image captions, radiology reports are long texts containing many semantic words. Autoregressive methods, such as Transformer-based ones, accumulate errors during generation and produce unsatisfactory reports. Building on the recent success of diffusion models, we propose a novel diffusion-based paradigm for RRG that uses visual information as a condition, so that the generation process focuses on pathological features within the radiology image. In addition, we integrate a discriminator into each layer of the diffusion model to actively judge whether the generated words are meaningful. This controls the length of the predicted reports and calibrates confidence scores and token generation results, improving the quality of the generated reports. Extensive experimental results demonstrate the superiority of the proposed method. Source code is available at: https://github.com/Yuda-Jin/D-2-Net.
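To make the discriminator's two roles concrete, here is a minimal sketch of how a per-token "meaningfulness" score could both shorten a fixed-length prediction and rescale token confidences. It is an illustration under stated assumptions, not the authors' implementation: the function name `gate_tokens`, the 0.5 threshold, the rule of dropping low-scoring positions, and the multiplicative calibration are all hypothetical choices that the abstract does not specify.

```python
from typing import List, Tuple


def gate_tokens(
    tokens: List[str],
    token_conf: List[float],
    disc_scores: List[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, float]]:
    """Keep positions the discriminator judges meaningful and rescale confidence.

    Assumption: a position is "meaningful" when its discriminator score is at
    least `threshold`; its calibrated confidence is token_conf * disc_score.
    """
    if not (len(tokens) == len(token_conf) == len(disc_scores)):
        raise ValueError("all inputs must have the same length")

    kept = []
    for tok, conf, score in zip(tokens, token_conf, disc_scores):
        if score >= threshold:
            # Calibrate: down-weight tokens the discriminator is less sure about.
            kept.append((tok, conf * score))
    return kept


# Toy fixed-length prediction with two filler positions at the end.
tokens = ["no", "acute", "cardiopulmonary", "abnormality", "<pad>", "<pad>"]
token_conf = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
disc_scores = [1.0, 0.5, 1.0, 0.5, 0.0, 0.25]

gated = gate_tokens(tokens, token_conf, disc_scores)
report = " ".join(tok for tok, _ in gated)
print(report)
```

In this toy input the two filler positions fall below the threshold, so the emitted report is shorter than the fixed prediction length, which is the sense in which a discriminator can control report length.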
Key words
Radiology Reports, Medical Imaging, Image Captioning, Semantic Word, Autoregressive Method, Transformer Decoder, Computational Cost, Chest X-ray, Visual Features, Autoregressive Model, Attention Mechanism, Reversible Process, Diffusion Model, Learnable Parameters, Hidden State, Inference Time, Decoder Layer, Text Generation, Transformer Encoder, Chest X-ray Images