MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images
CoRR(2023)
Key words
Visual Question Answering, Multimodal Fusion, Image Captioning, Meta-Learning, Language Understanding