D²TV: Dual Knowledge Distillation and Target-oriented Vision Modeling for Many-to-Many Multimodal Summarization. Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, Jie Zhou. CoRR (2023).