Privileged Modality Learning via Multimodal Hallucination

IEEE Transactions on Multimedia (2024)

Abstract
Learning from multimodal data has attracted increasing interest recently. While a variety of sensory modalities can be collected for training, not all of them are always available in practical scenarios, which raises the challenge of inference with incomplete modalities. This article presents a general framework termed multimodal hallucination (MMH) that bridges the gap between ideal training scenarios and real-world deployment scenarios with incomplete modality data by transferring the complete multimodal knowledge to a hallucination network that takes incomplete modality input. Compared with modality hallucination methods that restore privileged-modality information for late fusion, the proposed framework not only helps preserve crucial cross-modal cues but also relates learning with complete modalities to learning with incomplete modalities. We further introduce two strategies, region-aware distillation and discrepancy-aware distillation, to transfer the response-based and joint-representation-based knowledge of pre-trained multimodal networks, respectively. Region-aware distillation establishes and weights knowledge-transfer pipelines between the responses of the multimodal and hallucination networks at multiple regions, which guides the hallucination network to focus on discriminative regions and avoid wasted gradients. Discrepancy-aware distillation guides the hallucination network to mimic the local inter-sample distances of the multimodal representations, which enables the hallucination network to acquire the inter-class discrimination refined by multimodal cues. Extensive experiments on multimodal action recognition and face anti-spoofing demonstrate that the proposed multimodal hallucination framework can overcome the problem of incomplete modality input in various scenes and achieves state-of-the-art performance.
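As a rough illustration of the idea of mimicking local inter-sample distances, the sketch below compares pairwise distances between a batch of teacher (multimodal) representations and the matching student (hallucination) representations, restricted to each sample's nearest neighbours in the teacher space. The abstract does not give the actual loss, so the Euclidean metric, the squared-error penalty, the neighbourhood size `k`, and the function names are all assumptions made for illustration, not the authors' formulation.

```python
import math


def pairwise_distances(feats):
    """Euclidean distance between every pair of feature vectors."""
    return [[math.dist(a, b) for b in feats] for a in feats]


def local_distance_loss(teacher_feats, student_feats, k=2):
    """Hypothetical loss: mean squared gap between teacher and student
    inter-sample distances, limited to each sample's k nearest
    neighbours as ranked in the teacher (multimodal) space."""
    t = pairwise_distances(teacher_feats)
    s = pairwise_distances(student_feats)
    n = len(teacher_feats)
    total, count = 0.0, 0
    for i in range(n):
        # "Local" is assumed to mean the k closest samples in teacher space.
        others = [j for j in range(n) if j != i]
        neighbours = sorted(others, key=lambda j: t[i][j])[:k]
        for j in neighbours:
            total += (t[i][j] - s[i][j]) ** 2
            count += 1
    return total / count if count else 0.0


# Toy batch: the student doubles every distance of the teacher.
teacher = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
student = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]]
loss = local_distance_loss(teacher, student, k=1)
```

Because only distances are compared, the teacher and student representations in this sketch need not share the same dimensionality; the loss is zero when the student reproduces the teacher's local distance structure exactly.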
Keywords
Privileged modality, incomplete modality, multimodal hallucination, knowledge distillation