Rethinking Multi-view Representation Learning via Distilled Disentangling
CVPR 2024
Abstract
Multi-view representation learning aims to derive robust representations that
are both view-consistent and view-specific from diverse data sources. This
paper presents an in-depth analysis of existing approaches in this domain,
highlighting a commonly overlooked aspect: the redundancy between
view-consistent and view-specific representations. To this end, we propose an
innovative framework for multi-view representation learning, which incorporates
a technique we term 'distilled disentangling'. Our method introduces the
concept of masked cross-view prediction, enabling the extraction of compact,
high-quality view-consistent representations from various sources without
incurring extra computational overhead. Additionally, we develop a distilled
disentangling module that efficiently filters out consistency-related
information from multi-view representations, resulting in purer view-specific
representations. This approach significantly reduces redundancy between
view-consistent and view-specific representations, enhancing the overall
efficiency of the learning process. Our empirical evaluations reveal that
higher mask ratios substantially improve the quality of view-consistent
representations. Moreover, we find that reducing the dimensionality of
view-consistent representations relative to that of view-specific
representations further refines the quality of the combined representations.
Our code is accessible at: https://github.com/Guanzhou-Ke/MRDD.
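To make the idea of masked cross-view prediction concrete, here is a minimal numerical sketch, not the authors' implementation: one view's features are masked at a high ratio, and the hidden entries are predicted from the other view (a least-squares linear map stands in for a learned predictor; all names and dimensions are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two views (feature sets) of the same samples.
n, d = 8, 16                     # samples, feature dim per view
view1 = rng.normal(size=(n, d))
view2 = rng.normal(size=(n, d))

mask_ratio = 0.75                # the paper finds high mask ratios help
mask = rng.random((n, d)) < mask_ratio   # True = masked-out entry

# Predict view2 from view1 via a simple linear map; in the actual
# method this would be a learned cross-view predictor.
W, *_ = np.linalg.lstsq(view1, view2, rcond=None)
pred = view1 @ W

# Reconstruction loss only on the masked positions: the model must
# recover hidden content of one view from the other, which pushes it
# toward view-consistent information.
loss = float(np.mean((pred[mask] - view2[mask]) ** 2))
print(f"masked cross-view prediction loss: {loss:.4f}")
```

The point of the sketch is the training signal: because the loss is computed only on masked entries of the other view, the representation is rewarded for capturing content shared across views rather than view-specific detail.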