Syntax-Guided Transformers: Elevating Compositional Generalization and Grounding in Multimodal Environments
CoRR (2023)
Abstract
Compositional generalization, the ability of intelligent models to
extrapolate their understanding of known components to novel compositions, is a
fundamental yet challenging problem in AI research, especially in multimodal
environments. In this work, we address this challenge by exploiting the
syntactic structure of language to improve compositional generalization. The
paper highlights the importance of syntactic grounding, particularly through
attention masking techniques derived from parses of the text input. We
introduce and evaluate the merits of using syntactic information in the
multimodal grounding problem. Our results on grounded compositional
generalization underscore the positive impact of dependency parsing across
diverse tasks when combined with weight sharing across the Transformer encoder
layers. The results advance the state of the art in multimodal grounding and
parameter-efficient modeling and provide insights for future research.
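The abstract describes deriving attention masks from a dependency parse of the input text. A minimal sketch of that idea — not the paper's actual implementation — is to restrict each token's attention to its syntactic head and direct children; the head indices below are a hypothetical parse, hand-written for illustration rather than produced by a real parser.

```python
import numpy as np

def dependency_attention_mask(heads):
    """Build a boolean attention mask from a dependency parse.

    heads[i] is the index of token i's syntactic head
    (the root points to itself). Token i may attend to itself,
    its head, and its direct children.
    """
    n = len(heads)
    mask = np.eye(n, dtype=bool)  # every token attends to itself
    for i, h in enumerate(heads):
        mask[i, h] = True  # child attends to head
        mask[h, i] = True  # head attends to child
    return mask

# Hypothetical parse of "the dog chased the cat":
# "the"->"dog", "dog"->"chased", "chased" is root,
# "the"->"cat", "cat"->"chased"
heads = [1, 2, 2, 4, 2]
mask = dependency_attention_mask(heads)
print(mask.astype(int))
```

In a Transformer encoder, such a mask would typically be converted to additive form (0 for allowed pairs, a large negative value for disallowed ones) and applied to the attention logits before the softmax.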
Key words
compositional generalization, transformers, grounding, multimodal environments, syntax-guided models