Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech

ICLR, 2020.

Summary:
We demonstrated that these units are far more robust to noise and domain shift than units derived from previously proposed models. These results supported the notion that semantic supervision via a discriminative, multimodal grounding objective has the potential to be more powerf...

Abstract:

In this paper, we present a method for learning discrete linguistic units by incorporating vector quantization layers into neural models of visually grounded speech. We show that our method is capable of capturing both word-level and sub-word units, depending on how it is configured. What differentiates this paper from prior work on speec...
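
To make the idea of a vector quantization layer concrete, here is a minimal sketch of its forward lookup step only: each continuous frame vector is replaced by its nearest entry in a codebook, and the index of that entry serves as the discrete unit. The function name `quantize`, the toy codebook, and the toy frames are all hypothetical and chosen for illustration. The sketch assumes squared Euclidean distance, and it does not cover how the codebook is trained or how the layer is placed inside the visually grounded speech model.

```python
from typing import List, Sequence, Tuple


def quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each frame to its nearest codebook vector.

    Returns the chosen codebook indices (the discrete units) and the
    corresponding quantized vectors. Distance is squared Euclidean,
    which is an assumption of this sketch.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for frame in frames:
        # Squared distance from this frame to every codebook entry.
        dists = [
            sum((f - c) ** 2 for f, c in zip(frame, code))
            for code in codebook
        ]
        # Index of the closest entry; ties resolve to the lowest index.
        k = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(k)
        quantized.append(list(codebook[k]))
    return indices, quantized


# Toy data (hypothetical): three 2-D codebook entries, four 2-D frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.4], [0.6, 0.6]]

indices, quantized = quantize(frames, codebook)
print(indices)
```

In a trained model the frames would be intermediate activations of the speech encoder, and the sequence of indices is what would be inspected as candidate sub-word or word-level units.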
