Cross-Situational Word Learning in Disentangled Latent Space

2023 IEEE International Conference on Development and Learning (ICDL), 2023

Abstract
Cross-situational word learning (CSL) is a fast and efficient way for humans to acquire word meanings, and many studies have replicated human CSL with computational models. Among these, cross-situational learning with a Bayesian probabilistic generative model (CSL-PGM) can estimate word meanings from observations comprising multiple attributes, such as color and shape. However, because CSL-PGM receives the observations for each attribute on a separate channel, it cannot perform CSL on images that contain multiple attributes. We therefore introduce a disentangled representation that captures the attributes within an image, and we propose CSL+VAE, which integrates CSL-PGM with a beta-VAE to obtain this disentangled representation in an unsupervised manner. CSL+VAE discovers the attributes hidden in images and word sequences and infers the meanings of words; it also obtains a more disentangled representation through a learning framework in which the two models share parameters. In experiments, the model was trained on a set of images comprising five attributes, each paired with one to five descriptive words. The model correctly estimated the attributes of 99.9% of the words and correctly identified the correspondence between image features and words. It also outperformed existing multimodal models in inferring images from word sequences, achieving an accuracy of 0.870.
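The beta-VAE component mentioned above encourages disentanglement by up-weighting the KL term of the standard VAE objective. A minimal NumPy sketch of that objective is shown below; the function name, the squared-error reconstruction term, and the default beta value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Illustrative beta-VAE objective (not the paper's implementation).

    Reconstruction error plus a beta-weighted KL divergence between the
    diagonal-Gaussian encoder posterior q(z|x) = N(mu, diag(exp(logvar)))
    and the standard normal prior N(0, I). Setting beta > 1 pressures the
    latent dimensions toward a disentangled representation.
    """
    recon = np.sum((x - x_recon) ** 2)  # squared-error reconstruction (assumed)
    # Closed-form KL for diagonal Gaussians against N(0, I)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + beta * kl
```

With mu = 0 and logvar = 0 the posterior equals the prior, so the KL term vanishes and only the reconstruction error remains; larger beta penalizes any posterior that drifts from the prior more strongly.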
Keywords
cross-situational word learning, disentanglement, variational autoencoder