Multi-Timestep-Ahead Prediction with Mixture of Experts for Embodied Question Answering

Artificial Neural Networks and Machine Learning, ICANN 2023, Part VI (2023)

Abstract
In this study, we propose a method that integrates visual field predictions at different time scales and investigate its effectiveness for embodied question answering (EQA). In EQA, the path to the target object depends on the instructions provided, so it is desirable to select the prediction time scale automatically according to the situation. However, previous studies have only investigated subtask learning with a limited prediction time scale and target. We propose a mixture-of-experts model in which multiple expert networks predict future images at different time steps, and a higher-level gating network estimates the distribution of each expert's output. By sequentially adjusting the outputs of the expert networks, the proposed method enables robot navigation that takes multi-timestep-ahead prediction into account. Comparison experiments on the EQA MP3D dataset show that the proposed method improves the prediction accuracy of the model regardless of the distance to the target.
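
The gating idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the gating network's estimate is combined with the expert outputs, so the code below assumes the standard mixture-of-experts form: a softmax over gate scores gives one weight per expert, and the final prediction is the weighted sum of the expert predictions. The function names, the flat-vector image representation, and the example horizons (1, 3, and 5 steps ahead) are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Turn raw gate scores into weights that are positive and sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_prediction(
    expert_outputs: Sequence[Sequence[float]],
    gate_logits: Sequence[float],
) -> Vector:
    """Combine expert predictions using softmax gate weights.

    expert_outputs[k] is the (flattened) future image predicted by expert k,
    where each expert is assumed to look a different number of steps ahead.
    gate_logits[k] is the gating network's raw score for expert k.
    """
    if len(expert_outputs) != len(gate_logits):
        raise ValueError("need exactly one gate score per expert")
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical experts predicting 1, 3, and 5 steps ahead (toy 2-pixel images).
    predictions = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
    # Hypothetical gate scores that favour the long-horizon expert.
    scores = [0.0, 0.0, 2.0]
    print(mixture_prediction(predictions, scores))
```

In this sketch, raising one expert's gate score shifts the combined prediction toward that expert's time scale, which is one way to read "selecting a prediction time scale according to the situation". In the actual model the gate scores would come from a learned network rather than being fixed by hand.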
Keywords
Embodied Question Answering, Mixture of Experts, Multi-step Ahead Prediction