Stacked RNNs for Encoder-Decoder Networks: Accurate Machine Understanding of Images

Semantic Scholar (2016)

Abstract
We address the image captioning task by combining a convolutional neural network (CNN) with various recurrent neural network (RNN) architectures. We train the models on over 400,000 training examples (roughly 80,000 images, with 5 captions per image) from the Microsoft COCO 2014 challenge. We demonstrate that a stacked 2-layer RNN provides better results on image captioning than both a vanilla LSTM and a vanilla RNN.
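To make the stacking idea concrete, the following is a minimal sketch of a forward pass through a 2-layer vanilla (tanh) RNN decoder conditioned on a CNN image feature. It is not the authors' implementation. The function names, the dimensions, the random weights, and the choice to initialize every layer's hidden state from a projected image feature are all assumptions made for illustration; the abstract does not specify these details. Only the Python standard library is used so the sketch stays self-contained.

```python
import math
import random

def rand_matrix(rng, rows, cols, scale=0.1):
    """Random (rows x cols) matrix as nested lists; stands in for trained weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def vec_mat(v, m):
    """Row vector v (length n) times matrix m (n x k), giving a list of length k."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def init_layer(rng, input_dim, hidden_dim):
    """Parameters of one vanilla RNN layer (hypothetical layout)."""
    return {
        "Wx": rand_matrix(rng, input_dim, hidden_dim),
        "Wh": rand_matrix(rng, hidden_dim, hidden_dim),
        "b": [0.0] * hidden_dim,
    }

def rnn_layer(xs, h0, p):
    """Run one RNN layer over a sequence and return the hidden state at each step."""
    h, hs = h0, []
    for x in xs:
        from_input = vec_mat(x, p["Wx"])
        from_state = vec_mat(h, p["Wh"])
        h = [math.tanh(a + r + b) for a, r, b in zip(from_input, from_state, p["b"])]
        hs.append(h)
    return hs

def stacked_rnn_decoder(image_feat, word_embs, layers, W_img, W_out):
    """Stacking: the hidden-state sequence of one layer is the input of the next."""
    # Assumption: the projected CNN feature initializes the hidden state of every layer.
    h0 = [math.tanh(z) for z in vec_mat(image_feat, W_img)]
    seq = word_embs
    for p in layers:
        seq = rnn_layer(seq, h0, p)
    # Map the top layer's hidden states to unnormalized vocabulary scores.
    return [vec_mat(h, W_out) for h in seq]

# Toy dimensions, chosen only to keep the example fast.
rng = random.Random(0)
feat_dim, emb_dim, hidden_dim, vocab_size, steps = 8, 4, 6, 10, 3
layers = [init_layer(rng, emb_dim, hidden_dim), init_layer(rng, hidden_dim, hidden_dim)]
W_img = rand_matrix(rng, feat_dim, hidden_dim)
W_out = rand_matrix(rng, hidden_dim, vocab_size)
image_feat = [rng.gauss(0.0, 1.0) for _ in range(feat_dim)]
word_embs = [[rng.gauss(0.0, 1.0) for _ in range(emb_dim)] for _ in range(steps)]

scores = stacked_rnn_decoder(image_feat, word_embs, layers, W_img, W_out)
print(len(scores), len(scores[0]))  # one score vector per time step: 3 10
```

Passing a single-element `layers` list reduces this to a one-layer vanilla RNN, which is the kind of baseline the abstract compares against. An LSTM variant would replace the tanh update in `rnn_layer` with gated cell updates.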