Person Re-Identification with Vision and Language

2018 24th International Conference on Pattern Recognition (ICPR) (2017)

Abstract
In this paper, we propose a new approach to person re-identification (Re-ID) that uses both images and natural language descriptions. We present a joint vision and language model, based on CCA and CNN architectures, that matches across the two modalities and enriches visual examples for which there are no language descriptions. We also introduce new annotations, in the form of natural language descriptions, for two standard Re-ID benchmarks: CUHK03 and VIPeR. We perform experiments on these two datasets with techniques based on CNNs, hand-crafted features, and LSTMs for analysing visual and natural language description data. We investigate and demonstrate the advantages of natural language descriptions over attributes, and of CNNs over LSTMs, in the context of Re-ID. We show that the joint use of language and vision can significantly improve on state-of-the-art performance on standard Re-ID benchmarks.
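The cross-modal matching idea in the abstract can be illustrated with a minimal sketch of regularised CCA, assuming that image features and description features have already been extracted. This is not the authors' implementation: the function names (`fit_cca`, `project`, `rank1_accuracy`), the regularisation value, and the synthetic features standing in for CNN and text embeddings are all assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, n_components, reg=1e-3):
    """Fit regularised CCA on paired features.

    X: (n, dx) visual features, Y: (n, dy) description features.
    Returns the means, projection matrices and canonical correlations.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularised covariance and cross-covariance estimates.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with Cholesky factors: T = Lx^-1 Sxy Ly^-T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular vectors of T give the canonical directions.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return mx, my, Wx, Wy, s[:n_components]


def project(F, mean, W):
    """Map features of one modality into the shared CCA space."""
    return (F - mean) @ W


def rank1_accuracy(query, gallery):
    """Fraction of queries whose nearest gallery item (cosine) has the same index."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sim = q @ g.T
    return float(np.mean(sim.argmax(axis=1) == np.arange(len(q))))


# Synthetic stand-ins for image and description features that share a latent identity code.
rng = np.random.default_rng(0)
A, B = rng.normal(size=(4, 10)), rng.normal(size=(4, 8))


def make_pairs(n):
    Z = rng.normal(size=(n, 4))
    X = Z @ A + 0.05 * rng.normal(size=(n, 10))
    Y = Z @ B + 0.05 * rng.normal(size=(n, 8))
    return X, Y


X_train, Y_train = make_pairs(200)
X_test, Y_test = make_pairs(20)

mx, my, Wx, Wy, corr = fit_cca(X_train, Y_train, n_components=4)
# Query with descriptions, search the image gallery.
rank1 = rank1_accuracy(project(Y_test, my, Wy), project(X_test, mx, Wx))
print("canonical correlations:", np.round(corr, 3))
print("rank-1 accuracy:", rank1)
```

In a real Re-ID setting the rows of `X` would come from a visual model and the rows of `Y` from a text encoder; the projected vectors of a gallery image could also be concatenated with its original features, which is one plausible reading of "enriching" visual examples that lack descriptions.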
Keywords
natural language descriptions, language model, CNN, natural description data, person re-identification, visual description data, Re-ID benchmarks, LSTM architectures, CUHK03, VIPeR