A Novel Deep Multi-Modal Feature Fusion Method for Celebrity Video Identification

Proceedings of the 27th ACM International Conference on Multimedia (2019)

Abstract
In this paper, we develop a novel multi-modal feature fusion method for the 2019 iQIYI Celebrity Video Identification Challenge, held in conjunction with ACM MM 2019. The goal of this challenge is to retrieve all video clips of a given identity from the testing set. Participants are encouraged to combine the multi-modal features of a celebrity, such as face, head, body, and audio features, for promising performance. Features from different modalities typically exert different influences on the results. To achieve better results, we design a novel weighted multi-modal feature fusion method to obtain the final feature representation. Through extensive experimental verification, we found that using different feature fusion weights for training and testing makes the method robust for multi-modal person identification. Experiments on the iQIYI-VID-2019 dataset show that our multi-modal feature fusion strategy effectively improves the accuracy of person identification. Specifically, in the competition, we use a single model to achieve an mAP of 0.8952, which ranks among the top 5 of all competitive results.
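The weighted fusion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weight values, feature dimensions, and per-modality L2 normalization are all assumptions made for the example; the paper only states that the modality features are combined with different weights for training and testing.

```python
import numpy as np

def fuse_features(face, head, body, audio, weights):
    """Weighted concatenation of L2-normalized per-modality features.

    Each modality vector is normalized, scaled by its weight, concatenated,
    and the fused vector is re-normalized for cosine-similarity retrieval.
    """
    feats = [face, head, body, audio]
    fused = np.concatenate(
        [w * (f / (np.linalg.norm(f) + 1e-12)) for f, w in zip(feats, weights)]
    )
    return fused / (np.linalg.norm(fused) + 1e-12)

# Hypothetical weight settings: the paper reports that using different
# fusion weights at training time and testing time improves robustness,
# but these particular values are illustrative only.
train_weights = (1.0, 0.5, 0.5, 0.3)
test_weights = (1.0, 0.3, 0.3, 0.2)

# Random stand-ins for the face/head/body/audio embeddings of one clip.
rng = np.random.default_rng(0)
face, head, body, audio = (rng.standard_normal(d) for d in (512, 256, 256, 128))

fused = fuse_features(face, head, body, audio, test_weights)
print(fused.shape)  # (1152,) = 512 + 256 + 256 + 128
```

Retrieval for a given identity then reduces to ranking clips by the cosine similarity between their fused vectors, which the final normalization makes equivalent to a dot product.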
Keywords
multi-modal feature fusion, person identification, video identification