Deep Metric Learning with Improved Triplet Loss for Face Clustering in Videos.

PCM (2016)

Abstract
Face clustering in videos aims to partition a large number of faces into a given number of clusters, such that some measure of distance is minimized within clusters and maximized between clusters. In real-world videos, head pose, facial expression, scale, illumination, occlusion and other uncontrolled factors may dramatically change the appearance of faces. In this paper, we tackle this problem by learning a non-linear metric function with a deep convolutional neural network that maps the input image to a low-dimensional feature embedding, using the visual constraints among face tracks. Our network directly optimizes the embedding space so that Euclidean distances correspond to a measure of semantic face similarity. This is technically realized by minimizing an improved triplet loss function, which pushes the negative face away from the positive pair and requires the distance of the positive pair to be less than a margin. We extensively evaluate the proposed algorithm on a set of challenging videos and demonstrate significant performance improvement over existing techniques.
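The loss described in the abstract can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the paper's exact formulation: the averaging of the two negative distances, the function name improved_triplet_loss, and the parameters push_margin, pull_margin and pull_weight are all assumptions introduced here for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def improved_triplet_loss(anchor, positive, negative,
                          push_margin=1.0, pull_margin=0.5, pull_weight=1.0):
    """Hypothetical two-term triplet loss (all defaults are assumed values).

    push term: the negative should be farther from both members of the
               positive pair than those members are from each other.
    pull term: the positive-pair distance itself should stay below a margin.
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    d_pn = euclidean(positive, negative)

    # Hinge on the gap between the positive-pair distance and the mean
    # distance from the negative to the two positive-pair members.
    push = max(0.0, d_ap - 0.5 * (d_an + d_pn) + push_margin)

    # Hinge that penalizes a positive pair lying farther apart than pull_margin.
    pull = max(0.0, d_ap - pull_margin)

    return push + pull_weight * pull


# A well-separated triplet incurs no loss; a violating one does.
easy = improved_triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
hard = improved_triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

In this sketch the second term is what distinguishes the loss from a standard triplet loss, which only constrains the relative ordering of distances and places no absolute bound on the positive-pair distance.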