Comparison of two methods for unsupervised person identification in TV shows

Content-Based Multimedia Indexing(2014)

Abstract
We address the task of identifying people appearing in TV shows. The target persons are all people whose identity is spoken or written on screen, such as journalists and well-known people (politicians, athletes, celebrities, etc.). In our approach, overlaid names displayed on the images are used to identify the persons, without any biometric models of the speakers or faces. Two identification methods are evaluated as part of the REPERE French evaluation campaign. The first relies on co-occurrence times between overlaid person names and speaker/face clusters, together with rule-based decisions that assign a name to each monomodal cluster. The second uses a Conditional Random Field (CRF) that combines different types of co-occurrence statistics and pairwise constraints to jointly identify speakers and faces.
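The co-occurrence idea behind the first method can be illustrated with a minimal sketch. The abstract does not give the paper's actual decision rules, so the single rule used here (assign each cluster the overlaid name it overlaps with for the longest total time) is an assumption made for illustration, as are the function names `overlap` and `name_clusters` and the tuple layout of the inputs.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Return the duration (in seconds) shared by two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def name_clusters(cluster_segments, overlay_names):
    """Assign each cluster the overlaid name it co-occurs with longest.

    Hypothetical rule, not the paper's rule set.
    cluster_segments: list of (cluster_id, start, end) for speaker or face clusters
    overlay_names:    list of (name, start, end) for names displayed on screen
    Returns a dict {cluster_id: name or None}.
    """
    # cooc[cluster_id][name] = total co-occurrence time in seconds
    cooc = defaultdict(lambda: defaultdict(float))
    for cid, c_start, c_end in cluster_segments:
        cooc[cid]  # register the cluster even if it never overlaps a name
        for name, n_start, n_end in overlay_names:
            duration = overlap(c_start, c_end, n_start, n_end)
            if duration > 0:
                cooc[cid][name] += duration

    # Pick the name with the largest accumulated co-occurrence time;
    # clusters with no co-occurring name stay unnamed (None).
    return {
        cid: (max(names, key=names.get) if names else None)
        for cid, names in cooc.items()
    }


# Toy example with made-up segments and names.
segments = [("spk1", 0, 10), ("spk2", 10, 20), ("spk1", 20, 30), ("spk3", 30, 35)]
names = [("Alice", 2, 6), ("Bob", 9, 15), ("Alice", 22, 25)]
assignment = name_clusters(segments, names)
```

In this toy input, `spk1` overlaps "Alice" for 7 seconds and "Bob" for 1 second, so it is named "Alice"; `spk3` never co-occurs with an overlay and stays unnamed. A cluster that is left unnamed this way is the kind of case where the joint speaker/face modeling described for the second method could help, although this sketch does not attempt it.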
Keywords
face recognition, image matching, speaker recognition, statistical analysis, video signal processing, CRF, REPERE French evaluation campaign, TV shows, biometric models, co-occurrence statistics, conditional random field, monomodal cluster, overlay person names, pairwise constraints, people identification, rule-based decisions, speaker-face clusters, television shows, unsupervised person identification