Designing Practical End-to-End System for Soft Biometric-Based Person Retrieval From Surveillance Videos
IEEE Access (2023)
Abstract
Video surveillance improves public safety by preventing and detecting criminal activity, enabling rapid response, and providing evidence to investigators. A practical way to use it is to issue a natural language query containing soft biometrics to retrieve a person from a video. State-of-the-art (SOTA) approaches focus on improving retrieval results; the building blocks of a person retrieval system therefore receive little attention, which puts novice researchers at a disadvantage. This study provides a design methodology by showing the block-by-block construction of a person retrieval system using video and natural language. For each subsystem (natural language processing, person detection, attribute recognition, and ranking), we discuss the available design choices, provide empirical evidence, and discuss bottlenecks and solutions. We then select and integrate the best choices to create an end-to-end system. We highlight the integration challenges and demonstrate that the proposed method achieves an average intersection over union and a true positive rate of >= 60%. This is the first study to provide practical guidance to researchers for fast prototyping of person retrieval with subsystem-level understanding while achieving SOTA performance.
Keywords
Person attribute recognition, detection, retrieval, soft biometrics, visual-textual problem
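
The abstract reports results as an average intersection over union (IoU) and a true positive rate. As a rough illustration of how such metrics are commonly computed for bounding boxes, the sketch below assumes boxes in (x1, y1, x2, y2) form and counts a frame as a true positive when its IoU meets a threshold. The function names and the 0.4 default threshold are assumptions made for this example, not details taken from the paper.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_iou_and_tpr(predicted, ground_truth, threshold=0.4):
    """Average IoU and fraction of frames with IoU >= threshold.

    `threshold` is an assumed value for illustration only.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need equal-length, non-empty box lists")
    scores = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    avg_iou = sum(scores) / len(scores)
    tpr = sum(1 for s in scores if s >= threshold) / len(scores)
    return avg_iou, tpr
```

Each list holds one box per evaluated frame: the retrieved person's predicted box and the annotated ground-truth box, in the same frame order.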