Toward a 3D body part detection video dataset and hand tracking benchmark

PETRA '13: Proceedings of the 6th International Conference on PErvasive Technologies Related to Assistive Environments (2013)

Citations: 13
Abstract
The purpose of this paper is twofold. First, we introduce our Microsoft Kinect-based video dataset of American Sign Language (ASL) signs designed for body part detection and tracking research. This dataset allows researchers to experiment with more than 2-dimensional (2D) color video information in gesture recognition projects, as it gives them access to scene depth information. Depth not only makes it easier to locate body parts such as hands; without it, two completely different gestures that share a similar 2D trajectory projection can be difficult to distinguish from one another. Second, because an accurate hand locator is a critical element in any automated gesture or sign language recognition tool, this paper assesses the efficacy of one popular open-source user skeleton tracker by examining its performance on random signs from the above dataset. We compare the hand positions reported by the skeleton tracker to ground-truth positions obtained from manual hand annotations of each video frame. The goal of this study is to establish a benchmark for assessing more advanced detection and tracking methods that utilize scene depth data. For illustrative purposes, we compare the results of one of the methods previously developed in our lab for detecting a single hand to this benchmark.
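To make the evaluation protocol concrete, the sketch below illustrates the kind of per-frame comparison the abstract describes: measuring the Euclidean distance between a tracker's reported 3D hand position and the manually annotated ground-truth position in each video frame, then summarizing the error. This is a minimal illustration, not the authors' code; the function name, array layout, and units are assumptions.

```python
import numpy as np

def hand_tracking_error(tracked: np.ndarray, ground_truth: np.ndarray) -> dict:
    """Summarize per-frame Euclidean error between tracked and annotated hands.

    tracked, ground_truth: (n_frames, 3) arrays of 3D hand coordinates,
    e.g. Kinect camera-space x, y, z (units assumed to be millimeters).
    """
    errors = np.linalg.norm(tracked - ground_truth, axis=1)  # one distance per frame
    return {
        "mean": float(errors.mean()),
        "median": float(np.median(errors)),
        "max": float(errors.max()),
    }

# Hypothetical usage: in practice the first array would come from the
# skeleton tracker's hand-joint output and the second from the dataset's
# manual per-frame annotations.
tracked = np.array([[10.0, 20.0, 1000.0], [12.0, 21.0, 1005.0]])
annotated = np.array([[11.0, 19.0, 998.0], [12.5, 20.0, 1002.0]])
print(hand_tracking_error(tracked, annotated))
```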
Keywords
color video information, hand position, single hand, manual hand annotation, body part detection video, video dataset, hand tracking benchmark, additional information, scene depth information, accurate hand locator, skeleton tracker, video frame, gesture recognition, tracking, Kinect