Toward a 3D body part detection video dataset and hand tracking benchmark

PETRA '13: Proceedings of the 6th International Conference on PErvasive Technologies Related to Assistive Environments (2013)

Citations: 13
Abstract
The purpose of this paper is twofold. First, we introduce our Microsoft Kinect-based video dataset of American Sign Language (ASL) signs designed for body part detection and tracking research. This dataset allows researchers to experiment with more than 2-dimensional (2D) color video information in gesture recognition projects, as it gives them access to scene depth information. Depth not only makes it easier to locate body parts such as hands; without it, two completely different gestures that share a similar 2D trajectory projection can be difficult to distinguish from one another. Second, because an accurate hand locator is a critical element in any automated gesture or sign language recognition tool, this paper assesses the efficacy of one popular open-source user skeleton tracker by examining its performance on random signs from the above dataset. We compare the hand positions reported by the skeleton tracker to ground-truth positions obtained from manual hand annotations of each video frame. The goal of this study is to establish a benchmark for assessing more advanced detection and tracking methods that utilize scene depth data. For illustrative purposes, we compare the results of one of the methods previously developed in our lab for detecting a single hand to this benchmark.
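To make the evaluation protocol concrete, the sketch below illustrates the kind of per-frame comparison the abstract describes: measuring the Euclidean distance between a tracker's reported 3D hand position and the manually annotated ground-truth position in each video frame, then summarizing the error. This is a minimal illustration, not the authors' code; the function name, array layout, and units are assumptions.

```python
import numpy as np

def hand_tracking_error(tracked: np.ndarray, ground_truth: np.ndarray) -> dict:
    """Summarize per-frame Euclidean error between tracked and annotated hands.

    tracked, ground_truth: (n_frames, 3) arrays of 3D hand coordinates,
    e.g. Kinect camera-space x, y, z (units assumed to be millimeters).
    """
    errors = np.linalg.norm(tracked - ground_truth, axis=1)  # one distance per frame
    return {
        "mean": float(errors.mean()),
        "median": float(np.median(errors)),
        "max": float(errors.max()),
    }

# Hypothetical usage: in practice the first array would come from the
# skeleton tracker's hand-joint output and the second from the dataset's
# manual per-frame annotations.
tracked = np.array([[10.0, 20.0, 1000.0], [12.0, 21.0, 1005.0]])
annotated = np.array([[11.0, 19.0, 998.0], [12.5, 20.0, 1002.0]])
print(hand_tracking_error(tracked, annotated))
```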
Keywords
color video information, hand position, single hand, manual hand annotation, body part detection video, video dataset, hand tracking benchmark, additional information, scene depth information, accurate hand locator, skeleton tracker, video frame, gesture recognition, tracking, Kinect