Synchronized Colored Petri Net Based Multimodal Modeling and Real-Time Recognition of Conversational Spatial Deictic Gestures

Lecture Notes in Networks and Systems (2023)

Abstract
Gestures are an important part of intelligent human-robot interaction. Co-speech gestures are a subclass of gestures that integrate speech and dialog with synchronous combinations of postures, haptics (touch), and motions of the head, hand, index finger or palm, and gaze. Deictic gestures are a subclass of co-speech gestures that provide spatio-temporal references to entities in the field of vision by pointing at an individual entity or a collection of entities and referring to them with pronouns in spoken phrases. Deictic gestures matter for human-robot interaction because they attract attention and establish a common frame of reference through object localization. In this research, we identify different subclasses of deictic gestures and extend the Synchronized Colored Petri net (SCP) model to recognize them. The proposed extension integrates synchronized motions of the head, hand, index finger, and palm, together with gaze (eye-motion tracking and focus), with pronoun references in speech. We describe an implementation based on video-frame analysis and gesture signatures that represent meta-level attributes of the SCP, and present a recognition algorithm. Performance analysis shows recall of approximately 85 percent for deictic gestures, and conversational head gestures are separated from deictic gestures 95 percent of the time. Results show that mislabeling of deictic gestures arises from missing frames, missing feature points, undetectable motions, and the choice of thresholds during video analysis.
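To make the synchronization constraint concrete, below is a minimal Python sketch of an SCP-style transition that fires only when tokens from several modality streams (hand motion, gaze, and a spoken pronoun) fall within a shared time window. The place names, token fields, and 0.5-second window are illustrative assumptions for exposition, not the authors' implementation.

# Hypothetical sketch of a Synchronized Colored Petri net (SCP) fragment for
# deictic-gesture recognition. Place names, token fields, and the window
# value are assumptions, not the paper's implementation.
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Token:
    modality: str      # the token's "color": hand, gaze, speech, etc.
    value: str         # detected motion label or spoken pronoun
    timestamp: float   # seconds from the start of the video/audio stream

@dataclass
class Place:
    name: str
    tokens: List[Token] = field(default_factory=list)

class DeicticTransition:
    """Fires when every input place holds a token within a shared time
    window, modeling the synchronization constraint of an SCP transition."""

    def __init__(self, inputs: Dict[str, Place], window: float = 0.5):
        self.inputs = inputs
        self.window = window  # assumed synchronization tolerance in seconds

    def enabled(self) -> List[Token]:
        # Collect the oldest token from each input place, if one exists.
        picked = []
        for place in self.inputs.values():
            if not place.tokens:
                return []
            picked.append(place.tokens[0])
        times = [t.timestamp for t in picked]
        return picked if max(times) - min(times) <= self.window else []

    def fire(self) -> Optional[dict]:
        picked = self.enabled()
        if not picked:
            return None
        for place in self.inputs.values():
            place.tokens.pop(0)  # consume one token per input place
        return {"gesture": "deictic", "evidence": picked}

if __name__ == "__main__":
    places = {m: Place(m) for m in ("hand", "gaze", "speech")}
    places["hand"].tokens.append(Token("hand", "point", 1.20))
    places["gaze"].tokens.append(Token("gaze", "fixate", 1.35))
    places["speech"].tokens.append(Token("speech", "that", 1.40))
    t = DeicticTransition(places)
    print(t.fire())  # tokens lie within 0.5 s -> deictic gesture recognized

In this sketch, recognition degrades in the ways the abstract describes: a missing token (a dropped frame or an undetected motion) or a too-tight window keeps the transition from firing, mirroring how missing frames, feature points, and threshold choices cause mislabeling.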
Keywords
multimodal, recognition, real-time