Multimodal Egocentric Analysis of Focused Interactions

IEEE Access (2018)

Abstract
Continuous detection of social interactions from wearable sensor data streams has a range of potential applications in domains including health and social care, security, and assistive technology. We contribute an annotated multimodal data set capturing such interactions using video, audio, GPS, and inertial sensing. Focused interaction occurs when co-present individuals, having a mutual focus of attention, interact by first establishing face-to-face engagement and direct conversation. We present methods for automatic detection and temporal segmentation of focused interactions using support vector machines and recurrent neural networks, with features extracted from both audio and video streams. We describe an evaluation protocol including framewise, extended framewise, and event-based measures, and provide empirical evidence that the fusion of visual face-track scores with audio voice-activity scores is effective. The methods, contributed data set, and protocol together provide a benchmark for future research on this problem. The data set is available at https://doi.org/10.15132/10000134.
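As a concrete illustration of the score-level fusion described in the abstract, the following Python sketch is a hypothetical reconstruction rather than the authors' implementation. It assumes per-frame face-track confidence scores and voice-activity scores in [0, 1], combines them with a weighted average (the weight w_face is an assumed parameter, not taken from the paper), thresholds the fused score framewise, and merges consecutive positive frames into interaction events for temporal segmentation.

    # Minimal sketch of audio-visual score fusion; all names and the
    # fusion weight are illustrative assumptions, not the paper's code.
    import numpy as np

    def fuse_scores(face_scores, vad_scores, w_face=0.5):
        # Late (score-level) fusion: weighted average of the two modalities.
        face = np.asarray(face_scores, dtype=float)
        vad = np.asarray(vad_scores, dtype=float)
        return w_face * face + (1.0 - w_face) * vad

    def extract_events(frame_labels):
        # Merge runs of consecutive positive frames into (start, end) events.
        events, start = [], None
        for i, positive in enumerate(frame_labels):
            if positive and start is None:
                start = i
            elif not positive and start is not None:
                events.append((start, i - 1))
                start = None
        if start is not None:
            events.append((start, len(frame_labels) - 1))
        return events

    # Toy example: six frames of hypothetical per-frame scores.
    face = [0.1, 0.8, 0.9, 0.7, 0.2, 0.1]
    vad = [0.2, 0.9, 0.8, 0.6, 0.3, 0.0]
    fused = fuse_scores(face, vad)
    labels = fused >= 0.5          # framewise decision
    print(extract_events(labels))  # -> [(1, 3)]

Under the paper's protocol, events extracted this way would then be scored against annotated ground truth using the framewise, extended framewise, and event-based measures.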
Keywords
Social interaction, egocentric sensing, multimodal analysis, temporal segmentation