Video Turing Test: A first step towards human-level AI

AI MAGAZINE(2023)

引用 0|浏览0
暂无评分
摘要
The development of artificial intelligence (AI) agents capable of human-level understanding of video content and conducting conversations with humans on this basis is a promising application that people expect. However, this is a challenging task that requires the holistic integration of multimodal information with temporal dependencies and reasoning, as well as social and physical commonsense. In addition, the development of appropriate systematic evaluation methods is essential. In this context, we introduce the Video Turing Test (VTT), a blind test used to evaluate human-likeness in terms of video comprehension ability. Moreover, we propose Vincent as a video understanding AI. We explain the configuration of VTT, the architecture of Vincent to prepare for VTT and the proposed evaluation methods for video comprehension. We also estimate the current intelligence level of AI based on our results and discuss future research directions.
更多
查看译文
关键词
ai,video,test,human‐level
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要