Knowledge Graph Extraction from Videos

2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)(2020)

引用 6|浏览77
暂无评分
摘要
Nearly all existing techniques for automated video annotation (or captioning) describe videos using natural language sentences. However, this has several shortcomings: (i) it is very hard to then further use the generated natural language annotations in automated data processing, (ii) generating natural language annotations requires solving the hard subtask of generating semantically precise and syntactically correct natural language sentences, which is actually unrelated to the task of video annotation, (iii) it is difficult to quantitatively measure performance, as standard metrics (e.g., accuracy and F1-score) are inapplicable, and (iv) annotations are language-specific. In this paper, we propose the new task of knowledge graph extraction from videos, i.e., producing a description in the form of a knowledge graph of the contents of a given video. Since no datasets exist for this task, we also include a method to automatically generate them, starting from datasets where videos are annotated with natural language. We then describe an initial deep-learning model for knowledge graph extraction from videos, and report results on MSVD* and MSR-VTT*, two datasets obtained from MSVD and MSR-VTT using our method.
更多
查看译文
关键词
Knowledge Graph,Deep Learning,Video Understanding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要