Multimedia Event Detection and Recounting

mag(2013)

Abstract
We report on our system for the TRECVID 2013 Multimedia Event Detection (MED) and Multimedia Event Recounting (MER) tasks. The MED system consists of four main steps: feature extraction, feature representation, detector training, and fusion. In the feature extraction step, we extract more than 10 low-level, high-level, and text features. These features are then represented in three different ways: spatial bag-of-words, Gaussian Mixture Model (GMM) Super Vectors, and Fisher Vectors. For detector training and fusion, we employ two classifiers and a weighted double fusion method. The official evaluation results show that our full MED systems achieve the best scores on Ad-Hoc EK10 and EK0, and that our audio systems achieve the best scores on EK100 and EK10 for both the Pre-Specified and Ad-Hoc tasks. In this report, we analyze the contribution of each component of the MED system and draw some insights for video analysis. Our MER system generates the recounting from a subset of the features and detection results of the MED system.
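The fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "weighted double fusion" means late-fusing the scores of per-feature detectors together with the scores of a detector trained on early-fused (combined) features, using a normalized weighted average. The function name `weighted_late_fusion`, the detector names, the scores, and the weights below are all hypothetical.

```python
from typing import Dict, List


def weighted_late_fusion(
    scores: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Fuse per-detector scores into one score per video.

    scores:  detector name -> one detection score per video
    weights: detector name -> non-negative fusion weight (hypothetical values)
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("fusion weights must sum to a positive value")

    n_videos = len(next(iter(scores.values())))
    fused = [0.0] * n_videos
    for name, detector_scores in scores.items():
        if len(detector_scores) != n_videos:
            raise ValueError(f"detector {name!r} has a different number of videos")
        w = weights[name] / total  # normalize so the weights sum to 1
        for i, s in enumerate(detector_scores):
            fused[i] += w * s
    return fused


# Hypothetical scores for three videos. "early_fused" stands in for a detector
# trained on combined features (the assumed early-fusion half of double fusion).
scores = {
    "visual": [0.9, 0.2, 0.4],
    "audio": [0.5, 0.1, 0.8],
    "early_fused": [0.8, 0.3, 0.6],
}
weights = {"visual": 2.0, "audio": 1.0, "early_fused": 1.0}

fused = weighted_late_fusion(scores, weights)
# Rank videos from most to least likely to contain the event.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
print(fused, ranking)
```

In practice the weights would be chosen on held-out data, for example in proportion to each detector's validation performance; the abstract does not say how the authors set them.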