Weakly-Supervised Video Object Grounding by Exploring Spatio-Temporal Contexts

MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020, pp. 1939-1947.

Abstract:

Grounding objects in visual context from natural language queries is a crucial yet challenging vision-and-language task, which has gained increasing attention in recent years. Existing work has primarily investigated this task in the context of still images. Despite their effectiveness, these methods cannot be directly migrated into the v...
