Dense Captioning of Natural Scenes in Spanish.

Alejandro Gomez-Garay, Bogdan Raducanu, Joaquín Salas

PATTERN RECOGNITION (2018)

Abstract
The inclusion of visually impaired people in daily life is a challenging and active area of research. This work studies how to deliver information about a person's surroundings as verbal descriptions in Spanish using wearable devices. We use a neural network (DenseCap) both to identify objects and to generate phrases about them. DenseCap runs on a server and describes an image sent from a smartphone application; its output is text, which the smartphone verbalizes. Our implementation achieves a mean Average Precision (mAP) of 5.0 in object recognition and quality of captions and takes an average of 7.5 s from the moment one captures a picture until one receives the verbalization in Spanish.
Keywords
Computer vision, Deep learning, Image captioning, Spanish language