Mixed-dish Recognition with Contextual Relation Networks

Proceedings of the 27th ACM International Conference on Multimedia (2019)

Abstract
A mixed dish is a food category in which different dishes are served together on one plate; it is popular in East and Southeast Asia. Recognizing the individual dishes in a mixed-dish image is important for health-related applications, e.g., calculating nutritional values. However, most existing methods, which focus on single-dish classification, are not applicable to mixed-dish recognition. The new challenges in recognizing mixed-dish images are the complex ingredient combinations and the severe overlap among different dishes. To tackle these problems, we propose a novel approach called contextual relation networks (CR-Nets) that encodes the implicit and explicit contextual relations among multiple dishes using region-level features and label-level co-occurrence, respectively. This is inspired by the intuition that people are likely to choose dishes according to common eating habits, e.g., with varied nutrition but without repeated ingredients. In addition, we collect a large-scale dataset of 9,254 mixed-dish images from 6 school canteens in Singapore. Extensive experiments on both our dataset and a smaller-scale public dataset validate that our CR-Nets can achieve top performance for localizing the dishes and recognizing their food categories.
Keywords
context modeling, food recognition, multiple dish detection