Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
arXiv (2024)
Abstract
Open-domain real-world entity recognition is essential yet challenging,
involving identifying various entities in diverse environments. The lack of a
suitable evaluation dataset has been a major obstacle in this field due to the
vast number of entities and the extensive human effort required for data
curation. We introduce Entity6K, a comprehensive dataset for real-world entity
recognition, featuring 5,700 entities across 26 categories, each supported by 5
human-verified images with annotations. Entity6K offers a diverse range of
entity names and categorizations, addressing a gap in existing datasets. We
conducted benchmarks with existing models on tasks like image captioning,
object detection, zero-shot classification, and dense captioning to demonstrate
Entity6K's effectiveness in evaluating models' entity recognition capabilities.
We believe Entity6K will be a valuable resource for advancing accurate entity
recognition in open-domain settings.