HC3: A Suite of Test Collections for CLIR Evaluation over Informal Text

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023 (2023)

Abstract
While there are many test collections for Cross-Language Information Retrieval (CLIR), none of the large public test collections focus on short informal text documents. This paper introduces a new pair of CLIR test collections with millions of Chinese or Persian Tweets or Tweet threads as documents, sixty event-motivated topics written both in English and in each of the two document languages, and three-point graded relevance judgments constructed using interactive search and active learning. The design and construction of these new test collections are described, and baseline results are presented that demonstrate the utility of the collections for system evaluation. Shallow pooling is used to assess the efficacy of active learning to select documents for judgment.
Keywords
Test Collection, Cross-Language Information Retrieval, CLIR, Evaluation, Tweet-based documents
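
The abstract mentions shallow pooling as a check on the documents that active learning selected for judgment. The sketch below illustrates the general idea only: it takes the union of the top-ranked documents from several runs for each topic, then lists the pooled documents that have no judgment yet. The pool depth, the run names, and the helper names `shallow_pool` and `unjudged` are assumptions made for illustration; the abstract does not specify how the authors built their pools.

```python
from typing import Dict, List, Set

# Hypothetical layout: run name -> topic id -> ranked list of document ids.
Runs = Dict[str, Dict[str, List[str]]]


def shallow_pool(runs: Runs, depth: int) -> Dict[str, Set[str]]:
    """Union the top-`depth` documents of every run, per topic."""
    pool: Dict[str, Set[str]] = {}
    for ranking_by_topic in runs.values():
        for topic_id, ranked_docs in ranking_by_topic.items():
            pool.setdefault(topic_id, set()).update(ranked_docs[:depth])
    return pool


def unjudged(pool: Dict[str, Set[str]],
             judged: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Pooled documents per topic that have no relevance judgment yet."""
    return {topic: docs - judged.get(topic, set())
            for topic, docs in pool.items()}


# Toy data, not drawn from the HC3 collections.
runs = {
    "run_a": {"t1": ["d1", "d2", "d3"]},
    "run_b": {"t1": ["d2", "d4", "d5"]},
}
pool = shallow_pool(runs, depth=2)
missing = unjudged(pool, judged={"t1": {"d1", "d2"}})
```

If the shallow pool surfaces few relevant documents that were not already judged, that suggests the earlier selection process covered the highly ranked documents well.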