On the Theory of Weak Supervision for Information Retrieval

ICTIR 2018

Abstract
Neural network approaches have recently been shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform well, and such data is not always available. To mitigate the shortage of labeled data, training neural IR models with weak supervision has recently been proposed and has received considerable attention in the literature. In weak supervision, an existing model automatically generates labels for a large set of unlabeled data, and a machine learning model is then trained on this weakly labeled data. Surprisingly, prior work has shown that the trained neural model can outperform the weak labeler by a significant margin. Although these improvements have been intuitively justified in previous work, the literature still lacks a theoretical justification for the observed empirical findings. In this paper, we provide theoretical insight into weak supervision for information retrieval, focusing on learning to rank. We model the weak supervision signal as a noisy channel that introduces noise into the correct ranking. Based on the risk minimization framework, we prove that, given some sufficient constraints on the loss function, weak supervision is equivalent to supervised learning under uniform noise. We also derive an upper bound on the empirical risk of weak supervision in the case of non-uniform noise. Following recent work on using multiple weak supervision signals to learn more accurate models, we establish an information-theoretic lower bound on the number of weak supervision signals required to guarantee a given upper bound on the pairwise error probability. We empirically verify a set of the presented theoretical findings using synthetic and real weak supervision data.
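
To make the uniform-noise equivalence concrete, the following is a minimal risk-minimization sketch of the kind of statement the abstract describes. The notation (R, R_eta, L, eta) and the symmetry condition on the loss are our assumptions for illustration, not necessarily the paper's exact formulation:

% Sketch of the uniform-noise equivalence (our notation, hedged).
% f is a ranker, L a pairwise loss, y the true preference in {-1, +1},
% \tilde{y} the weak label, and \eta the uniform flip rate.
\[
  R(f) = \mathbb{E}_{(x,y)}\bigl[L(f(x), y)\bigr],
  \qquad
  R_{\eta}(f) = \mathbb{E}_{(x,\tilde{y})}\bigl[L(f(x), \tilde{y})\bigr].
\]
% Under uniform noise, \tilde{y} = y with probability 1 - \eta and
% \tilde{y} = -y with probability \eta. If L is symmetric, i.e.
% L(f(x), y) + L(f(x), -y) = C for some constant C, then
\[
  R_{\eta}(f) = (1 - 2\eta)\,R(f) + \eta C,
\]
% so for \eta < 1/2 the minimizers of R_{\eta} and R coincide:
% training on the weak labels targets the same optimal ranker as
% supervised training on the true labels.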
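
The abstract also mentions empirical verification on synthetic weak supervision data. A minimal synthetic check in that spirit might look as follows; this is our illustrative sketch, not the authors' code, and the sigmoid pairwise loss is chosen only because it satisfies the symmetry condition above with C = 1:

# Hypothetical synthetic check (our illustration, not the paper's code):
# with a symmetric pairwise loss, training on uniformly flipped preference
# labels should recover roughly the same ranker as training on clean ones.
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 5, 20_000, 0.3           # feature dim, number of pairs, flip rate

w_true = rng.normal(size=d)          # ground-truth scoring weights
X = rng.normal(size=(n, d))          # feature difference of each document pair
y = np.sign(X @ w_true)              # true pairwise preference in {-1, +1}
flip = rng.random(n) < eta
y_weak = np.where(flip, -y, y)       # weak labels from a uniform noisy channel

def train(X, y, steps=500, lr=0.5):
    """Gradient descent on the symmetric sigmoid loss L = sigmoid(-y * f)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        s = 1.0 / (1.0 + np.exp(y * (X @ w)))     # sigmoid(-y f): the loss value
        grad = -((s * (1 - s) * y) @ X) / len(y)  # dL/dw averaged over all pairs
        w -= lr * grad
    return w

for name, labels in [("clean", y), ("weak", y_weak)]:
    w = train(X, labels)
    acc = np.mean(np.sign(X @ w) == y)            # pairwise accuracy vs. truth
    print(f"{name:5s} labels -> pairwise accuracy {acc:.3f}")

With eta = 0.3, both runs should print nearly identical pairwise accuracy, consistent with the equivalence sketched above; pushing eta toward 0.5 degrades the weakly trained model in finite samples, as the shrinking (1 - 2 eta) scaling of the risk suggests.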