PABI: A Unified PAC-Bayesian Informativeness Measure for Incidental Supervision Signals

user-5fe1a78c4c775e6ec07359f9(2021)

引用 0|浏览18
暂无评分
摘要
Real-world applications often require making use of {\em a range of incidental supervision signals}. However, we currently lack a principled way to measure the benefit an incidental training dataset can bring, and the common practice of using indirect, weaker signals is through exhaustive experiments with various models and hyper-parameters. This paper studies whether we can, {\em in a single framework, quantify the benefit of various types of incidental signals for one's target task without going through combinatorial experiments}. We propose PABI, a unified informativeness measure motivated by PAC-Bayesian theory, characterizing the reduction in uncertainty that indirect, weak signals provide. We demonstrate PABI's use in quantifying various types of incidental signals including partial labels, noisy labels, constraints, cross-domain signals, and combinations of these. Experiments with various setups on two natural language processing (NLP) tasks, named entity recognition (NER) and question answering (QA), show that PABI correlates well with learning performance, providing a promising way to determine, ahead of learning, which supervision signals would be beneficial.
更多
查看译文
关键词
Named-entity recognition,Question answering,Natural language processing,Bayesian probability,Computer science,Artificial intelligence
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要