On the effectiveness of the skew divergence for statistical language analysis.
AISTATS (2001)
Abstract
Estimating word co-occurrence probabilities is a problem underlying many applications in statistical natural language processing. Distance-weighted (or similarity-weighted) averaging has been shown to be a promising approach to the analysis of novel co-occurrences. Many measures of distributional similarity have been proposed for use in the distance-weighted averaging framework; here, we empirically study their stability properties, finding that similarity-based estimation appears to make more efficient use of more reliable portions of the training data. We also investigate properties of the skew divergence, a weighted version of the Kullback-Leibler (KL) divergence; our results indicate that the skew divergence yields better results than the KL divergence even when the KL divergence is applied to more sophisticated probability estimates.
Keywords
natural language, Kullback-Leibler, empirical study
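
To illustrate the divergence discussed in the abstract, the following is a minimal Python sketch, assuming the skew divergence is defined as s_α(q, r) = D(r ‖ αq + (1−α)r); the function names and the choice α = 0.99 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for discrete distributions p, q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    # Terms with p = 0 contribute nothing; a zero in q where p > 0 makes D infinite.
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def skew_divergence(q, r, alpha=0.99):
    """Skew divergence s_alpha(q, r) = D(r || alpha * q + (1 - alpha) * r).

    Mixing a small amount of r into q keeps the second argument positive
    wherever r has support, so the value stays finite even when q assigns
    zero probability to events observed under r.
    """
    q = np.asarray(q, dtype=float)
    r = np.asarray(r, dtype=float)
    return kl_divergence(r, alpha * q + (1.0 - alpha) * r)

# Example: two sparse co-occurrence distributions over three contexts.
q = np.array([0.5, 0.5, 0.0])   # q assigns zero probability to the third context
r = np.array([0.4, 0.4, 0.2])
print(skew_divergence(q, r))    # finite, whereas D(r || q) would be infinite
```

As α approaches 1 the mixture αq + (1−α)r approaches q, so the skew divergence approaches D(r ‖ q) while remaining finite on sparse estimates.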