Local Bayes Risk Minimization Based Stopping Strategy for Hierarchical Classification

2017 IEEE International Conference on Data Mining (ICDM)

Abstract
In large-scale data classification tasks, it is becoming increasingly challenging to find the true class among a huge number of candidate categories. Fortunately, a hierarchical structure usually exists among these massive categories. The task of exploiting this structure for effective classification is called hierarchical classification. It usually follows a top-down fashion, predicting a sample from the root node with a coarse-grained category down to a leaf node with a fine-grained category. However, misclassification is inevitable if the information is insufficient or large uncertainty exists in the prediction process. In this scenario, we can design a stopping strategy that stops the sample at an internal node with a coarser category, instead of predicting a wrong leaf node. Several studies address this problem by improving performance in terms of hierarchical accuracy and informative prediction. However, all of these studies ignore an important issue: when predicting a sample at the current node, an error is likely to occur if large uncertainty exists among the child nodes in the next lower level. In this paper, we cast this uncertainty as a risk problem: when predicting a sample at a decision node, one takes a precipitance risk in pushing the sample down to a child node in the next lower level on the one hand, and a conservative risk in stopping at the current node on the other. We address this risk problem by designing a Local Bayes Risk Minimization (LBRM) framework, which divides the prediction process into recursive decisions to stop or to go down at each decision node, balancing these two risks in a top-down fashion. Rather than setting a global loss function as in the traditional Bayes risk framework, we replace it with different uncertainty measures for the two risks at each decision node.
The uncertainties underlying the precipitance risk and the conservative risk are measured by the information entropy over the child nodes and by the information gain from the current node to its child nodes, respectively. We propose a Weighted Tree Induced Error (WTIE) to obtain minimum-risk predictions with different emphasis on the two risks. Experimental results on various datasets show the effectiveness of the proposed LBRM algorithm.
Keywords
Hierarchical Classification, Stopping Strategy, Local Bayes Risk Minimization, Uncertainty