Fast Data Acquisition in Cost-Sensitive Learning.

ICDM'11: Proceedings of the 11th international conference on Advances in data mining: applications and theoretical aspects (2011)

Abstract
Data acquisition is the first and one of the most important steps in many data mining applications. It is a time-consuming and costly task. Acquiring too few examples makes the learned model and its future predictions inaccurate, while acquiring more examples than necessary wastes time and money. Thus it is very important to estimate the number of examples that a learning algorithm needs. However, most previous learning algorithms learn from a given, fixed set of examples. To our knowledge, little previous work in machine learning can dynamically acquire examples as it learns and decide the ideal number of examples needed. In this paper, we propose a simple on-line framework for fast data acquisition (FDA). FDA is an extrapolation method that estimates the number of examples needed in each acquisition and acquires them simultaneously. Compared to the naïve step-by-step data acquisition strategy, FDA significantly reduces the number of rounds of data acquisition and model building. This would significantly reduce the total cost, which comprises the costs of misclassification, data acquisition arrangement, computation, and the examples acquired.
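
The abstract does not give the extrapolation rule itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes (hypothetically) that the error rate follows a power-law learning curve err(n) = a * n^(-b), fits that curve to the error rates observed so far, and extrapolates to the training-set size that minimizes a simple total cost of acquisition plus misclassification. The function names, the power-law form, and the cost model with a per-example price, a per-error price, and a number of future predictions are all assumptions introduced for illustration.

```python
import math


def fit_power_law(sizes, errors):
    """Fit err(n) = a * n**(-b) by least squares in log-log space.

    Assumed learning-curve form; requires all sizes and errors > 0.
    Returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


def examples_needed(a, b, cost_per_example, misclassification_cost, future_predictions):
    """Training-set size n that minimizes the assumed total cost

        n * cost_per_example
        + a * n**(-b) * misclassification_cost * future_predictions.

    Setting the derivative with respect to n to zero gives the closed form below.
    """
    scale = a * b * misclassification_cost * future_predictions / cost_per_example
    return scale ** (1.0 / (b + 1.0))


def next_batch(sizes, errors, cost_per_example, misclassification_cost, future_predictions):
    """Number of extra examples to acquire in one step (0 means stop)."""
    a, b = fit_power_law(sizes, errors)
    if b <= 0:
        # The fitted curve shows no improvement with more data: stop acquiring.
        return 0
    target = examples_needed(a, b, cost_per_example,
                             misclassification_cost, future_predictions)
    return max(0, math.ceil(target) - sizes[-1])
```

Under these assumptions, a caller would rebuild the model after each acquisition, append the new (size, error) pair, and call `next_batch` again until it returns 0. Acquiring the whole estimated batch at once, rather than one example at a time, is what keeps the number of acquisition and model-building rounds small.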
Keywords
data acquisition, machine learning, data acquisition arrangement, data mining application, fast data acquisition, step-by-step data acquisition strategy, ideal number, insufficient number, number example, previous learning algorithm, cost-sensitive learning