Knowledge Discovery in Practice
Tourism Management (1999)
Abstract
Data mining, or Knowledge Discovery in Databases (KDD), is an exploratory and iterative process that can be decomposed into a number of stages. We describe the different activities in the data mining process and discuss some pitfalls, along with guidelines for circumventing them. Although analysis receives most of the attention, data selection and pre-processing is usually the most time-consuming activity and has a substantial influence on the ultimate success of the process. The involvement of a subject area expert, a data mining expert, and a data expert is critical to the success of data mining projects. Despite the attractive suggestion of "fully automatic" data analysis, knowledge of the processes behind the data remains indispensable in avoiding the many pitfalls of data mining. Although company databases are usually quite large, proper formulation of the data mining problem combined with sampling techniques often allows reduction to manageably sized data sets. In the majority of applications, the data were originally collected not for data mining but merely to support daily business processes. This may give rise to low-quality data, as well as biases in the data that reduce the applicability of discovered patterns.
Keywords
data analysis, data mining, sampling technique, business process