Classification of streaming time series under more realistic assumptions

Data Mining and Knowledge Discovery(2015)

引用 30|浏览94
暂无评分
摘要
Much of the vast literature on time series classification makes several assumptions about data and the algorithm’s eventual deployment that are almost certainly unwarranted. For example, many research efforts assume that the beginning and ending points of the pattern of interest can be correctly identified, during both the training phase and later deployment. Another example is the common assumption that queries will be made at a constant rate that is known ahead of time, thus computational resources can be exactly budgeted. In this work, we argue that these assumptions are unjustified, and this has in many cases led to unwarranted optimism about the performance of the proposed algorithms. As we shall show, the task of correctly extracting individual gait cycles, heartbeats, gestures, behaviors, etc., is generally much more difficult than the task of actually classifying those patterns. Likewise, gesture classification systems deployed on a device such as Google Glass may issue queries at frequencies that range over an order of magnitude, making it difficult to plan computational resources. We propose to mitigate these problems by introducing an alignment-free time series classification framework. The framework requires only very weakly annotated data, such as “in this ten minutes of data, we see mostly normal heartbeats … ,” and by generalizing the classic machine learning idea of data editing to streaming/continuous data, allows us to build robust, fast and accurate anytime classifiers. We demonstrate on several diverse real-world problems that beyond removing unwarranted assumptions and requiring essentially no human intervention, our framework is both extremely fast and significantly more accurate than current state-of-the-art approaches.
更多
查看译文
关键词
Time series classification,Data dictionary,Anytime algorithms
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要