Matrix Profile XVI: Efficient and Effective Labeling of Massive Time Series Archives

2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA) (2019)

Abstract
In domains as diverse as entomology and sports medicine, analysts are routinely required to label large amounts of time series data. In a few rare cases, this can be done automatically with a classification algorithm. In many domains, however, complex, noisy, and polymorphic data can defeat state-of-the-art classifiers, yet easily yield to human inspection and annotation. This is especially true if the human can access auxiliary information and previous annotations. This labeling task can be a significant bottleneck in scientific progress. For example, an entomology or sports physiology lab may produce several days' worth of time series each day. In this work, we introduce an algorithm that greatly reduces the human effort required. Our interactive algorithm groups subsequences and invites the user to label a group's prototype, brushing the label to all members of the group. Thus, our task reduces to optimizing the grouping(s) so that our system asks the user the fewest questions. As we shall show, across diverse domains, we can reduce the human effort by at least an order of magnitude, with no decrease in accuracy.
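The group-then-label idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how groups are formed, so the sketch assumes a simple greedy grouping of non-overlapping, z-normalized windows by Euclidean distance. All names here (`group_subsequences`, `label_by_prototype`, `radius`, `ask_user`) are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence


def znorm(window: Sequence[float]) -> List[float]:
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_subsequences(
    series: Sequence[float], m: int, radius: float
) -> Dict[int, List[int]]:
    """Greedy grouping (an assumption, not the paper's method).

    Cuts the series into non-overlapping windows of length m. Each window
    joins the first existing prototype within `radius`; otherwise it becomes
    the prototype of a new group. Returns {prototype index: member indices}.
    """
    windows = [znorm(series[i:i + m]) for i in range(0, len(series) - m + 1, m)]
    members: Dict[int, List[int]] = {}
    for idx, window in enumerate(windows):
        for proto in members:
            if euclidean(window, windows[proto]) <= radius:
                members[proto].append(idx)
                break
        else:
            # No prototype is close enough, so this window founds a new group.
            members[idx] = [idx]
    return members


def label_by_prototype(
    members: Dict[int, List[int]], ask_user: Callable[[int], str]
) -> Dict[int, str]:
    """Ask one question per group and brush the answer to every member."""
    labels: Dict[int, str] = {}
    for proto, group in members.items():
        label = ask_user(proto)  # the only point where human effort is spent
        for idx in group:
            labels[idx] = label
    return labels
```

In this sketch the number of questions equals the number of groups, so a grouping with fewer, purer groups means less human effort. In an interactive setting, `ask_user` would display the prototype window to the analyst and return the label they choose.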
Keywords
Time Series, Segmentation, Labeling, Classification