On Finding Frequent Patterns in Directed Acyclic Graphs
Clinical Orthopaedics and Related Research (2010)
Abstract
Given a directed acyclic graph with labeled vertices, we consider the problem
of finding the most common label sequences ("traces") among all paths in the
graph of length at most m. Since the number of paths can be huge, we
propose novel algorithms whose time complexity depends only on the size of the
graph, and on the relative frequency epsilon of the most frequent traces. In
addition, we apply techniques from streaming algorithms to achieve space usage
that depends only on epsilon, and not on the number of distinct traces. This
abstract problem models a variety of tasks that involve finding
frequent patterns in event sequences. Our motivation comes from working with a
data set of 2 million RFID readings from baggage trolleys at Copenhagen
Airport. The question of finding frequent passenger movement patterns is mapped
to the above problem. We report on experimental findings for this data set.
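
To make the problem statement concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It enumerates traces by brute force (exactly the path explosion the paper aims to avoid) and then summarizes them with a Misra-Gries frequent-items summary, a standard streaming technique swapped in here to illustrate space usage that depends on epsilon rather than on the number of distinct traces. The function names, the adjacency-dict graph encoding, and the reading of "length m" as "at most m vertices" are all assumptions made for illustration.

```python
import math


def traces(graph, labels, m):
    """Yield the label sequence of every path with 1..m vertices.

    graph:  dict mapping a vertex to a list of successors (assumed encoding).
    labels: dict mapping every vertex to its label.
    Brute force: the number of yielded traces can grow exponentially.
    """
    def extend(v, trace):
        yield trace
        if len(trace) < m:
            for w in graph.get(v, ()):
                yield from extend(w, trace + (labels[w],))

    for v in labels:
        yield from extend(v, (labels[v],))


def misra_gries(stream, epsilon):
    """Misra-Gries summary using k = ceil(1/epsilon) counters.

    Every item occurring more than n / (k + 1) times in a stream of
    n items is retained, and each stored count underestimates the true
    count by at most n / (k + 1).
    """
    k = math.ceil(1 / epsilon)
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # Hypothetical toy data: vertices are readings, labels are zones.
    graph = {0: [1, 2], 1: [3], 2: [3, 4]}
    labels = {0: "A", 1: "B", 2: "B", 3: "C", 4: "C"}
    summary = misra_gries(traces(graph, labels, m=2), epsilon=0.25)
    for trace, count in sorted(summary.items(), key=lambda kv: -kv[1]):
        print(trace, count)
```

The summary holds at most ceil(1/epsilon) counters no matter how many distinct traces the graph produces. Its stored counts are lower bounds, so a second pass (or an exact count over the surviving candidates) would be needed to report true frequencies.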
Keywords
time complexity, streaming algorithm, data structure, directed acyclic graph