Streaming Temporal Graphs: Subgraph Matching

2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)(2019)

引用 0|浏览108
暂无评分
摘要
We investigate solutions to subgraph matching within a temporal stream of data. We present a high-level language for describing temporal subgraphs of interest, the Streaming Analytics Language (SAL). SAL programs are translated into C++ code that is run in parallel on a cluster. We call this implementation of SAL the Streaming Analytics Machine (SAM). SAL programs are succinct, requiring about 20 times fewer lines of code than using the SAM library directly, or writing an implementation using Apache Flink. To benchmark SAM we calculate finding temporal triangles within streaming netflow data. Also, we compare SAM to an implementation written for Flink. We find that SAM is able to scale to 128 nodes or 2560 cores, while Apache Flink has max throughput with 32 nodes and degrades thereafter. Apache Flink has an advantage when triangles are rare, with max aggregate throughput 14 Flink at 32 nodes greater than the max achievable rate of SAM. In our experiments, when triangle occurrence was faster than five per second per node, SAM performed better. Both frameworks may miss results due to latencies in network communication. SAM consistently reported an average of 93.7% of expected results while Mink decreases from 83.7% to 52.1% as we increase to the maximum size of the cluster. Overall, SAM can obtain rates of 91.8 billion netfiows per day.
更多
查看译文
关键词
temporal graphs,subgraph matching,high-level language,SAL programs,SAM library,Apache Flink,benchmark SAM,temporal triangles,streaming analytics machine,streaming analytics language,C++ code
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要