Discovering Communities and Anomalies in Attributed Graphs: Interactive Visual Exploration and Summarization.

Bryan Perozzi, Leman Akoglu

TKDD (2018)

Abstract
Given a network with node attributes, how can we identify communities and spot anomalies? How can we characterize, describe, or summarize the network in a succinct way? Community extraction requires a measure of quality for connected subgraphs (e.g., social circles). Existing subgraph measures, however, either consider only the connectedness of nodes inside the community and ignore the cross-edges at the boundary (e.g., density) or only quantify the structure of the community and ignore the node attributes (e.g., conductance). In this work, we focus on node-attributed networks and introduce: (1) a new measure of subgraph quality for attributed communities called normality, (2) a community extraction algorithm that uses normality to extract communities and a few characterizing attributes per community, and (3) a summarization and interactive visualization approach for attributed graph exploration. More specifically, (1) we first introduce a new measure to quantify the normality of an attributed subgraph. Our normality measure carefully utilizes structure and attributes together to quantify both the internal consistency and external separability. We then formulate an objective function to automatically infer a few attributes (called the “focus”) and respective attribute weights, so as to maximize the normality score of a given subgraph. Most notably, unlike many other approaches, our measure allows for many cross-edges as long as they can be “exonerated,” i.e., either (i) they are expected under a null graph model, and/or (ii) their boundary nodes do not exhibit the focus attributes. Next, (2) we propose AMEN (for Attributed Mining of Entity Networks), an algorithm that simultaneously discovers the communities and their respective focus in a given graph, with a goal to maximize the total normality. Communities for which no focus yielding high normality can be found are considered low quality or anomalous.
Last, (3) we formulate a summarization task with a multi-criteria objective, which selects a subset of the communities that (i) cover the entire graph well, (ii) are of high quality, and (iii) are diverse in their focus attributes. We further design an interactive visualization interface that presents the communities to a user in an interpretable, user-friendly fashion. The user can explore all the communities, analyze various algorithm-generated summaries, and devise their own summaries interactively to characterize the network in a succinct way. As the experiments on real-world attributed graphs show, our proposed approaches effectively find anomalous communities and outperform several existing measures and methods, such as conductance, density, OddBall, and SODA. We also conduct extensive user studies to measure the capability and efficiency that our approach provides to users for network summarization, exploration, and sensemaking.
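
The contrast the abstract draws between density and conductance can be illustrated with a small sketch. This is not the paper's normality measure, whose formula the abstract does not give; it only computes the two baseline measures under their standard textbook definitions. The function names (`build_adj`, `internal_density`, `conductance`) and the toy graphs are hypothetical, chosen for illustration.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def internal_density(adj, community):
    """Fraction of possible node pairs inside the community that are connected.

    Looks only at internal edges, so boundary (cross) edges have no effect.
    """
    nodes = set(community)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in nodes for v in adj[u] if v in nodes) // 2
    return internal / (n * (n - 1) / 2)


def conductance(adj, community):
    """Cut edges divided by the smaller of the two side volumes (degree sums).

    Purely structural: node attributes play no role.
    """
    nodes = set(community)
    cut = sum(1 for u in nodes for v in adj[u] if v not in nodes)
    vol_in = sum(len(adj[u]) for u in nodes)
    vol_out = sum(len(adj[u]) for u in adj if u not in nodes)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Toy graph (assumption): a triangle a-b-c attached to a short tail c-d-e.
few_cross = build_adj([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")])
# Same graph with two extra cross-edges leaving the triangle.
many_cross = build_adj([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"),
                        ("a", "d"), ("b", "e")])
community = ["a", "b", "c"]

for name, graph in [("few cross-edges", few_cross), ("many cross-edges", many_cross)]:
    print(name, internal_density(graph, community), conductance(graph, community))
```

In this toy example the density of the triangle stays at 1.0 in both graphs, while its conductance rises from 1/3 to 0.6 once the extra cross-edges are added. Neither function takes node attributes as input, which is the gap the normality measure is described as addressing.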
Keywords
Attributed graphs, anomaly mining, community extraction, ego networks, human-in-the-loop, interaction design, network measures, social circles, summarization, visual analytics