Visible to the public Maintaining Densest Subsets Efficiently in Evolving Hypergraphs

TitleMaintaining Densest Subsets Efficiently in Evolving Hypergraphs
Publication TypeConference Paper
Year of Publication2017
AuthorsHu, Shuguang, Wu, Xiaowei, Chan, T-H. Hubert
Conference NameProceedings of the 2017 ACM on Conference on Information and Knowledge Management
PublisherACM
Conference LocationNew York, NY, USA
ISBN Number978-1-4503-4918-5
Keywordsdata deletion, densest subgraph, dynamic data structure, graph mining, privacy, pubcrawl, Scalability
Abstract

In this paper we study the densest subgraph problem, which plays a key role in many graph mining applications. The goal of the problem is to find a subset of nodes that induces a graph with maximum average degree. The problem has been extensively studied in the past few decades under a variety of different settings. Several exact and approximation algorithms were proposed. However, as normal graph can only model objects with pairwise relationships, the densest subgraph problem fails in identifying communities under relationships that involve more than 2 objects, e.g., in a network connecting authors by publications. We consider in this work the densest subgraph problem in hypergraphs, which generalizes the problem to a wider class of networks in which edges might have different cardinalities and contain more than 2 nodes. We present two exact algorithms and a near-linear time r-approximation algorithm for the problem, where r is the maximum cardinality of an edge in the hypergraph. We also consider the dynamic version of the problem, in which an adversary can insert or delete an edge from the hypergraph in each round and the goal is to maintain efficiently an approximation of the densest subgraph. We present two dynamic approximation algorithms in this paper with amortized polog update time, for any e \textbackslashtextgreater 0. For the case when there are only insertions, the approximation ratio we maintain is r(1+e), while for the fully dynamic case, the ratio is r2(1+e). Extensive experiments are performed on large real datasets to validate the effectiveness and efficiency of our algorithms.

URLhttp://doi.acm.org/10.1145/3132847.3132907
DOI10.1145/3132847.3132907
Citation Keyhu_maintaining_2017