Visible to the public Beyond Nodes and Edges: Multiresolution Algorithms for Network Data

TitleBeyond Nodes and Edges: Multiresolution Algorithms for Network Data
Publication TypeConference Paper
Year of Publication2016
AuthorsLeskovec, Jure
Conference NameProceedings of the 1st ACM SIGMOD Workshop on Network Data Analytics
PublisherACM
Conference LocationNew York, NY, USA
ISBN Number978-1-4503-4513-2
KeywordsHuman Behavior, Metrics, pubcrawl, random key generation, Resiliency, Scalability
Abstract

Networks are a fundamental tool for understanding and modeling complex systems in physics, biology, neuroscience, engineering, and social science. Many networks are known to exhibit rich, lower-order connectivity patterns that can be captured at the level of individual nodes and edges. However, higher-order organization of complex networks - at the level of small network subgraphs - remains largely unknown. Here, we develop a generalized framework for clustering networks on the basis of higher-order connectivity patterns. This framework provides mathematical guarantees on the optimality of obtained clusters and scales to networks with billions of edges. The framework reveals higher-order organization in a number of networks, including information propagation units in neuronal networks and hub structure in transportation networks. Results show that networks exhibit rich higher-order organizational structures that are exposed by clustering based on higher-order connectivity patterns. Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

URLhttp://doi.acm.org/10.1145/2980523.2980525
DOI10.1145/2980523.2980525
Citation Keyleskovec_beyond_2016