Biblio
Community structure detection in social networks has become a big challenge. Various methods in the literature have been presented to solve this challenge. Recently, several methods have also been proposed to solve this challenge based on a mapping-reduction model, in which data and algorithms are divided between different process nodes so that the complexity of time and memory of community detection in large social networks is reduced. In this paper, a mapping-reduction model is first proposed to detect the structure of communities. Then the proposed framework is rewritten according to a new mechanism called distributed cache memory; distributed cache memory can store different values associated with different keys and, if necessary, put them at different computational nodes. Finally, the proposed rewritten framework has been implemented using SPARK tools and its implementation results have been reported on several major social networks. The performed experiments show the effectiveness of the proposed framework by varying the values of various parameters.
Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have mostly focused on external connections between nodes (i.e., topology structure) in the network while largely ignoring the internal intricacies (i.e., local behavior) of each node. A pair of nodes without any interaction can still share similar internal behaviors. For example, in an enterprise information network, compromised computers controlled by the same intruder often demonstrate similar abnormal behaviors even if they do not connect with each other. In this paper, we study the problem of community detection in enterprise information networks, where large-scale internal events and external events coexist on each host. The discovered host communities, capturing behavioral affinity, can benefit many comparative analysis tasks such as host anomaly assessment. In particular, we propose a novel community detection framework to identify behavior-based host communities in enterprise information networks, based purely on large-scale heterogeneous event data. We further propose an efficient method for assessing a host's anomaly level by leveraging the detected host communities. Experimental results on enterprise networks demonstrate the effectiveness of our model.
Multi-hop Wireless Mesh Networks (WMNs) are a promising new communication technique, and routing protocol design is critical to their effectiveness and efficiency. A common approach to routing traffic in these networks is to select the minimum-distance path from source to destination, as in wireline networks. Opportunistic Routing (OR) makes use of the broadcast ability of wireless networks and is especially helpful for WMNs because all nodes are static. Our proposed scheme, Multicast Opportunistic Routing (MOR) in WMNs, is based on broadcast transmissions and Learning Automata (LA) to expand the set of potential candidate nodes that can aid in retransmitting the data. The receivers must be in sync with one another to avoid duplicate broadcasts of data, which is generally achieved by choosing the forwarding candidates according to some LA-based metric. The most attractive aspect of this protocol is that it intelligently "learns" from past experience and improves its performance. The results obtained with MOR show that the proposed scheme outperforms some existing schemes and is an improved, more effective version of opportunistic routing in mesh networks.
In this paper, we address the problem of peer grouping employees in an organization for identifying security risks. Our motivation for studying peer grouping is its importance for a clear understanding of user and entity behavior analytics (UEBA), which is the primary tool for identifying insider threats by detecting anomalies in network traffic. We show that, using the Louvain method of community detection, it is possible to automate peer group creation with feature-based weight assignments. Depending on the number of employees and their features, we show that it is also possible to give each group a meaningful description. We present three new algorithms: one that allows the addition of new employees to already generated peer groups, another that incorporates user feedback, and a third that provides the user with recommended nodes to be reassigned. We use Niara's data to validate our claims. The novelty of our method is its robustness, simplicity, scalability, and ease of deployment in a production environment.
In this work we put forward a novel approach using graph partitioning and micro-community detection techniques. We first use algebraic connectivity (via the Fiedler eigenvector) and spectral partitioning for community detection. We then use modularity maximization and micro-level clustering to detect micro-communities, using the concept of community energy. We run the micro-community clustering algorithm recursively with modularity maximization, which helps us identify dense, deeper, and hidden community structures. We tested our MicroCommunity Clustering (MCC) algorithm on various types of complex technological and social community networks: directed weighted, directed unweighted, undirected weighted, and undirected unweighted. A novel aspect of this algorithm is that it is scalable.
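As an illustration of the spectral partitioning step mentioned in the abstract above, the following is a minimal sketch, not the authors' MCC algorithm. It assumes a connected, undirected, unweighted graph and approximates the Fiedler vector by power iteration on a shifted Laplacian; the function name `fiedler_partition` and its parameters are hypothetical.

```python
def fiedler_partition(n, edges, iterations=500):
    """Split nodes 0..n-1 into two groups by the sign of an approximate Fiedler vector.

    Assumes the graph is connected, undirected, and unweighted.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # 2 * (max degree) is an upper bound on the largest Laplacian eigenvalue,
    # so (shift * I - L) is positive semidefinite and power iteration applies.
    shift = 2 * max(len(nbrs) for nbrs in adj)

    x = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        # Project out the constant vector (eigenvalue 0 of the Laplacian L).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # y = (shift * I - L) x, where L = D - A.
        y = [(shift - len(adj[i])) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]

    negative = {i for i in range(n) if x[i] < 0}
    return negative, set(range(n)) - negative


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part_a, part_b = fiedler_partition(6, edges)
print(sorted(part_a), sorted(part_b))
```

A recursive scheme like the one the abstract describes would presumably reapply such a split inside each part while a quality score such as modularity keeps improving; that stopping rule is not shown here.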
Complex networks usually exhibit community structure, with groups of nodes sharing many links with the other nodes in the same group and relatively few with the nodes of the rest. This feature captures valuable information about the organization and even the evolution of the network. Over the last decade, a great number of algorithms for community detection have been proposed to deal with increasingly complex networks. However, the problem of doing this in a private manner is rarely considered. In this paper, we solve this problem under differential privacy, a prominent privacy concept for releasing private data. We analyze the major challenges behind the problem and propose several schemes to tackle them from two perspectives: input perturbation and algorithm perturbation. We choose the Louvain method as the back-end community detection for input perturbation schemes and propose the method LouvainDP, which runs the Louvain algorithm on a noisy super-graph. For algorithm perturbation, we design ModDivisive using the exponential mechanism with modularity as the score. We have thoroughly evaluated our techniques on real graphs of different sizes and verified that ModDivisive steadily gives the best modularity and avg.F1Score on large graphs, while LouvainDP outperforms the remaining input perturbation competitors in certain settings.
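To make the idea of "exponential mechanism with modularity as the score" concrete, here is a minimal sketch under stated assumptions. It is not the ModDivisive algorithm itself: it only scores a fixed list of candidate partitions by modularity and turns the scores into exponential-mechanism selection probabilities. The function names and the `sensitivity` value in the usage example are illustrative assumptions, not taken from the paper.

```python
import math
import random


def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}   # number of edges inside each community
    degree = {}     # sum of node degrees in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def exponential_mechanism_probs(scores, epsilon, sensitivity):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, rng):
    """Draw one candidate index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Two triangles joined by one edge; compare a two-community split
# against putting every node in one community.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
scores = [modularity(edges, split), modularity(edges, merged)]
# sensitivity=0.1 is an arbitrary illustrative value, not a derived bound.
probs = exponential_mechanism_probs(scores, epsilon=1.0, sensitivity=0.1)
chosen = sample_index(probs, random.Random(0))
```

A smaller `epsilon` flattens the probabilities toward uniform (more privacy, less utility), while a larger one concentrates them on the highest-modularity candidate.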
Centrality measures have long been used to find the most central or most influential node in a network. There are numerous ways to compute the centrality of a node; in social networks, however, betweenness centrality is the most widely used approach to separate communities within the network, to assess vulnerability in complex networks, and to generate scale-free networks whose degree distribution follows a power law. In this paper, we compute betweenness centrality by identifying communities within the network. Our algorithm efficiently updates the centrality of the nodes whenever an edge or vertex is added or deleted in a dynamic network, by modifying only a subset of vertices. For vertex addition, an incremental algorithm is used, in which streaming graphs are also considered. Brandes' approach is the most widely used method for computing betweenness centrality; however, it is still expensive for growing networks since it takes O(mn + n^2 log n) time and O(n + m) space. Our approach efficiently updates the centrality of the nodes in O(|S|n + |S|n log n) time, where S is the subset of vertices, m is the number of edges, n is the number of vertices, and |S| ≤ n.
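For reference, the baseline that the abstract above compares against can be sketched as follows. This is a minimal version of Brandes' algorithm for the unweighted, undirected case, which runs in O(mn) time using breadth-first search; the O(mn + n^2 log n) bound quoted in the abstract is for weighted graphs. It is a static computation and does not implement the paper's incremental update. The adjacency-dictionary input format is an assumption made for this example.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm on an unweighted, undirected graph.

    adj maps each node to an iterable of its neighbors.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        # Forward phase: BFS that counts shortest paths (sigma) from source.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:               # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Backward phase: accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]

    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness_centrality(path))
```

Because every source node triggers a full traversal, recomputing from scratch after each edge or vertex change is what makes the static algorithm costly on growing networks.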
Understanding the behavior of complex financial supply chains is usually difficult due to a lack of data capturing the interactions between financial institutions (FIs) and the roles that they play in financial contracts (FCs). resMBS is an example supply chain corresponding to the US residential mortgage-backed securities that were critical in the 2008 US financial crisis. In this paper, we describe the process of creating the resMBS graph dataset from financial prospectuses. We use the SystemT rule-based text extraction platform to develop two tools, ORG NER and Dict NER, for named entity recognition of financial institution (FI) names. The resMBS graph comprises a set of FC nodes (one per prospectus) and the corresponding FI nodes that are extracted from the prospectus. A Role-FI extractor matches a role keyword, such as originator, sponsor, or servicer, with FI names. We study the performance of the Role-FI extractor, ORG NER, and Dict NER in constructing the resMBS dataset. We also present preliminary results of a clustering-based analysis to identify financial communities and their evolution in the resMBS financial supply chain.
The popularity of Android OS has dramatically increased the number of malware apps targeting this mobile OS. The daily volume of malware has overwhelmed the detection process. This fact has motivated the need for malware detection and family attribution solutions that require minimal manual intervention. In response, we propose the Cypider framework, a set of techniques and tools that aims to perform systematic detection of mobile malware by building an efficient and scalable similarity network infrastructure of malicious apps. Our detection method is based on a novel concept, namely the malicious community, in which we consider, for a given family, the instances that share common features. Under this concept, we assume that multiple similar Android apps with different authors are most likely to be malicious. Cypider leverages this assumption to detect variants of known malware families and zero-day malware. It is important to mention that Cypider does not rely on signature-based or learning-based patterns. Instead, it applies community detection algorithms to the similarity network, extracting sub-graphs that are considered suspicious and most likely malicious communities. Furthermore, we propose a novel fingerprinting technique, namely the community fingerprint, based on a learning model for each malicious community. Cypider shows excellent results by detecting about 50% of the malware dataset in one detection iteration. In addition, the preliminary results of the community fingerprint are promising, as we achieved a detection rate of 87%.
Traditional anti-virus technology is primarily based on static analysis and dynamic monitoring. However, both technologies depend heavily on application files, which increases the risk of being attacked and wastes time and network bandwidth. In this study, we propose a new graph-based method through which we can preliminarily detect malicious URLs without the application file. First, the relationship between URLs is found through the relationship between people and URLs, and association rules are mined with the confidence of each frequent URL. Second, a network of URLs is built from the association rules. Once the URL network is built, we cluster the data with modularity to detect communities, where each community represents a different type of URL. We suppose that if a URL is associated with one community, then the URL is probably malicious. In our experiments, we successfully captured 82% of malicious samples, a higher capture rate than traditional methods.
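The first two steps of the abstract above (mining association rules between URLs from user visits, then using them as edges of a URL network) might look like the following minimal sketch. It uses the standard definition confidence(A -> B) = support(A and B) / support(A); the data layout, the function name `url_rules`, and the threshold values are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def url_rules(visits, min_support=2, min_confidence=0.6):
    """Mine pairwise rules src -> dst between URLs visited by the same user.

    visits maps a user id to the list of URLs that user accessed.
    Returns {(src, dst): confidence} for rules passing both thresholds.
    """
    single = Counter()  # number of users who visited each URL
    pair = Counter()    # number of users who visited both URLs of a pair
    for urls in visits.values():
        unique = sorted(set(urls))
        single.update(unique)
        pair.update(combinations(unique, 2))

    rules = {}
    for (a, b), both in pair.items():
        if both < min_support:
            continue  # pair is not frequent enough
        for src, dst in ((a, b), (b, a)):
            confidence = both / single[src]
            if confidence >= min_confidence:
                rules[(src, dst)] = confidence
    return rules


visits = {
    "u1": ["a", "b"],
    "u2": ["a", "b", "c"],
    "u3": ["a", "c"],
    "u4": ["b"],
}
for (src, dst), conf in sorted(url_rules(visits).items()):
    print(f"{src} -> {dst}: {conf:.2f}")
```

Each surviving rule can be treated as a weighted edge between two URLs, giving the network on which modularity-based community detection is then run.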
Techniques for network security analysis have historically focused on the actions of the network hosts. Outside of forensic analysis, little has been done to detect or predict malicious or infected nodes strictly based on their association with other known malicious nodes. This methodology is highly prevalent in the graph analytics world, however, and is referred to as community detection. In this paper, we present a method for detecting malicious and infected nodes on both monitored networks and the external Internet. We leverage prior community detection and graphical modeling work by propagating threat probabilities across network nodes, given an initial set of known malicious nodes. We enhance prior work by employing constraints that remove the adverse effect of cyclic propagation that is a byproduct of current methods. We demonstrate the effectiveness of probabilistic threat propagation on the tasks of detecting botnets and malicious web destinations.
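The general idea of propagating threat probabilities from known malicious nodes can be sketched as below. This is a deliberately naive iterative scheme, not the constrained method of the paper: each non-seed node takes a weighted sum of its neighbors' current scores, and seeds stay clamped at 1. The function name, the weight layout, and the assumption that each node's incoming weights sum to at most 1 are all illustrative.

```python
def propagate_threat(weights, seeds, iterations=50):
    """Naive threat propagation over a weighted graph.

    weights[i][j] is the influence of neighbor j on node i; each node's
    weights are assumed to sum to at most 1. seeds are known malicious nodes.
    """
    prob = {node: (1.0 if node in seeds else 0.0) for node in weights}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in weights.items():
            if node in seeds:
                updated[node] = 1.0  # known malicious nodes stay fixed
            else:
                updated[node] = sum(w * prob[j] for j, w in nbrs.items())
        prob = updated
    return prob


# Chain: bad -- a -- b. Node a hears from both bad and b; b hears only from a.
weights = {
    "bad": {"a": 1.0},
    "a": {"bad": 0.5, "b": 0.25},
    "b": {"a": 0.5},
}
scores = propagate_threat(weights, seeds={"bad"})
print(scores)
```

In this example node a ends up near 0.57 rather than the 0.5 contributed directly by the seed, because part of its own score flows to b and back again. This feedback is the kind of cyclic propagation effect that the abstract says the authors' constraints are designed to remove.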