Biblio
The ability to discover patterns of interest in criminal networks can support and ease the investigation tasks by security and law enforcement agencies. By considering criminal networks as a special case of social networks, we can properly reuse most of the state-of-the-art techniques to discover patterns of interests, i.e., hidden and potential links. Nevertheless, in time-sensible scenarios, like the one involving criminal actions, the ability to discover patterns in a (near) real-time manner can be of primary importance.In this paper, we investigate the identification of patterns for link detection and prediction on an evolving criminal network. To extract valuable information as soon as data is generated, we exploit a stream processing approach. To this end, we also propose three new similarity social network metrics, specifically tailored for criminal link detection and prediction. Then, we develop a flexible data stream processing application relying on the Apache Flink framework; this solution allows us to deploy and evaluate the newly proposed metrics as well as the ones existing in literature. The experimental results show that the new metrics we propose can reach up to 83% accuracy in detection and 82% accuracy in prediction, resulting competitive with the state of the art metrics.
Botnets are considered one of the most dangerous species of network-based attack today because they involve the use of very large coordinated groups of hosts simultaneously. The behavioral analysis of computer networks is at the basis of the modern botnet detection methods, in order to intercept traffic generated by malwares for which signatures do not exist yet. Defining a pattern of features to be placed at the basis of behavioral analysis, puts the emphasis on the quantity and quality of information to be caught and used to mark data streams as normal or abnormal. The problem is even more evident if we consider extensive computer networks or clouds. With the present paper we intend to show how heuristics applied to large-scale proxy logs, considering a typical phase of the life cycle of botnets such as the search for C&C Servers through AGDs (Algorithmically Generated Domains), may provide effective and extremely rapid results. The present work will introduce some novel paradigms. The first is that some of the elements of the supply chain of botnets could be completed without any interaction with the Internet, mostly in presence of wide computer networks and/or clouds. The second is that behind a large number of workstations there are usually "human beings" and it is unlikely that their behaviors will cause marked changes in the interaction with the Internet in a fairly narrow time frame. Finally, AGDs can highlight, at the moment, common lexical features, detectable quickly and without using any black/white list.