Fast Mining of Large-Scale Logs for Botnet Detection: A Field Study

Title: Fast Mining of Large-Scale Logs for Botnet Detection: A Field Study
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Bottazzi, G., Italiano, G. F.
Conference Name: 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing
Date Published: October
Keywords: abnormal data streams, AGD, algorithmically generated domains, Botnet, botnet detection methods, botnet life cycle, C and C Servers, cloud computing, computer network behavioral analysis, computer network security, data mining, digital signatures, feature extraction, feature pattern, heuristics, information quality, information quantity, Internet, invasive software, large-scale proxy log mining, logs, Malware, malwares, mining, network-based attack, normal data streams, proxy, pubcrawl170112, Servers, Workstations
Abstract

Botnets are considered one of the most dangerous forms of network-based attack today because they involve the simultaneous use of very large, coordinated groups of hosts. Behavioral analysis of computer networks underlies modern botnet detection methods, whose aim is to intercept traffic generated by malware for which signatures do not yet exist. Defining the pattern of features on which behavioral analysis is based puts the emphasis on the quantity and quality of the information to be captured and used to mark data streams as normal or abnormal. The problem is even more evident in extensive computer networks or clouds. In this paper we show how heuristics applied to large-scale proxy logs, targeting a typical phase of the botnet life cycle such as the search for C&C servers through AGDs (Algorithmically Generated Domains), can provide effective and extremely rapid results. The work introduces some novel paradigms. The first is that some elements of the botnet supply chain could be completed without any interaction with the Internet, mostly in the presence of wide computer networks and/or clouds. The second is that behind a large number of workstations there are usually human beings, and it is unlikely that their behavior will cause marked changes in the interaction with the Internet within a fairly narrow time frame. Finally, AGDs currently exhibit common lexical features that can be detected quickly and without using any blacklist or whitelist.
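
The abstract does not say which lexical features the authors use, so the following is only a minimal illustrative sketch of the general idea of scoring a domain name by its lexical properties without consulting any blacklist or whitelist. The chosen features (label length, character entropy, vowel ratio, digit ratio), the function names, and all thresholds are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Compute simple lexical features for a domain name.

    Assumption: the longest dot-separated label is treated as the part
    worth inspecting. A real system would parse the registered domain
    using a public-suffix list instead.
    """
    labels = [part for part in domain.lower().rstrip(".").split(".") if part]
    label = max(labels, key=len) if labels else ""
    letters = [c for c in label if c.isalpha()]
    vowel_ratio = sum(c in VOWELS for c in letters) / len(letters) if letters else 0.0
    digit_ratio = sum(c.isdigit() for c in label) / len(label) if label else 0.0
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "vowel_ratio": vowel_ratio,
        "digit_ratio": digit_ratio,
    }


def looks_generated(domain: str,
                    min_length: int = 12,
                    min_entropy: float = 3.5,
                    max_vowel_ratio: float = 0.3) -> bool:
    """Heuristic flag for a possibly algorithmically generated domain.

    The thresholds are illustrative guesses, not tuned or published values.
    """
    f = lexical_features(domain)
    if f["length"] < min_length:
        return False
    return f["entropy"] >= min_entropy or f["vowel_ratio"] <= max_vowel_ratio
```

In use, a function such as `looks_generated` would be applied to each hostname extracted from a proxy log line, and the flagged names grouped by the requesting workstation. A heuristic this simple will produce false positives (for example on long CDN hostnames), so it is best read as a first-pass filter rather than a detector.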

URL: https://ieeexplore.ieee.org/document/7363341
DOI: 10.1109/CIT/IUCC/DASC/PICOM.2015.295
Citation Key: bottazzi_fast_2015