Biblio
Cloud computing is gaining ground and becoming one of the fastest-growing segments of the IT industry. However, while its numerous advantages mainly support legitimate activity, it is now exploited for a use it was not meant for: malicious users leverage its power and fast provisioning to turn it into an attack platform. Botnets supporting DDoS attacks are among the greatest beneficiaries of this malicious use, since they can be set up on demand and at very large scale without requiring a long dissemination phase or expensive deployment costs. For cloud service providers, preventing their infrastructure from being turned into an Attack as a Service delivery model is very challenging, since it requires detecting threats at the source, in a highly dynamic and heterogeneous environment. In this paper, we present the results of an experimental campaign we conducted in order to understand the operational behavior of a botcloud used for a DDoS attack. The originality of our work lies in the consideration of system metrics that, while never used for state-of-the-art botnet detection, can be leveraged in the context of a cloud to enable source-based detection. Our study considers attacks based on both TCP flooding and UDP storms, and for each of them we provide statistical results, based on a principal component analysis, that highlight the recognizable behavior of a botcloud compared to other legitimate workloads.
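As a rough illustration of the kind of analysis this abstract describes (and not the authors' actual code or dataset), the sketch below applies a principal component analysis to hypothetical per-VM system metrics and compares where botcloud and legitimate-workload samples land in the principal subspace. The metric names, value ranges, and synthetic data are assumptions made for the example only.

```python
# Hypothetical sketch: PCA over per-VM system metrics to contrast a
# botcloud workload with a legitimate workload. Metrics and data are
# illustrative placeholders, not the paper's measurements.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic samples: rows = observation windows, columns = system metrics
# (e.g. CPU %, memory %, packets sent per second, bytes sent per second).
legitimate = rng.normal(loc=[30, 40, 200, 1e5],
                        scale=[5, 5, 50, 2e4], size=(200, 4))
botcloud = rng.normal(loc=[70, 45, 5000, 4e6],
                      scale=[5, 5, 800, 5e5], size=(200, 4))

X = np.vstack([legitimate, botcloud])
X = StandardScaler().fit_transform(X)  # put metrics on comparable scales

pca = PCA(n_components=2)
scores = pca.fit_transform(X)          # project onto the first two components

print("explained variance ratio:", pca.explained_variance_ratio_)
print("mean PC1, legitimate:", scores[:200, 0].mean())
print("mean PC1, botcloud:  ", scores[200:, 0].mean())
```

If the two workloads differ in the way the abstract suggests, their projections separate clearly along the leading components, which is the kind of recognizable behavior the paper reports.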
Malware researchers rely on the observation of malicious code in execution to collect datasets for a wide array of experiments, including generation of detection models, study of longitudinal behavior, and validation of prior research. For such research to reflect prudent science, the work needs to address a number of concerns relating to the correct and representative use of the datasets, presentation of methodology in a fashion sufficiently transparent to enable reproducibility, and due consideration of the need not to harm others. In this paper we study the methodological rigor and prudence in 36 academic publications from 2006-2011 that rely on malware execution. 40% of these papers appeared in the 6 highest-ranked academic security conferences. We find frequent shortcomings, including problematic assumptions regarding the use of execution-driven datasets (25% of the papers), absence of description of security precautions taken during experiments (71% of the articles), and oftentimes insufficient description of the experimental setup. Deficiencies occur in top-tier venues and elsewhere alike, highlighting a need for the community to improve its handling of malware datasets. In the hope of aiding authors, reviewers, and readers, we frame guidelines regarding transparency, realism, correctness, and safety for collecting and using malware datasets.