Biblio
Internet Service Providers (ISPs) have an economic and operational interest in detecting malicious network activity relating to their subscribers. However, it is unclear what kind of traffic data an ISP has available for cyber-security research, and under which legal conditions it can be used. This paper gives an overview of the challenges posed by legislation and of the data sources available to a European ISP. DNS and NetFlow logs are identified as relevant data sources and the state of the art in anonymization and fingerprinting techniques is discussed. Based on legislation, data availability and privacy considerations, a practically applicable anonymization policy is presented.
Intrusion Detection Systems (IDSs) are an important defense tool against the sophisticated and ever-growing network attacks. These systems need to be evaluated against high quality datasets for correctly assessing their usefulness and comparing their performance. We present an Intrusion Detection Dataset Toolkit (ID2T) for the creation of labeled datasets containing user defined synthetic attacks. The architecture of the toolkit is provided for examination and the example of an injected attack, in real network traffic, is visualized and analyzed. We further discuss the ability of the toolkit of creating realistic synthetic attacks of high quality and low bias.