Biblio
The cybersecurity community is steadily adopting Machine Learning (ML) to combat ever-evolving threats. One of the biggest drivers for the successful adoption of these models is how well domain experts and users are able to understand and trust their functionality. As these black-box models are employed to make important predictions, stakeholders' demand for transparency and explainability is increasing. Explanations supporting the output of ML models are crucial in cyber security, where experts require far more information from the model than a simple binary output for their analysis. Recent approaches in the literature have focused on three different areas: (a) creating and improving explainability methods that help users better understand the internal workings of ML models and their outputs; (b) attacks on interpreters in white-box settings; and (c) defining the exact properties and metrics of the explanations generated by models. However, they have not covered the security properties and threat models relevant to the cybersecurity domain, nor attacks on explainable models in black-box settings. In this paper, we bridge this gap by proposing a taxonomy for Explainable Artificial Intelligence (XAI) methods that covers the security properties and threat models relevant to the cyber security domain. We design a novel black-box attack for analyzing the consistency, correctness and confidence security properties of gradient-based XAI methods. We validate our proposed approach on three security-relevant datasets and models, and demonstrate that the attack achieves the attacker's goal of misleading both the classifier and the explanation report, or of misleading only the explainability method without affecting the classifier output. Our evaluation shows promising results and can help in designing secure and robust XAI methods.
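To make the gradient-based setting concrete, here is a minimal sketch of a saliency (input-gradient) explanation for a PyTorch classifier, together with a crude FGSM-style perturbation aimed at the explanation rather than the label. The model architecture and feature sizes are invented for illustration; this is not the paper's actual attack.

```python
# Minimal sketch: a gradient (saliency) explanation for a PyTorch
# classifier, plus a perturbation that nudges the explanation while
# the predicted label is monitored. Architecture is hypothetical.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def saliency(x):
    """Gradient of the top-class score w.r.t. the input features."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x).max(dim=1).values.sum()
    score.backward()
    return x.grad.detach()

x = torch.randn(1, 20)
expl = saliency(x)

# Perturb along the sign of the explanation gradient (FGSM-style).
eps = 0.05
x_adv = x + eps * expl.sign()
print(model(x).argmax(1), model(x_adv).argmax(1))  # compare labels
print((saliency(x_adv) - expl).abs().mean())       # explanation drift
```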
Biometric authentication is the preferred authentication scheme in modern computing systems. While it offers enhanced usability, it also requires careful handling of users' sensitive biometric templates. In this paper, a distributed scheme is presented that eliminates the need for a central node holding users' biometric templates. The central node is replaced by an Ethereum/IPFS combination on which the users' templates are stored in homomorphically encrypted form. The scheme enables biometric authentication of users by any third-party service, while the actual biometric templates never leave the user's device in unencrypted form. Secure authentication of users is thus enabled without exposing sensitive biometric data to anyone. Experiments show that the scheme can be applied as an authentication mechanism with minimal time overhead.
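As an illustration of matching over encrypted templates, the following sketch uses the additively homomorphic Paillier scheme via the `phe` Python library; the paper's actual cryptosystem and the Ethereum/IPFS storage layer are not reproduced, and the feature values are toy numbers.

```python
# Sketch: squared-distance matching under additive homomorphic
# encryption (Paillier, `phe` library). The verifier never sees the
# plaintext template; only the key holder can decrypt the score.
from phe import paillier

pub, priv = paillier.generate_paillier_keypair()

template = [12, 7, 3, 9]                           # toy biometric feature vector
enc_template = [pub.encrypt(v) for v in template]  # stored encrypted off-device
enc_t_sq = pub.encrypt(sum(v * v for v in template))  # sum of t_i^2, precomputed

# A verifier with a fresh plaintext probe computes the encrypted score:
# ||t - p||^2 = sum t_i^2 - 2 * sum t_i p_i + sum p_i^2
probe = [11, 7, 4, 9]
enc_cross = sum(e * (-2 * p) for e, p in zip(enc_template, probe))
enc_dist = enc_t_sq + enc_cross + sum(p * p for p in probe)

print(priv.decrypt(enc_dist))  # -> 2 (small distance => likely a match)
```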
In this paper, a novel DNA-based computing method is proposed for the encryption of biometric color (face) and grayscale fingerprint images. In many present-day applications, gray and color images play a major role in authenticating an individual's identity. The values of the aforementioned images are treated as two separate matrices. In the key generation process, two levels of mathematical operations are applied to the fingerprint image to generate the encryption key. To enhance the security of the biometric image, DNA computing is performed on the above matrices, generating a DNA sequence. The DNA sequences are then scrambled to add further complexity. Results of blending the images and of the DNA computation are shown in the experimental section. It is observed that the proposed substitution DNA computing algorithm shows good resistance against statistical and differential attacks.
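The core DNA-encoding idea can be illustrated in a few lines: each pixel byte is mapped to four bases via a 2-bit-to-base rule, and the resulting sequence is scrambled with a key-seeded permutation. The encoding rule, key derivation, and scrambling here are simplified stand-ins, not the paper's exact scheme.

```python
# Toy sketch of DNA encoding of image bytes plus a keyed scramble.
import random

BASES = "ACGT"  # one fixed 2-bit-to-base rule; DNA coding admits 8 valid rules

def encode_pixels(pixels):
    """Map each byte to four DNA bases (2 bits per base, MSB first)."""
    seq = []
    for byte in pixels:
        for shift in (6, 4, 2, 0):
            seq.append(BASES[(byte >> shift) & 0b11])
    return seq

def scramble(seq, key):
    """Permute the sequence with a key-seeded RNG; keep idx to invert."""
    rng = random.Random(key)  # key would be derived from the fingerprint image
    idx = list(range(len(seq)))
    rng.shuffle(idx)
    return [seq[i] for i in idx], idx

pixels = [0x3A, 0xFF, 0x00]
dna = encode_pixels(pixels)   # ['A','T','G','G','T','T','T','T','A','A','A','A']
cipher, perm = scramble(dna, key=42)
print("".join(cipher))
```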
Artificial Intelligence systems have brought significant benefits to users and society, but as the data that feed them keep growing, so does the exposure to privacy and security leaks. Severe threats to the right to privacy have obliged governments to enact specific regulations ensuring privacy preservation in any kind of transaction involving sensitive information. In the case of digital and/or physical documents containing sensitive information, the right to privacy can be preserved by data obfuscation procedures. The recognition of sensitive information for obfuscation is typically entrusted to the experience of human experts, who are overwhelmed by the ever-increasing volume of documents to process. Artificial intelligence could substantially reduce the effort of human officers and speed up these processes. However, until enough knowledge is available in a machine-readable format, effective automatic systems cannot be developed. In this work we propose a methodology for transferring and leveraging general knowledge across domain-specific tasks. We built, from scratch, domain-specific knowledge datasets for training artificial intelligence models that support human experts in privacy-preserving tasks. We applied a mixture of natural language processing techniques to unlabeled domain-specific document corpora to automatically obtain labeled documents in which sensitive information is recognized and tagged. We performed preliminary tests on just over 10,000 documents from the healthcare and justice domains, with human experts supporting the validation. The results, measured in terms of precision, recall and F1-score across the two domains, were promising and encourage further investigation.
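One simple way to bootstrap such labeled data is rule-based weak labeling: pattern rules tag sensitive spans in unlabeled text, yielding training examples for a downstream model. The tag set and patterns below are hypothetical examples, not the paper's rules.

```python
# Illustrative weak labeling: regex rules mark sensitive spans in raw
# text, producing (start, end, tag) annotations for model training.
import re

PATTERNS = {
    "DATE":      r"\b\d{2}/\d{2}/\d{4}\b",
    "FISCAL_ID": r"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b",  # e.g. Italian codice fiscale
    "PHONE":     r"\+\d{9,13}\b",
}

def weak_label(text):
    spans = []
    for tag, pat in PATTERNS.items():
        for m in re.finditer(pat, text):
            spans.append((m.start(), m.end(), tag))
    return sorted(spans)

doc = "The patient, born 12/05/1971, can be reached at +393331234567."
print(weak_label(doc))
# -> [(18, 28, 'DATE'), (48, 61, 'PHONE')]
```

In practice such noisy annotations would be reviewed by domain experts, which matches the human-in-the-loop validation the abstract describes.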
Experimentation aimed at assessing the value of complex visualisation approaches against alternative data-analysis methods is challenging. The interaction between participants' prior knowledge and experience, a diverse range of experimental or real-world data sets, and dynamic interaction with the display system makes it difficult to obtain timely, affordable and statistically relevant experimental results. This paper outlines a hybrid approach for experimentation with complex interactive data-analysis tools, specifically for computer network traffic analysis. The approach involves a structured survey completed after free engagement with the software platform by expert participants. The survey captures objective and subjective data points relating to the experience, with the goal of assessing software performance in a way that is supported by statistically significant experimental results. This work is particularly applicable to the field of network analysis for cyber security, as well as to military cyber operations and intelligence data analysis.
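As a rough sketch of the statistical read-out such a survey could feed, the snippet below compares subjective ratings for the visualisation tool against a baseline with a non-parametric test; the ratings and the choice of test are invented for illustration, not taken from the paper.

```python
# Hedged sketch: comparing 1-5 Likert ratings (tool vs. baseline)
# with a Mann-Whitney U test, suited to small ordinal samples.
from scipy.stats import mannwhitneyu

tool_ratings     = [5, 4, 4, 5, 3, 4, 5, 4]   # expert ratings, visualisation tool
baseline_ratings = [3, 3, 4, 2, 3, 3, 4, 3]   # same task, baseline method

stat, p = mannwhitneyu(tool_ratings, baseline_ratings, alternative="two-sided")
print(f"U={stat}, p={p:.4f}")  # p < 0.05 -> ratings differ significantly
```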
Existing cyber security solutions have largely been developed using knowledge-based models that often cannot detect new cyber-attack families. With the boom of Artificial Intelligence (AI), especially Deep Learning (DL) algorithms, these security solutions have been augmented with AI models to discover, trace, mitigate or respond to incidents involving new security events. Such algorithms demand a large number of heterogeneous data sources to train and validate new security systems. This paper describes the new ToN_IoT datasets, which comprise federated data sources collected from telemetry datasets of IoT services, operating system datasets of Windows and Linux, and datasets of network traffic. The paper introduces the testbed and a description of the ToN_IoT datasets for Windows operating systems. The testbed was implemented in three layers: edge, fog and cloud. The edge layer involves IoT and network devices, the fog layer contains virtual machines and gateways, and the cloud layer involves cloud services, such as data analytics, linked to the other two layers. These layers were dynamically managed using Software-Defined Networking (SDN) and Network Functions Virtualization (NFV), via the VMware NSX and vCloud NFV platforms. The Windows datasets were collected from audit traces of memory, processors, networks, processes and hard disks. The datasets can be used to evaluate various AI-based cyber security solutions, including intrusion detection, threat intelligence and hunting, privacy preservation and digital forensics, as they contain a wide range of recent normal and attack features and observations, as well as authentic ground-truth events. The datasets can be publicly accessed from this link [1].
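A typical evaluation workflow on such data looks like the sketch below: load a Windows CSV, split on the ground-truth label, and fit a baseline intrusion detector. The file name and the `label`/`type` column names are assumptions about the released schema, not guaranteed by this abstract.

```python
# Sketch of the intended use: train/evaluate an AI-based intrusion
# detector on a ToN_IoT Windows CSV (file and column names assumed).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("ToN_IoT_windows10.csv")      # hypothetical file name
X = df.drop(columns=["label", "type"])          # assumed ground-truth columns
y = df["label"]                                 # assumed: 0 = normal, 1 = attack

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```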
We leverage deep learning algorithms on various user behavioral information gathered from end-user devices to classify a subject of interest. Despite the ability of these techniques to counter spoofing threats, they are vulnerable to adversarial learning attacks, where an attacker adds adversarial noise to the input samples to fool the classifier into false acceptance. Recently, a handful of mature techniques such as the Fast Gradient Sign Method (FGSM) have been proposed for white-box attacks, where the attacker has complete knowledge of the machine learning model. In contrast, we mount a black-box attack against a behavioral biometric system based on gait patterns, using FGSM together with a shadow model trained to mimic the target system. The attacker has limited knowledge of the target model and no knowledge of the real user being authenticated, yet induces a false acceptance in authentication. Our goal is to understand the feasibility of a black-box attack and to what extent FGSM on shadow models contributes to its success. Our results show that the performance of FGSM depends strongly on the quality of the shadow model, which in turn is impacted by key factors including the number of queries the target system allows for training the shadow model. Our experiments reveal strong relationships between the shadow model and FGSM performance, as well as the effect of the number of FGSM iterations used to create an attack instance. These insights also shed light on how the shareability of deep learning models can be exploited to launch a successful attack.
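The attack primitive itself is compact. Below is a minimal targeted-FGSM sketch on a shadow model in PyTorch: the adversarial gait sample is crafted locally on the surrogate and would then be replayed against the black-box target. Model architecture, feature dimensions, and step counts are illustrative, not the paper's settings.

```python
# Minimal targeted FGSM on a (shadow) model: push the attacker's own
# gait features toward the "accept" class of the surrogate classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

shadow = nn.Sequential(nn.Linear(60, 32), nn.ReLU(), nn.Linear(32, 2))

def fgsm_step(model, x, target_class, eps):
    """One FGSM step toward `target_class` (descend the targeted loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), torch.tensor([target_class]))
    loss.backward()
    return (x - eps * x.grad.sign()).detach()

x_adv = torch.randn(1, 60)        # attacker's own gait feature vector
for _ in range(5):                # iterated FGSM, as studied in the paper
    x_adv = fgsm_step(shadow, x_adv, target_class=1, eps=0.02)

print(shadow(x_adv).softmax(1))   # x_adv is then replayed against the target
```

The quality dependence the abstract reports follows naturally: the transferred gradient directions are only as faithful as the shadow model, which is bounded by the query budget used to train it.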
Advancements in the AI field unfold tremendous opportunities for society. At the same time, it becomes increasingly important to address the emerging ramifications. The focus is often set on ethical and safe design that forestalls unintentional failures. However, cybersecurity-oriented approaches to AI safety additionally consider instantiations of intentional malice, including unethical malevolent AI design. Recently, an analogous emphasis on malicious actors has been expressed regarding security and safety for virtual reality (VR). While the intersection of AI and VR (AIVR) offers a wide array of beneficial cross-fertilization possibilities, it is prudent to anticipate future malicious AIVR design from the outset, given the potential socio-psycho-technological impacts. As a simplified illustration, this paper analyzes the conceivable use case of generative AI (here, deepfake techniques) utilized for disinformation in immersive journalism. In our view, defenses against such future AIVR safety risks related to falsehood in immersive settings should be conceived transdisciplinarily, from an immersive co-creation stance. As a first step, we motivate a cybersecurity-oriented procedure for generating defenses via immersive design fictions. Overall, there may be no panacea, but updatable transdisciplinary tools, including AIVR itself, could be used to incrementally defend against malicious actors in AIVR.
Supply chain security threats pose new challenges to security risk modeling techniques for complex ICT systems such as the IoT. With established techniques drawn from attack trees and reliability analysis providing needed points of reference, graph-based analysis can provide a framework for considering the role of suppliers in such systems. We present such a framework here while highlighting the need for a component-centered model. Given resource limitations when applying this model to existing systems, we study various classes of uncertainties in model development, including structural uncertainties and uncertainties in the magnitude of estimated event probabilities. Using case studies, we find that structural uncertainties constitute a greater challenge to model utility and as such should receive particular attention. Best practices in the face of these uncertainties are proposed.
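One simple instance of such graph-based analysis is propagating compromise probabilities through a component/supplier dependency graph with a noisy-OR combination, which also makes the sensitivity to structural choices (which edges exist) versus magnitude choices (the probability values) easy to probe. The graph, edge probabilities, and priors below are invented for illustration.

```python
# Hedged sketch: noisy-OR propagation of compromise probability over a
# supplier -> component dependency DAG (structure and numbers invented).
import networkx as nx

g = nx.DiGraph()
# edge (u, v, p): a compromise of u reaches v with probability p
g.add_edge("supplier_A", "firmware", p=0.6)
g.add_edge("supplier_B", "firmware", p=0.3)
g.add_edge("firmware", "iot_device", p=0.8)

base = {"supplier_A": 0.10, "supplier_B": 0.05}  # prior event probabilities

def compromise_prob(node):
    """P(node compromised) via noisy-OR over its parents (DAG assumed)."""
    if node in base:
        return base[node]
    safe = 1.0
    for parent in g.predecessors(node):
        safe *= 1.0 - compromise_prob(parent) * g[parent][node]["p"]
    return 1.0 - safe

print(round(compromise_prob("iot_device"), 4))  # -> ~0.0593
```

Deleting or adding a single edge (a structural change) typically moves this output far more than perturbing any one probability, which is consistent with the paper's finding that structural uncertainties deserve particular attention.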