Biblio
Machine Learning as a Service (MLaaS) is becoming a popular practice where Service Consumers, e.g., end-users, send their data to a ML Service and receive the prediction outputs. However, the emerging usage of MLaaS has raised severe privacy concerns about users' proprietary data. PrivacyPreserving Machine Learning (PPML) techniques aim to incorporate cryptographic primitives such as Homomorphic Encryption (HE) and Multi-Party Computation (MPC) into ML services to address privacy concerns from a technology standpoint. Existing PPML solutions have not been widely adopted in practice due to their assumed high overhead and integration difficulty within various ML front-end frameworks as well as hardware backends. In this work, we propose PlaidML-HE, the first end-toend HE compiler for PPML inference. Leveraging the capability of Domain-Specific Languages, PlaidML-HE enables automated generation of HE kernels across diverse types of devices. We evaluate the performance of PlaidML-HE on different ML kernels and demonstrate that PlaidML-HE greatly reduces the overhead of the HE primitive compared to the existing implementations.
In this paper, we explore the use of machine learning technique for wormhole attack detection in ad hoc network. This work has categorized into three major tasks. One of our tasks is a simulation of wormhole attack in an ad hoc network environment with multiple wormhole tunnels. A next task is the characterization of packet attributes that lead to feature selection. Consequently, we perform data generation and data collection operation that provide large volume dataset. The final task is applied to machine learning technique for wormhole attack detection. Prior to this, a wormhole attack has detected using traditional approaches. In those, a Multirate-DelPHI is shown best results as detection rate is 90%, and the false alarm rate is 20%. We conduct experiments and illustrate that our method performs better resulting in all statistical parameters such as detection rate is 93.12% and false alarm rate is 5.3%. Furthermore, we have also shown results on various statistical parameters such as Precision, F-measure, MCC, and Accuracy.
Intellectual property (IP) and integrated circuit (IC) piracy are of increasing concern to IP/IC providers because of the globalization of IC design flow and supply chains. Such globalization is driven by the cost associated with the design, fabrication, and testing of integrated circuits and allows avenues for piracy. To protect the designs against IC piracy, we propose a fingerprinting scheme based on side-channel power analysis and machine learning methods. The proposed method distinguishes the ICs which realize a modified netlist, yet same functionality. Our method doesn't imply any hardware overhead. We specifically focus on the ability to detect minimal design variations, as quantified by the number of logic gates changed. Accuracy of the proposed scheme is greater than 96 percent, and typically 99 percent in detecting one or more gate-level netlist changes. Additionally, the effect of temperature has been investigated as part of this work. Results depict 95.4 percent accuracy in detecting the exact number of gate changes when data and classifier use the same temperature, while training with different temperatures results in 33.6 percent accuracy. This shows the effectiveness of building temperature-dependent classifiers from simulations at known operating temperatures.
Machine learning, specifically deep learning is becoming a key technology component in application domains such as identity management, finance, automotive, and healthcare, to name a few. Proprietary machine learning models - Machine Learning IP - are developed and deployed at the network edge, end devices and in the cloud, to maximize user experience. With the proliferation of applications embedding Machine Learning IPs, machine learning models and hyper-parameters become attractive to attackers, and require protection. Major players in the semiconductor industry provide mechanisms on device to protect the IP at rest and during execution from being copied, altered, reverse engineered, and abused by attackers. In this work we explore system security architecture mechanisms and their applications to Machine Learning IP protection.
Recently in the vast advancement of Artificial Intelligence, Machine learning and Deep Neural Network (DNN) driven us to the robust applications. Such as Image processing, speech recognition, and natural language processing, DNN Algorithms has succeeded in many drawbacks; especially the trained DNN models have made easy to the researchers to produces state-of-art results. However, sharing these trained models are always a challenging task, i.e. security, and protection. We performed extensive experiments to present some analysis of watermark in DNN. We proposed a DNN model for Digital watermarking which investigate the intellectual property of Deep Neural Network, Embedding watermarks, and owner verification. This model can generate the watermarks to deal with possible attacks (fine tuning and train to embed). This approach is tested on the standard dataset. Hence this model is robust to above counter-watermark attacks. Our model accurately and instantly verifies the ownership of all the remotely expanded deep learning models without affecting the model accuracy for standard information data.
This paper presents a computational model for managing an Embodied Conversational Agent's first impressions of warmth and competence towards the user. These impressions are important to manage because they can impact users' perception of the agent and their willingness to continue the interaction with the agent. The model aims at detecting user's impression of the agent and producing appropriate agent's verbal and nonverbal behaviours in order to maintain a positive impression of warmth and competence. User's impressions are recognized using a machine learning approach with facial expressions (action units) which are important indicators of users' affective states and intentions. The agent adapts in real-time its verbal and nonverbal behaviour, with a reinforcement learning algorithm that takes user's impressions as reward to select the most appropriate combination of verbal and non-verbal behaviour to perform. A user study to test the model in a contextualized interaction with users is also presented. Our hypotheses are that users' ratings differs when the agents adapts its behaviour according to our reinforcement learning algorithm, compared to when the agent does not adapt its behaviour to user's reactions (i.e., when it randomly selects its behaviours). The study shows a general tendency for the agent to perform better when using our model than in the random condition. Significant results shows that user's ratings about agent's warmth are influenced by their a-priori about virtual characters, as well as that users' judged the agent as more competent when it adapted its behaviour compared to random condition.
Network covert channels are used in various cyberattacks, including disclosure of sensitive information and enabling stealth tunnels for botnet commands. With time and technology, covert channels are becoming more prevalent, complex, and difficult to detect. The current methods for detection are protocol and pattern specific. This requires the investment of significant time and resources into application of various techniques to catch the different types of covert channels. This paper reviews several patterns of network storage covert channels, describes generation of network traffic dataset with covert channels, and proposes a generic, protocol-independent approach for the detection of network storage covert channels using a supervised machine learning technique. The implementation of the proposed generic detection model can lead to a reduction of necessary techniques to prevent covert channel communication in network traffic. The datasets we have generated for experimentation represent storage covert channels in the IP, TCP, and DNS protocols and are available upon request for future research in this area.
In this study, delays between data packets were read by using different window sizes to detect data transmitted from covert timing channel in computer networks, and feature vectors were extracted from them and detection of hidden data by some classification algorithms was achieved with high performance rate.
In recent trends, privacy preservation is the most predominant factor, on big data analytics and cloud computing. Every organization collects personal data from the users actively or passively. Publishing this data for research and other analytics without removing Personally Identifiable Information (PII) will lead to the privacy breach. Existing anonymization techniques are failing to maintain the balance between data privacy and data utility. In order to provide a trade-off between the privacy of the users and data utility, a Mondrian based k-anonymity approach is proposed. To protect the privacy of high-dimensional data Deep Neural Network (DNN) based framework is proposed. The experimental result shows that the proposed approach mitigates the information loss of the data without compromising privacy.
In the last few years, cryptocurrency mining has become more and more important on the Internet activity and nowadays is even having a noticeable impact on the global economy. This has motivated the emergence of a new malicious activity called cryptojacking, which consists of compromising other machines connected to the Internet and leverage their resources to mine cryptocurrencies. In this context, it is of particular interest for network administrators to detect possible cryptocurrency miners using network resources without permission. Currently, it is possible to detect them using IP address lists from known mining pools, processing information from DNS traffic, or directly performing Deep Packet Inspection (DPI) over all the traffic. However, all these methods are still ineffective to detect miners using unknown mining servers or result too expensive to be deployed in real-world networks with large traffic volume. In this paper, we present a machine learning-based method able to detect cryptocurrency miners using NetFlow/IPFIX network measurements. Our method does not require to inspect the packets' payload; as a result, it achieves cost-efficient miner detection with similar accuracy than DPI-based techniques.
The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information that -given the appropriate tools and methods-may be identified, crawled and subsequently leveraged to actionable cyber-threat intelligence. In this work, we focus on the information gathering task, and present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web. The proposed architecture adopts a two-phase approach to data harvesting. Initially a machine learning-based crawler is used to direct the harvesting towards websites of interest, while in the second phase state-of-the-art statistical language modelling techniques are used to represent the harvested information in a latent low-dimensional feature space and rank it based on its potential relevance to the task at hand. The proposed architecture is realised using exclusively open-source tools, and a preliminary evaluation with crowdsourced results demonstrates its effectiveness.
Machine learning is a major area in artificial intelligence, which enables computer to learn itself explicitly without programming. As machine learning is widely used in making decision automatically, attackers have strong intention to manipulate the prediction generated my machine learning model. In this paper we study about the different types of attacks and its countermeasures on machine learning model. By research we found that there are many security threats in various algorithms such as K-nearest-neighbors (KNN) classifier, random forest, AdaBoost, support vector machine (SVM), decision tree, we revisit existing security threads and check what are the possible countermeasures during the training and prediction phase of machine learning model. In machine learning model there are 2 types of attacks that is causative attack which occurs during the training phase and exploratory attack which occurs during the prediction phase, we will also discuss about the countermeasures on machine learning model, the countermeasures are data sanitization, algorithm robustness enhancement, and privacy preserving techniques.
In order to protect individuals' privacy, data have to be "well-sanitized" before sharing them, i.e. one has to remove any personal information before sharing data. However, it is not always clear when data shall be deemed well-sanitized. In this paper, we argue that the evaluation of sanitized data should be based on whether the data allows the inference of sensitive information that is specific to an individual, instead of being centered around the concept of re-identification. We propose a framework to evaluate the effectiveness of different sanitization techniques on a given dataset by measuring how much an individual's record from the sanitized dataset influences the inference of his/her own sensitive attribute. Our intent is not to accurately predict any sensitive attribute but rather to measure the impact of a single record on the inference of sensitive information. We demonstrate our approach by sanitizing two real datasets in different privacy models and evaluate/compare each sanitized dataset in our framework.
With the growing number of streaming services, internet providers are increasingly needing to be able to identify the types of data and content providers that are being used on their networks. Traditional methods, such as IP and port scanning, are not always available for clients using VPNs or with providers using varying IP addresses. As such, in this paper we explore a potential method using neural networks and Markov Decision Process in order to augment deep packet inspection techniques in identifying the source and class of video streaming services.
Accurate network traffic identification is an important basis for network traffic monitoring and data analysis, and is the key to improve the quality of user service. In this paper, through the analysis of two network traffic identification methods based on machine learning and deep packet inspection, a network traffic identification method based on machine learning and deep packet inspection is proposed. This method uses deep packet inspection technology to identify most network traffic, reduces the workload that needs to be identified by machine learning method, and deep packet inspection can identify specific application traffic, and improves the accuracy of identification. Machine learning method is used to assist in identifying network traffic with encryption and unknown features, which makes up for the disadvantage of deep packet inspection that can not identify new applications and encrypted traffic. Experiments show that this method can improve the identification rate of network traffic.