Bibliography
The field of Big Data has been expanding at an alarming rate since its inception in 2012. The heavy use of social networking sites, the collection of sensor data for analysis and prediction of future events, and the improvement of customer satisfaction on online shopping portals by monitoring past behavior and instantly providing items and offers of interest have all contributed to this rise. If analyzed and processed properly, this huge amount of data can lead to decisions and outcomes of great value and benefit to organizations and individuals. The security of data and the privacy of users are of high importance to individuals, industry, and academia: everyone wants sensitive information kept away from unauthorized access and assets kept safe from security breaches. Privacy and security are equally important for Big Data, yet they are especially complex to ensure here because the amount of data is enormous. One option for handling, processing, and analyzing Big Data effectively and efficiently is to apply Machine Learning techniques. Although Machine Learning techniques themselves are straightforward, applying them to Big Data is a challenging task that requires resolving various issues, since the size of the data is so large. This paper provides a brief introduction to Big Data, the importance of security and privacy in Big Data, and the various challenges that must be overcome to apply Machine Learning techniques to Big Data.
This paper first introduces big data services and the cloud computing model based on different processing forms, and analyzes the authentication technology and security services of existing big data systems to understand their processing characteristics. The operating principles and complexity of big data services and cloud computing are also studied, and their suitable environments, advantages, and disadvantages are summarized. Building on cloud computing, the author puts forward the Model of Big Data Cloud Computing based on Extended Subjective Logic (MBDCC-ESL), which introduces Jøsang's subjective logic to test data credibility and extends it to address the trustworthiness of big data in the cloud computing environment. Simulation results show that the model performs well.
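For reference, Jøsang's subjective logic represents trust in a proposition (here, "this data is credible") as an opinion with belief, disbelief, uncertainty, and a base rate. The sketch below shows only these standard primitives, which an extension such as MBDCC-ESL builds on; it is not the paper's extended model, and equal base rates are assumed when fusing.

```python
# Minimal sketch of Jøsang's subjective-logic opinions and the standard
# consensus (cumulative fusion) operator. Illustrative only; NOT the
# paper's MBDCC-ESL extension.
from dataclasses import dataclass

@dataclass
class Opinion:
    b: float        # belief
    d: float        # disbelief
    u: float        # uncertainty (b + d + u = 1)
    a: float = 0.5  # base rate (prior probability)

    def expectation(self) -> float:
        # Expected probability that the proposition (e.g. "data is credible") holds.
        return self.b + self.a * self.u

def consensus(x: Opinion, y: Opinion) -> Opinion:
    """Fuse two independent opinions; assumes both share the same base rate."""
    k = x.u + y.u - x.u * y.u
    if k == 0:  # both opinions are dogmatic (u = 0); fall back to averaging
        return Opinion((x.b + y.b) / 2, (x.d + y.d) / 2, 0.0, x.a)
    b = (x.b * y.u + y.b * x.u) / k
    d = (x.d * y.u + y.d * x.u) / k
    u = (x.u * y.u) / k
    return Opinion(b, d, u, x.a)

# Example: fuse two observers' opinions about one data source's credibility.
print(consensus(Opinion(0.7, 0.1, 0.2), Opinion(0.5, 0.3, 0.2)).expectation())
```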
Internet of Things devices and data sources are seeing increased use in various application areas. The proliferation of cheaper sensor hardware has allowed for wider-scale data collection deployments. With increased numbers of deployed sensors and the use of heterogeneous sensor types, there is increased scope for collecting erroneous, inaccurate or inconsistent data. This in turn may lead to inaccurate models built from this data. It is important to evaluate this data as it is collected to determine its validity. This paper presents an analysis of data quality as it is represented in Internet of Things (IoT) systems and some of the limitations of this representation. The paper discusses the use of trust as a heuristic to drive data quality measurements. Trust is a well-established metric that has been used to determine the validity of a piece or source of data in crowd-sourced or other unreliable data collection techniques. The analysis extends to detail an appropriate framework for representing data quality effectively within the big data model and why a trust-backed framework is important, especially in heterogeneously sourced IoT data streams.
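To make the idea of a trust-driven quality measurement concrete, the following toy sketch scores a single IoT reading by combining the source's trust value with simple plausibility checks. The quality dimensions and weights are purely illustrative assumptions, not the framework described in the paper.

```python
# Hypothetical trust-backed data-quality score for one IoT sensor reading.
# Dimensions and weights are illustrative assumptions, not the paper's framework.
import numpy as np

def quality_score(reading, history, source_trust, w=(0.4, 0.3, 0.3)):
    """Combine source trust with plausibility checks on the current reading."""
    completeness = 0.0 if reading is None else 1.0
    if reading is None or len(history) < 2:
        consistency = 0.5  # not enough context to judge the reading
    else:
        mu, sigma = np.mean(history), np.std(history) + 1e-9
        # Penalise readings far from the source's recent behaviour.
        consistency = float(np.exp(-abs(reading - mu) / (3 * sigma)))
    return w[0] * source_trust + w[1] * completeness + w[2] * consistency

# Example: a moderately trusted sensor reporting a value close to its history.
print(quality_score(21.7, [21.2, 21.5, 21.4, 21.6], source_trust=0.8))
```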
Industrial control systems (ICSs) are used in various infrastructures and industrial plants to realize their control operations and ensure their safety. Concerns about the cybersecurity of industrial control systems have risen due to the increasing number of cyber-attack incidents on critical infrastructures and the growing cyber activity around ICSs. Moreover, the operation of industrial control systems is bound to vital aspects of life, namely safety, economy, and security. This paper presents a semi-supervised, hybrid attack detection approach for industrial control systems that combines Isolation Forest and Convolutional Neural Network (CNN) models. The proposed framework is developed using only normal operational data and is composed of a feature extraction model implemented as a One-Dimensional Convolutional Neural Network (1D-CNN) and an isolation forest model for detection. The two models are trained independently: the feature extraction model extracts useful features from the continuous-time signals, which are then used along with the binary actuator signals to train the isolation forest-based detection model. The proposed approach is applied to a down-scaled industrial control system, a water treatment plant known as the Secure Water Treatment (SWaT) testbed. The performance of the proposed method is compared with other works using the same testbed and shows an improvement in detection capability.
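A minimal sketch of this two-stage pipeline is shown below: a 1D-CNN compresses windows of continuous sensor signals into features, and an Isolation Forest is fit on those features plus the binary actuator signals using normal data only. Window length, layer sizes, and the reconstruction-style training of the CNN are assumptions, not the authors' exact configuration.

```python
# Sketch: 1D-CNN feature extraction + Isolation Forest detection, trained on
# normal operation only. Shapes, hyperparameters, and the reconstruction
# objective for the CNN are placeholders/assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest
from tensorflow import keras
from tensorflow.keras import layers

WINDOW, N_SENSORS, N_FEATURES = 120, 25, 32  # assumed window/feature sizes

inp = layers.Input(shape=(WINDOW, N_SENSORS))
x = layers.Conv1D(64, 5, activation="relu", padding="same")(inp)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(32, 5, activation="relu", padding="same")(x)
x = layers.GlobalAveragePooling1D()(x)
features = layers.Dense(N_FEATURES, activation="relu", name="features")(x)
out = layers.Dense(WINDOW * N_SENSORS)(features)  # reconstruction head
cnn = keras.Model(inp, out)
cnn.compile(optimizer="adam", loss="mse")

def train_and_detect(X_normal, act_normal, X_test, act_test):
    # Train the feature extractor on normal continuous signals only.
    cnn.fit(X_normal, X_normal.reshape(len(X_normal), -1),
            epochs=10, batch_size=64, verbose=0)
    encoder = keras.Model(inp, features)
    # Concatenate learned features with the binary actuator signals.
    Z_train = np.hstack([encoder.predict(X_normal), act_normal])
    Z_test = np.hstack([encoder.predict(X_test), act_test])
    forest = IsolationForest(contamination="auto", random_state=0).fit(Z_train)
    return forest.predict(Z_test)  # -1 = anomaly (possible attack), 1 = normal
```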
With the advent of Industry 4.0, the Internet of Things (IoT) and Artificial Intelligence (AI), smart entities are now able to read the minds of users by extracting cognitive patterns from electroencephalogram (EEG) signals. Such brain data may include users' experiences, emotions, motivations, and other previously private mental and psychological processes. Accordingly, users' cognitive privacy may be violated, and the right to cognitive privacy should protect individuals against unconsented intrusion by third parties into their brain data as well as against the unauthorized collection of those data. This has caused growing concern among users and industry experts that laws are needed to protect the right to cognitive liberty, the right to mental privacy, the right to mental integrity, and the right to psychological continuity. In this paper, we propose an AI-enabled EEG model, namely Cognitive Privacy, that aims to protect data and to classify users and their tasks from EEG data. The model protects data from disclosure using normalized correlation analysis and classifies subjects (a multi-class classification problem) and their tasks (eye open vs. eye closed, a binary classification problem) using a long short-term memory (LSTM) deep learning approach. The model has been evaluated on the PhysioNet BCI EEG data set, and the results reveal high performance in classifying users and their tasks while preserving data privacy.
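As an illustration of the classification stage only, the sketch below shows a small LSTM that classifies EEG windows as eye open vs. eye closed. The window length, channel count, and layer sizes are assumptions (the PhysioNet BCI motor-imagery recordings use 64-channel EEG), and the privacy-preserving correlation step is not reproduced here.

```python
# Sketch of an LSTM task classifier (eye open vs. eye closed) over EEG windows.
# Window length, channel count, and layer sizes are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

WINDOW, N_CHANNELS = 160, 64  # e.g. one second of 64-channel EEG at 160 Hz

model = keras.Sequential([
    layers.Input(shape=(WINDOW, N_CHANNELS)),
    layers.LSTM(64, return_sequences=True),
    layers.LSTM(32),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # eye open vs. eye closed
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_split=0.2, epochs=20, batch_size=32)
```

For the subject-identification task described in the abstract, the final layer would instead be a softmax over the number of subjects with a categorical cross-entropy loss.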
Avoiding security vulnerabilities is very important for embedded systems. Dynamic Information Flow Tracking (DIFT) is a powerful technique for analyzing software with respect to security policies in order to protect the system against a broad range of security-related exploits. However, existing DIFT approaches either do not exist for Virtual Prototypes (VPs) or fail to model complex hardware/software interactions. In this paper, we present a novel approach that enables early and accurate DIFT of binaries targeting embedded systems with custom peripherals. Leveraging the SystemC framework, our DIFT engine tracks accurate data flow information alongside the program execution to detect violations of security policies at run time. We demonstrate the effectiveness and applicability of our approach through extensive experiments.
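The toy sketch below illustrates only the core DIFT idea: every value carries a taint tag that propagates through operations, and a policy check fires when tainted data reaches a sensitive sink. It is deliberately conceptual; the paper's engine operates on binaries inside SystemC-based Virtual Prototypes, which this Python example does not attempt to model.

```python
# Toy illustration of Dynamic Information Flow Tracking (DIFT): taint tags
# propagate with data and are checked at sensitive sinks. Conceptual only;
# NOT the SystemC/VP engine described in the paper.
class Tainted:
    def __init__(self, value, taint=False):
        self.value, self.taint = value, taint

    def __add__(self, other):
        other = other if isinstance(other, Tainted) else Tainted(other)
        # Taint is sticky: the result is tainted if either operand is.
        return Tainted(self.value + other.value, self.taint or other.taint)

def read_peripheral(raw):
    # Data arriving from an untrusted custom peripheral is tainted at the source.
    return Tainted(raw, taint=True)

def jump_to(target: Tainted):
    # Security policy: control flow must never depend on tainted data.
    if target.taint:
        raise RuntimeError("DIFT policy violation: tainted jump target")
    print(f"jump to {target.value:#x}")

base = Tainted(0x4000)          # trusted code address
offset = read_peripheral(0x20)  # attacker-influenced input
try:
    jump_to(base + offset)      # taint propagates through the addition
except RuntimeError as err:
    print(err)                  # violation detected at run time
```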
Machine learning techniques help to uncover underlying patterns in datasets and to develop defense mechanisms against cyber attacks. The Multilayer Perceptron (MLP) is one such technique used to distinguish attack from benign data. However, it is difficult to construct an effective model when class imbalance in the dataset prevents proper classification of attack samples. In this research, we first perform data wrangling on the UGR'16 dataset; this step prepares a test set from the original dataset so that the neural network model can be trained effectively. We experimented with inputs of varying sizes (10,000, 50,000, and 1 million records) to observe how the distribution of features affects the accuracy of the MLP neural network model. We then use a Generative Adversarial Network (GAN) model that produces samples of different attack labels (e.g., blacklist, anomaly spam, SSH scan) to balance the dataset. These samples are generated based on data from the UGR'16 dataset. Further experiments with the MLP neural network model show that a balanced attack sample dataset, made possible with the GAN, produces more accurate results than an imbalanced one.
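The sketch below shows the general pattern of using a simple GAN to synthesise minority attack-class records before training the MLP classifier. The feature count, layer sizes, and training schedule are placeholders rather than the exact UGR'16 setup used in the paper.

```python
# Sketch: a simple GAN that synthesises minority attack-class flow records to
# balance the training set. Sizes and training schedule are placeholders.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES, LATENT = 20, 16  # assumed flow-feature and latent dimensions

generator = keras.Sequential([
    layers.Input(shape=(LATENT,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(N_FEATURES, activation="tanh"),  # features scaled to [-1, 1]
])
discriminator = keras.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")
discriminator.trainable = False                  # frozen inside the combined model
gan = keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

def train_gan(X_minority, epochs=2000, batch=64):
    for _ in range(epochs):
        real = X_minority[np.random.randint(0, len(X_minority), batch)]
        fake = generator.predict(np.random.normal(size=(batch, LATENT)), verbose=0)
        # Train the discriminator on real vs. generated attack records.
        discriminator.train_on_batch(
            np.vstack([real, fake]),
            np.concatenate([np.ones(batch), np.zeros(batch)]))
        # Train the generator to fool the (frozen) discriminator.
        gan.train_on_batch(np.random.normal(size=(batch, LATENT)), np.ones(batch))

# After training, generator.predict(noise) yields synthetic records for the
# minority attack labels, which are added to the data before fitting the MLP.
```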
Modern industrial control systems (ICS) have increasingly become victims of cyber attacks in recent years. These attacks often cannot be detected by classical information security methods, and their consequences can be catastrophic. Since cyber attacks lead to anomalies in the ICS and in the technological equipment it controls, the task of intrusion detection for ICS can be reformulated as the task of industrial process anomaly detection. This paper considers the applicability of generative adversarial networks (GANs) to industrial process anomaly detection. Existing approaches to using GANs in information security (such as anomaly detection in network traffic) are described, and the BiGAN architecture is proposed for detecting anomalies in industrial processes. The proposed approach has been tested on the Secure Water Treatment (SWaT) dataset. The obtained results indicate that the examined method is promising in practice.
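For orientation, BiGAN-based anomaly detection typically scores a sample by how well the jointly trained encoder and generator can reconstruct it, combined with a discriminator-based term. The sketch below assumes an already trained encoder E, generator G, and discriminator D; the weighting and thresholding are assumptions, not the paper's exact scoring rule.

```python
# Sketch of a BiGAN-style anomaly score for process measurements, assuming an
# encoder E, generator G, and discriminator D already trained on normal SWaT
# data. The weighting alpha and the threshold are assumptions.
import numpy as np

def anomaly_score(x, E, G, D, alpha=0.9):
    """Higher score = more anomalous process state."""
    z = E(x)                             # encode the observed sample
    x_hat = G(z)                         # reconstruct it through the generator
    recon = np.linalg.norm(x - x_hat)    # reconstruction error
    disc = -np.log(D(x, z) + 1e-9)       # discriminator's "realness" loss
    return alpha * recon + (1 - alpha) * disc

def detect(window, E, G, D, threshold):
    # Flag the window as an anomaly (possible attack) if the score is high.
    return anomaly_score(window, E, G, D) > threshold
```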
With the rapid progress of informatization in the power business, data has become a basic strategic resource of the power industry and an innovative element in power production. The security protection of power business data is particularly important in this informatization process. To implement data security protection, transparent encryption is one of the fifteen key technical standards in the Construction Guideline of the Standard Network Data Security System. However, storing data in encrypted form inevitably affects security auditing to a certain extent. To address this problem, this paper proposes a scheme to audit the sensitivity of power business data under the protection of encryption, achieving an efficient sensitivity audit of ciphertext data without revealing the decryption key or the data itself. Through a security demonstration, the paper shows that the solution is secure under known-plaintext attacks.
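To illustrate only the general goal of a ciphertext sensitivity audit (checking for sensitive content without decrypting), the toy sketch below uses keyed-hash tokens matched against an auditor's list of sensitive terms. This is emphatically not the paper's scheme; it is a generic, simplified illustration.

```python
# Toy illustration of auditing encrypted records for sensitive terms without
# the decryption key, via keyed-hash (HMAC) tokens. NOT the paper's scheme.
import hmac
import hashlib

def token(token_key: bytes, word: str) -> bytes:
    return hmac.new(token_key, word.lower().encode(), hashlib.sha256).digest()

# The data owner tokenises each record's keywords with a token key (separate
# from the encryption key) and stores the tokens alongside the ciphertext.
def tokenise_record(token_key: bytes, keywords: list[str]) -> set[bytes]:
    return {token(token_key, w) for w in keywords}

# The auditor holds only tokens of sensitive terms, never plaintext or keys.
def audit(record_tokens: set[bytes], sensitive_tokens: set[bytes]) -> bool:
    return not record_tokens.isdisjoint(sensitive_tokens)

token_key = b"\x01" * 32  # placeholder key material
sensitive = {token(token_key, w) for w in ["meter id", "customer address"]}
record = tokenise_record(token_key, ["load curve", "customer address"])
print(audit(record, sensitive))  # True -> record contains a sensitive field
```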
This paper proposes a deep learning-based white-hat worm launcher for the Botnet Defense System (BDS). The BDS uses white-hat botnets to defend an IoT system against malicious botnets, and the white-hat worm launcher literally launches white-hat worms to create white-hat botnets according to the strategy decided by the BDS. The proposed launcher uses deep learning to learn where white-hat worms should be placed to successfully drive out malicious botnets. Given a system situation in which malicious botnets have invaded, it predicts a worm placement from the learned model and launches the worms accordingly. We confirmed the effectiveness of the proposed launcher through simulation-based evaluation.
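The sketch below is a hypothetical rendering of the underlying idea: encode the observed infection state of the IoT devices and let a small network score each device as a candidate launch point for white-hat worms. The state encoding, network shape, and selection rule are all assumptions, not the paper's design.

```python
# Hypothetical sketch of a learned worm launcher: map the observed infection
# state of an IoT system to per-device placement scores. Encoding and sizes
# are assumptions, not the paper's design.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_DEVICES = 100  # placeholder system size

# Input: one value per device (e.g. 0 = clean, 1 = malicious bot observed).
# Output: a placement score per device for launching a white-hat worm.
model = keras.Sequential([
    layers.Input(shape=(N_DEVICES,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(N_DEVICES, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

def choose_launch_nodes(state: np.ndarray, n_worms: int) -> np.ndarray:
    """Pick the n_worms devices the model scores highest for this situation."""
    scores = model.predict(state[None, :], verbose=0)[0]
    return np.argsort(scores)[-n_worms:]
```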
Botnets are one of the major threats on the Internet. They are used for malicious activities that compromise the basic network security goals, namely confidentiality, integrity, and availability. Deep learning-based approaches have recently been proposed for reliable botnet detection and defense. In this paper, four different deep learning models, namely a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), a hybrid CNN-LSTM, and a Multilayer Perceptron (MLP), are applied to botnet detection, and simulation studies are carried out using the CTU-13 botnet traffic dataset. Several performance metrics, such as accuracy, sensitivity, specificity, precision, and F1 score, are used to evaluate the performance of each model in classifying both known and unknown (zero-day) botnet traffic patterns. The results show that the deep learning models can accurately and reliably detect both known and unknown botnet traffic, with some models outperforming the others.
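As a reference point for one of the four models, the sketch below outlines a hybrid CNN-LSTM classifier over short sequences of flow-feature vectors (CTU-13-style input). The sequence length, feature count, and layer sizes are placeholders rather than the authors' exact configuration.

```python
# Sketch of a hybrid CNN-LSTM botnet detector over sequences of NetFlow
# feature vectors. Sequence length, feature count, and layer sizes are
# placeholders, not the authors' exact configuration.
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN, N_FEATURES = 20, 14

model = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    layers.Conv1D(64, 3, activation="relu", padding="same"),  # local flow patterns
    layers.MaxPooling1D(2),
    layers.LSTM(64),                                          # temporal behaviour
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                    # botnet vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", keras.metrics.Precision(), keras.metrics.Recall()])
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20)
```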