Biblio
Smart technologies have made it possible to generate and collect huge volumes of data on a daily basis. These data are highly sensitive and diverse, covering personal, organisational, environmental, energy, transport and economic data. Data analytics provides solutions to various issues faced by smart cities, such as crisis response, disaster resilience, emergency management and smart traffic management; this requires distributing sensitive data among various entities within or outside the smart city. Sharing sensitive data creates a need to use smart city data efficiently so that smart applications and utilities reach end users in a trustworthy and safe manner. If this shared sensitive data is leaked, it can cause damage and severe risk to the city's resources. Protecting critical data from unauthorised disclosure is the biggest issue for the success of any project. Data Leakage Detection provides a set of tools and technologies that can efficiently resolve the concerns related to critical smart city data. The paper showcases an approach to detecting leakage, whether caused intentionally or unintentionally. The model represents the allocation of data objects among diverse agents using a Bigraph. The objective is to secure critical data by revealing the guilty agent who caused the data leakage.
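As a rough illustration of the idea in the abstract above, the allocation of data objects to agents can be held as a bipartite mapping and each agent scored by how much of a leaked set it was given. This is only a minimal sketch: the overlap score, the function names `guilt_scores` and `rank_suspects`, and the sample data are assumptions made for illustration, not the model the paper defines.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Bipartite allocation: each agent (left side) maps to the set of
# data objects (right side) that the distributor handed to it.
Allocation = Dict[Hashable, Set[Hashable]]


def guilt_scores(allocation: Allocation, leaked: Set[Hashable]) -> Dict[Hashable, float]:
    """Score each agent by the fraction of leaked objects it held.

    Hypothetical scoring rule (an assumption, not the paper's metric):
    score = |leaked objects given to the agent| / |leaked objects|.
    """
    if not leaked:
        return {agent: 0.0 for agent in allocation}
    return {
        agent: len(objects & leaked) / len(leaked)
        for agent, objects in allocation.items()
    }


def rank_suspects(allocation: Allocation, leaked: Set[Hashable]) -> List[Tuple[Hashable, float]]:
    """Return (agent, score) pairs, highest score first, ties broken by name."""
    scores = guilt_scores(allocation, leaked)
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))
```

For example, with the allocation `{"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}` and the leaked set `{2, 3}`, agent A held every leaked object and ranks first. A score of this kind only narrows down suspects; it cannot by itself prove which agent leaked the data.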
This article presents the experience and practical results of the authors' exploratory research on the scientific problem of making critical information infrastructure cyber-resilient against previously unknown, heterogeneous mass cyber attacks, based on similarity invariants. The results obtained significantly complement the well-known practices and recommendations of ISO 22301 (https://www.iso.org), MITRE PR 15-1334 (www.mitre.org) and NIST SP 800-160 (www.nist.gov) in terms of developing quantitative metrics and measures of cyber resilience. This makes it possible to discover and formally present the ultimate law of the effectiveness of ensuring the cyber resilience of modern Industry 4.0 systems in the face of growing security threats.
More and more security and privacy issues are arising as new technologies, such as big data and cloud computing, are widely applied. To reduce privacy breaches in access control systems in open, cross-domain environments, we propose a game- and risk-based access model for privacy preservation that employs Shannon information and game theory. After defining the notions of Privacy Risk and Privacy Violation Access, a high-level framework of game-theoretic risk-based access control is proposed. Further, we present formulas for estimating the risk value of an access request and of a user, and construct and analyze the game model of the proposed access control using a multi-stage two-player game. A sub-game perfect Nash equilibrium exists at each stage of the risk-based access control, which makes the model suitable for protecting privacy by limiting privacy-violating access requests.
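The abstract above does not give its risk formulas, so the following is only a hedged sketch of how Shannon information could yield a risk value for an access request: the self-information of the requested item, estimated from a user's access history. The function name `access_risk`, the add-one smoothing, and the sample history are all assumptions for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def access_risk(history: Sequence[Hashable], item: Hashable) -> float:
    """Self-information -log2 p(item), in bits, of a requested item.

    p(item) is estimated from the access history with add-one smoothing
    so that an item never seen before still gets a finite risk value.
    Rarely accessed items carry more information, hence a higher score.
    This estimator is an assumption, not the formula from the paper.
    """
    counts = Counter(history)
    distinct_items = set(history) | {item}
    p = (counts[item] + 1) / (len(history) + len(distinct_items))
    return -math.log2(p)
```

With a history of six accesses to `"labs"` and two to `"billing"`, a further request for `"labs"` scores about 0.51 bits, whereas a first-time request for a hypothetical `"psychiatric_notes"` record scores about 3.46 bits. A threshold on this value could then feed the decision of whether to grant the request.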
The finite-state machine (FSM) is widely used as the control unit in most digital designs. Many intellectual property protection and obfuscation techniques leverage the exponential number of possible states and state transitions of a large FSM to secure a physical design, on the grounds that it is challenging to retrieve the FSM design from its downstream design or physical implementation without knowledge of the design. In this paper, we postulate that this assumption may not be sustainable with big data analytics. We demonstrate this by applying a data mining technique to a sufficiently large amount of data collected from a full-scan design to identify its FSM state registers. An impact metric is introduced to discriminate FSM state registers from other registers. A decision tree is constructed from the scan data for regression analysis of the dependency of other registers on a chosen register, from which its impact is deduced. The registers with greater impact are more likely to be FSM state registers. The proposed scheme is applied to several complex designs from OpenCores. The experimental results show the feasibility of our scheme in correctly identifying most FSM state registers, with a high hit rate for a large majority of the designs.
Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.
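Two of the ingredients named in the abstract above can be sketched in a few lines. Geometric Mean is commonly defined in the class-imbalance literature as the square root of the true positive rate times the true negative rate, and Random Undersampling discards randomly chosen majority-class instances. The sketch below assumes binary labels with 1 as the minority class and expresses the sampling ratio as majority instances kept per minority instance; the paper's own ratio convention and its Spark-based implementation are not reproduced here.

```python
import math
import random
from typing import List, Sequence, Tuple


def geometric_mean(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """sqrt(TPR * TNR) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def random_undersample(
    rows: Sequence, labels: Sequence[int],
    majority_per_minority: float = 1.0, seed: int = 0,
) -> Tuple[List, List[int]]:
    """Keep every minority row (label 1) and a random subset of majority rows.

    `majority_per_minority` is an assumed way of expressing the sampling
    distribution ratio: 1.0 yields a balanced 50:50 sample.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(majority), round(len(minority) * majority_per_minority))
    chosen = sorted(minority + rng.sample(majority, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]
```

Because Geometric Mean multiplies the two per-class rates, a classifier that ignores the minority class scores zero however well it does on the majority class. This is one reason such a metric can react more strongly to changes in class distribution than overall accuracy does.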
Continuous and adaptive learning is an effective learning approach when dealing with highly dynamic and changing scenarios, where concept drift often happens. In a continuous, stream, or adaptive learning setup, new measurements arrive continuously and there are no boundaries for learning, meaning that the learning model has to decide constantly how and when to (re)learn from these new data. We address the problem of adaptive and continual learning for network security, building dynamic models to detect network attacks in real network traffic. The combination of fast, large-scale network measurement data with the re-training paradigm of adaptive learning imposes complex challenges in terms of data processing speed, which we tackle by relying on big data platforms for parallel stream processing. We build and benchmark different adaptive learning models on top of a novel big data analytics platform for network traffic monitoring and analysis tasks, and show that substantial computational speed-ups (as high as 6×) can be achieved by parallelizing off-the-shelf stream learning approaches.
This paper introduces a secure, distributed Big Data storage scheme with multiple authorizations. It divides the Big Data into small chunks and distributes them across multiple Cloud locations. Shamir's Secret Sharing and the Secure Hash Algorithm are employed to provide security and authenticity. The proposed methodology consists of two phases: distribution and retrieval. The distribution phase comprises three operations: dividing, encrypting, and distributing. The retrieval phase performs collecting and verifying operations. To increase the security level, the encryption key is divided into secret shares using Shamir's algorithm. Moreover, the Secure Hash Algorithm is used to verify the Big Data after it is retrieved from the Cloud. The experimental results show that the proposed design can reconstruct distributed Big Data at good speed while preserving the security and authenticity properties.
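The two building blocks named in the abstract above can be sketched with the Python standard library alone. The sketch splits an integer key with Shamir's Secret Sharing over a prime field and verifies retrieved chunks with SHA-256. The field prime, the threshold parameters, the choice of SHA-256 among the Secure Hash Algorithm family, and all function names are assumptions; the encryption step and the paper's actual parameters are not shown.

```python
import hashlib
import secrets
from typing import List, Tuple

# Field modulus: the Mersenne prime 2**521 - 1, large enough for a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Tuple[int, int]]:
    """Split `secret` into shares; any `threshold` of them recover it."""
    if not (0 <= secret < PRIME) or not (1 <= threshold <= num_shares):
        raise ValueError("invalid secret or threshold")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def chunk_with_digests(data: bytes, chunk_size: int) -> List[Tuple[bytes, str]]:
    """Divide data into chunks and record a SHA-256 digest for each."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [(c, hashlib.sha256(c).hexdigest()) for c in chunks]


def verify_and_join(stored: List[Tuple[bytes, str]]) -> bytes:
    """Check each retrieved chunk against its digest, then reassemble."""
    for chunk, digest in stored:
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError("chunk failed integrity check")
    return b"".join(chunk for chunk, _ in stored)
```

With a 3-of-5 split, any three shares reconstruct the key, while fewer than three reveal nothing about it. In a deployment along the lines the abstract describes, each share and each encrypted chunk would go to a different Cloud location. Note that `pow(den, -1, PRIME)` requires Python 3.8 or later.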
In an open network environment, offensive network information is embedded in big data, so it must be located and marked accurately in order to detect network attacks. Existing big data analysis methods can locate network attack nodes, but when network attacks cross in series, the performance of attack information tagging is poor. An accurate marking technique for network attack information is proposed based on big data fusion tracking recognition. An adaptive learning model combined with big data is used to mark and sample the network attack information, and a feature analysis model of the attack information chain is designed by extracting association rules. The paper classifies the data types of the network attack nodes, improves network attack detection through a task scheduling method for the attack information nodes, and achieves accurate marking of the network attack information. Simulation results show that the proposed algorithm effectively improves the accuracy of marking offensive information in an open network environment, improves the efficiency of attack detection and the ability to prevent intrusions, and has good application value in the field of network security defense.