Bibliography
The risk of cyber-attacks exploiting vulnerable organisations has increased significantly over the past several years. These attacks may combine to exploit a vulnerability breach within a system's protection strategy, with the potential for loss, damage or destruction of assets. Consequently, every vulnerability has an accompanying risk, defined as the "intersection of assets, threats, and vulnerabilities" [1]. This research project aims to experimentally compare the similarity-based ranking of cyber-security information utilising a recommendation environment. The Memory-Based Collaborative Filtering technique was employed, specifically the User-Based and Item-Based approaches. These systems utilised information from the National Vulnerability Database for the identification and similarity-based ranking of cyber-security vulnerability information relating to hardware and software applications. Experiments were performed using the Item-Based technique to identify the optimum system parameters, evaluated through the AUC evaluation metric. Once identified, the Item-Based technique was compared with the User-Based technique, which utilised the parameters identified in the previous experiments. During these experiments, the Pearson's Correlation Coefficient and the Cosine similarity measures were used. From these experiments, it was identified that the Item-Based technique employing the Cosine similarity measure achieved an AUC of 0.80225.
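For reference, the two similarity measures named above have standard definitions. In item-based form, with r_{u,i} denoting the score of item i for user u over the set U of co-scoring users (generic notation, not taken from the study):

    \mathrm{sim}_{\cos}(i,j) = \frac{\sum_{u \in U} r_{u,i}\, r_{u,j}}{\sqrt{\sum_{u \in U} r_{u,i}^2}\, \sqrt{\sum_{u \in U} r_{u,j}^2}}

    \mathrm{sim}_{P}(i,j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2}\, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}

Pearson's correlation is cosine similarity applied to mean-centred scores, which is why the two measures often behave similarly on dense data.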
Collaborative filtering (CF) recommender systems are widely used for their strong performance in personalized recommendation, but they are vulnerable to shilling attacks, in which attack profiles are injected into the system by attackers to affect recommendations. Designing robust recommender systems and proposing attack detection methods are the main research directions for handling shilling attacks; among detection methods, unsupervised PCA is particularly effective in experiments, but its performance suffers when there is no information about the number of shilling attack profiles. In this paper, a new unsupervised detection method combining PCA and data complexity is proposed to detect shilling attacks. In the proposed method, PCA is used to select suspected attack profiles, and data complexity is used to pick out the authentic profiles from the suspected attack profiles. Compared with traditional PCA, the proposed method performs well and does not require the number of shilling attack profiles to be determined in advance.
Code churn has been successfully used to identify defect-inducing changes in software development. Our recent analysis of cross-release code churn showed that several design metrics exhibit moderate correlation with the number of defects in complex systems. The goal of this paper is to explore whether cross-release code churn can be used to identify critical design changes and contribute to the prediction of defects for software in evolution. In our case study, we used two types of data from consecutive releases of open-source projects, with and without cross-release code churn, to build standard prediction models. The prediction models were trained on earlier releases and tested on the following ones, evaluating performance in terms of AUC, GM and the effort-aware measure Pop. The comparison of their performance was used to answer our research question. The obtained results show that the prediction model performs better when cross-release code churn is included. A practical implication of this research is that cross-release code churn can aid the safe planning of the next release in software development.
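As a reminder of the evaluation measures (standard definitions, not specific to this study), GM is the geometric mean of the recalls on the defective and non-defective classes:

    GM = \sqrt{TPR \times TNR}, \qquad TPR = \frac{TP}{TP + FN}, \qquad TNR = \frac{TN}{TN + FP}

AUC, by contrast, is threshold-independent, which is why the two are commonly reported together in defect prediction.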
With so much of our daily lives relying on digital devices like personal computers and cell phones, there is a growing demand for code that not only functions properly, but is secure and keeps user data safe. However, ensuring this is not an easy task, and many developers do not have the required skills or resources to ensure their code is secure. Many code analysis tools have been written to find vulnerabilities in newly developed code, but this technology tends to produce many false positives and is still not able to identify all of the problems. Other methods of finding software vulnerabilities automatically are required. This proof-of-concept study applied natural language processing to Java byte code to locate SQL injection vulnerabilities in a Java program. Preliminary findings show that, due to the high number of terms in the dataset, single decision trees will not produce a suitable model for locating SQL injection vulnerabilities, while random forest structures proved more promising. Still, further work is needed to determine the best classification tool.
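As a hypothetical illustration (our own, not code from the study) of the pattern such a classifier must learn to separate, compare a concatenated query, which is injectable, with a parameterised one; at the byte-code level these compile to distinguishable string-building and call sequences:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class LoginDao {
        // Vulnerable: user input is concatenated into the SQL text, so input
        // such as  ' OR '1'='1  changes the meaning of the query.
        ResultSet findUserUnsafe(Connection conn, String name) throws SQLException {
            Statement st = conn.createStatement();
            return st.executeQuery("SELECT * FROM users WHERE name = '" + name + "'");
        }

        // Safe: the driver binds the parameter, keeping data out of the query text.
        ResultSet findUserSafe(Connection conn, String name) throws SQLException {
            PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE name = ?");
            ps.setString(1, name);
            return ps.executeQuery();
        }
    }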
This paper presents Checked C, an extension to C designed to support spatial safety, implemented in Clang and LLVM. Checked C's design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code. Like past approaches to a safer C, Checked C employs a form of checked pointer whose accesses can be statically or dynamically verified. Performance evaluation on a set of standard benchmark programs shows overheads to be relatively low. More interestingly, Checked C introduces the notions of a checked region and bounds-safe interfaces.
Approaches for the automatic analysis of security policies on the source code level cannot trivially be applied to binaries. This is due to the lacking high-level semantics of low-level object code, and the fundamental problem that control-flow recovery from binaries is difficult. We present a novel approach to recover the control flow of binaries that is both safe and efficient. The key idea of our approach is to use the information contained in security mechanisms to approximate the targets of computed branches. To achieve this, we first define a restricted control transition intermediate language (RCTIL), which restricts the number of possible targets for each branch to a finite number of given targets. Based on this intermediate language, we demonstrate how a safe model of the control flow can be recovered without data-flow analyses. Our evaluation shows that this approach makes our solution more efficient than existing solutions.
Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming languages. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we present an automated refactoring approach that assists developers in writing efficient stream code in a semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions for automatically determining when it is safe and possibly advantageous to convert sequential streams to parallel and unorder or de-parallelize already parallel streams. The approach was implemented as a plug-in to the Eclipse IDE, uses the WALA and SAFE analysis frameworks, and was evaluated on 11 Java projects consisting of ~642K lines of code. We found that 57 of 157 candidate streams (36.31%) were refactorable, and an average speedup of 3.49 on performance tests was observed. The results indicate that the approach is useful in optimizing stream code to its full potential.
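The core transformation the refactoring performs can be sketched as follows (a generic illustration of ours, not the tool's actual output); the conversion is only safe here because the lambda is side-effect-free and count() does not depend on encounter order:

    import java.util.List;

    public class StreamRefactoring {
        // Before: sequential pipeline, processed in encounter order.
        static long longWords(List<String> words) {
            return words.stream().filter(w -> w.length() > 10).count();
        }

        // After: parallel pipeline; for order-sensitive terminal operations,
        // an additional unordered() call can unlock further parallel speedup.
        static long longWordsParallel(List<String> words) {
            return words.parallelStream().filter(w -> w.length() > 10).count();
        }
    }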
This paper presents CapeVM, a sensor node virtual machine aimed at delivering both high performance and a sandboxed execution environment that ensures malicious code cannot corrupt the VM's internal state or perform actions not allowed by the VM. CapeVM uses Ahead-of-Time compilation and introduces a range of optimisations to eliminate most of the overhead present in previous work on sensor node AOT compilers. A sandboxed execution environment is guaranteed by a set of checks. The structured nature of the VM's instruction set allows the VM to perform most checks at load time, reducing the need for expensive run-time checks compared to native code approaches. While some overhead from using a VM and adding sandbox checks cannot be avoided, CapeVM's optimisations reduce this overhead dramatically. We evaluate CapeVM using a set of IoT applications and show that this results in performance just 2.1x slower than unsandboxed native code. Thus, CapeVM combines the desirable properties of existing work on both sandboxed execution and virtual machines for sensor nodes, with significantly improved performance.
The Java 8 Stream API sets forth a promising new programming model that incorporates functional-like, MapReduce-style features into a mainstream programming language. However, using streams correctly and efficiently may involve subtle considerations. In this poster, we present our ongoing work and preliminary results towards an automated refactoring approach that assists developers in writing optimal stream code. The approach, based on ordering and typestate analysis, determines when it is safe and advantageous to convert streams to parallel and optimize parallel streams.
In stream-based programming, data sources are abstracted as a stream of values that can be manipulated via callback functions. Stream-based programming is exploding in popularity, as it provides a powerful and expressive paradigm for handling asynchronous data sources in interactive software. However, high-level stream abstractions can also make it difficult for developers to reason about control- and data-flow relationships in their programs. This is particularly impactful when asynchronous stream-based code interacts with thread-limited features such as UI frameworks that restrict UI access to a single thread, since the threading behavior of streaming constructs is often non-intuitive and insufficiently documented. In this paper, we present a type-based approach that can statically prove the thread-safety of UI accesses in stream-based software. Our key insight is that the fluent APIs of stream-processing frameworks enable the tracking of threads via type-refinement, making it possible to reason automatically about what thread a piece of code runs on – a difficult problem in general. We implement the system as an annotation-based Java typechecker for Android programs built upon the popular ReactiveX framework and evaluate its efficacy by annotating and analyzing 8 open-source apps, where we find 33 instances of unsafe UI access while incurring an annotation burden of only one annotation per 186 source lines of code. We also report on our experience applying the typechecker to two much larger apps from the Uber Technologies, Inc. codebase, where it currently runs on every code change and blocks changes that introduce potential threading bugs.
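A minimal sketch of the threading pattern the checker reasons about, assuming the standard RxJava 2 and RxAndroid scheduler APIs (the example is ours, not from the paper): each fluent operator pins the downstream stages to a known thread, which is what makes type-based tracking possible.

    import io.reactivex.Observable;
    import io.reactivex.android.schedulers.AndroidSchedulers;
    import io.reactivex.schedulers.Schedulers;
    import android.widget.TextView;

    public class UiSafetySketch {
        void load(TextView label) {
            Observable.fromCallable(UiSafetySketch::fetchFromNetwork)
                    .subscribeOn(Schedulers.io())                // work runs on a background thread
                    .observeOn(AndroidSchedulers.mainThread())   // results delivered on the UI thread
                    .subscribe(label::setText);                  // UI access is provably on the main thread
        }

        static String fetchFromNetwork() { return "result"; }
    }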
The features of modularity and inheritance in C++ ease developers' work, but also give rise to the problem of type confusion. As an ancestor class may have a different data layout from its descendant class, a dangerous downcast from the ancestor to its descendant can lead to a critical attack, such as control-flow hijacking or out-of-bounds access to neighboring memory. As reported in CVE, such vulnerabilities have been found in various commonly used software, including Google Chrome, Firefox and Adobe Flash Player, and have shown an increasing trend in recent years. The urgency of addressing type confusion has quickened the pace at which researchers develop corresponding solutions. However, existing works either handle the problem only partially or suffer from high performance and memory overhead, especially on large-scale projects. We present Bitype to check validity explicitly when a type is downcast to another, maintaining high coverage while massively reducing overhead and compilation time. The core of our design is a Safe Encoding Scheme, which encodes all of the classes by mapping them to bits. With this scheme, Bitype treats classes and their safely convertible classes as codes and verifies typecasts with an xor operation, decreasing both the performance overhead of checks and the memory overhead. Besides, we implement a Clang tool to avoid repeated collection of inheritance relationships and deploy a two-level lookup table to trace objects efficiently. Evaluated on the SPEC CPU2006 benchmarks and the Firefox browser, Bitype shows slightly higher typecasting coverage than the state-of-the-art HexType [22], but reduces the performance overhead by 2 to 16 times, the memory overhead by 2 to 3 times, and the compilation time by 21 to 223 times. As a result, our solution is a practical and efficient typecasting checker for commodity software.
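To convey the flavour of bit-encoded typecast checking (a hypothetical simplification of ours, not Bitype's actual Safe Encoding Scheme): give each class a code that ORs its own bit with its ancestors' bits, so that a downcast is accepted only when the target's code is covered by the object's runtime code, verified by a single xor-and-compare.

    public class TypecastSketch {
        // Hypothetical codes: one fresh bit per class, OR-ed with ancestor bits.
        static final long CODE_BASE    = 0b001;  // class Base
        static final long CODE_CHILD   = 0b011;  // class Child   extends Base
        static final long CODE_SIBLING = 0b101;  // class Sibling extends Base

        // The cast is allowed iff every bit of the target's code is already
        // present in the object's runtime-type code, i.e. the target is an
        // ancestor of (or equal to) the object's actual type.
        static boolean castAllowed(long runtimeCode, long targetCode) {
            return ((runtimeCode ^ targetCode) & targetCode) == 0;
        }

        public static void main(String[] args) {
            System.out.println(castAllowed(CODE_CHILD, CODE_BASE));    // true:  Child   -> Base
            System.out.println(castAllowed(CODE_BASE, CODE_CHILD));    // false: Base    -> Child
            System.out.println(castAllowed(CODE_SIBLING, CODE_CHILD)); // false: Sibling -> Child
        }
    }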
Anomaly detection on security logs is receiving more and more attention. Authentication events are an important component of security logs, and being able to produce trustworthy and accurate predictions minimizes the effort cyber-experts spend on false alarms. Observed events are classified as Normal, for legitimate user behavior, or Malicious, for malevolent actions. These classes are consistently and severely imbalanced, which makes the classification problem harder; in the commonly used Los Alamos dataset, the malicious class comprises only 0.00033% of the total. This work proposes a novel method to extract advanced composite features, and a supervised learning technique for classifying authentication logs reliably; the models are Random Forest, LogitBoost and Logistic Regression, and ultimately Majority Voting, which leverages the predictions of the previous models and gives the final prediction for each authentication event. We measure the performance of our experiments using the False Negative Rate and the False Positive Rate. Overall, we achieve a zero False Negative Rate (i.e., no attack was missed) and an average False Positive Rate of 0.0019.
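The final combination step can be sketched generically (our illustration, with hypothetical model outputs): each base model votes Malicious or Normal and the majority wins.

    public class MajorityVote {
        // Returns true (Malicious) if more than half of the base models flag the event.
        static boolean classify(boolean... votes) {
            int malicious = 0;
            for (boolean v : votes) {
                if (v) malicious++;
            }
            return malicious * 2 > votes.length;
        }

        public static void main(String[] args) {
            boolean rf = true, logitBoost = false, logReg = true; // hypothetical outputs
            System.out.println(classify(rf, logitBoost, logReg)); // true -> Malicious
        }
    }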
In Vehicle-to-Vehicle (V2V) communications, malicious actors may spread false information to undermine the safety and efficiency of the vehicular traffic stream. Thus, vehicles must determine how to respond to the contents of messages which may be false even though they are authenticated, in the sense that receivers can verify contents were not tampered with and originated from a verifiable transmitter. Existing solutions for finding appropriate actions are inadequate since they address trust and decision separately, require an honest majority (more honest actors than malicious ones), and do not incorporate driver preferences in the decision-making process. In this work, we propose a novel trust-aware decision-making framework that does not require an honest majority. It securely determines the likelihood of reported road events despite the presence of false data, and consequently provides the optimal decision for the vehicles. The basic idea of our framework is to leverage the implied effect of the road event to verify the consistency between each vehicle's reported data and actual behavior, and to determine data trustworthiness and event belief by integrating Bayes' rule and Dempster-Shafer Theory. The resulting belief serves as input to a utility maximization framework focusing on both safety and efficiency. This framework considers the two basic necessities of the Intelligent Transportation System and also incorporates drivers' preferences to decide the optimal action. Simulation results show the robustness of our framework under multiple-vehicle attacks, and different balances between safety and efficiency can be achieved by selecting appropriate human preference factors based on the driver's risk-taking willingness.
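For reference, the two fusion tools named above have standard forms (generic statements, not the paper's specific construction). Bayes' rule updates the belief in an event E from an observed behavior b, and Dempster's rule combines two independent mass functions m_1 and m_2:

    P(E \mid b) = \frac{P(b \mid E)\, P(E)}{P(b)}

    (m_1 \oplus m_2)(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \qquad K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)

where K measures the conflict between the two bodies of evidence.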
Networked control systems improve the efficiency of cyber-physical plants both functionally, by the availability of data generated even in far-flung locations, and operationally, by the adoption of standard protocols. A side-effect, however, is that now the safety and stability of a local process and, in turn, of the entire plant are more vulnerable to malicious agents. Leveraging the communication infrastructure, the authors here present the design of networked control systems with built-in resilience. Specifically, the paper addresses attacks known as false data injections that originate within compromised sensors. In the proposed framework for closed-loop control, the feedback signal is constructed by weighted consensus of estimates of the process state gathered from other interconnected processes. Observers are introduced to generate the state estimates from the local data. Side-channel monitors are attached to each primary sensor in order to assess proper code execution. These monitors provide estimates of the trust assigned to each observer output and, more importantly, independent of it; these estimates serve as weights in the consensus algorithm. The authors tested the concept on a multi-sensor networked physical experiment with six primary sensors. The weighted consensus was demonstrated to yield a feedback signal within specified accuracy even if four of the six primary sensors were injecting false data.
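The feedback construction can be summarized in the usual weighted-consensus form (our notation, not the authors'): with state estimates \hat{x}_i from the n observers and trust weights w_i \ge 0 from the side-channel monitors,

    \hat{x} = \frac{\sum_{i=1}^{n} w_i\, \hat{x}_i}{\sum_{i=1}^{n} w_i}

so an observer fed by a compromised sensor receives a small weight and contributes little to the feedback signal.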
False alarms and misses are two general kinds of alarm errors, and they can decrease an operator's trust in the alarm system. Specifically, there are two different forms of trust in such systems, represented in this research by two kinds of responses to alarms: one is compliance and the other is reliance. Besides false alarms and misses, the two responses are differentially affected by properties of the alarm system, situational factors and operator factors. However, most existing studies have only qualitatively analyzed the relationship between a single variable and the two responses. In this research, all available experimental studies were identified through database searches using the keyword "compliance and reliance", without restriction on year of publication, up to December 2017. Six relevant studies and fifty-two sets of key data were obtained as the database for this research. Furthermore, a neural network is adopted as a tool to establish the quantitative relationship between multiple factors and each of the two forms of trust. The results are significant for further study of the influence of human decision making on the overall fault detection rate and false alarm rate of the human-machine system.
In the present paper, the problem of networked control system (NCS) cyber security is considered. The geometric approach is used to evaluate the security and vulnerability level of the controlled system. The results concern so-called false data injection attacks and show how imperfectly known disturbances can be used to perform undetectable, or at least stealthy, attacks that leave the NCS vulnerable to malicious outsiders. A numerical example is given to illustrate the approach.
In recent years, security in wireless sensor networks (WSNs) has become essential because WSN applications rely on sensitive information being shared between nodes. The open deployment of such networks raises a large number of security issues. Attackers disturb the security system by attacking the different protocol layers in a WSN. The standard AODV routing protocol faces security issues when the route discovery process takes place, yet data should be transmitted to the destination along a secure path. To support this process, we propose a trust-based intrusion detection system (NL-IDS) for the network layer in WSNs to detect Black hole attackers in the network. A sensor node's trust is calculated according to the deviation of key factors at the network layer caused by a Black hole attack. We use the watchdog technique, where a sensor node continuously monitors its neighbor node by calculating a periodic trust value. Finally, the overall trust value of the sensor node is evaluated from the gathered trust metrics of the network layer (past and present trust values). The NL-IDS scheme efficiently identifies malicious nodes mounting Black hole attacks at the network layer. To analyze the performance of NL-IDS, we simulated the model in MATLAB R2015a; the results show that NL-IDS outperforms Wang et al. [11] in terms of detection accuracy and false alarm rate.
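A common shape for such watchdog-style periodic trust maintenance (a generic sketch, not necessarily the exact NL-IDS update): with m_t the trust evidence gathered in period t and a forgetting factor \alpha \in [0, 1],

    T_t = \alpha\, T_{t-1} + (1 - \alpha)\, m_t

so the overall trust blends past and present observations, and a neighbor whose T_t falls below a threshold is flagged as a Black hole suspect.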
Software deobfuscation is a key challenge in malware analysis for understanding the internal logic of the code and establishing adequate countermeasures. To defeat recent obfuscation techniques, state-of-the-art generic deobfuscation methodologies are based on dynamic symbolic execution (DSE). However, DSE suffers from limitations such as code coverage and scalability. In the race to counter and remove the most advanced obfuscation techniques, there is a need to reduce the amount of code to cover. To that end, we propose a novel deobfuscation approach based on semantic equivalence, called DoSE. With DoSE, we aim to improve and complement DSE-based deobfuscation techniques by statically eliminating obfuscation transformations (built on code reuse), thereby improving code coverage. Our method's novelty comes from transposing existing binary diffing techniques, namely semantic equivalence checking, to the deobfuscation of previously untreated techniques, such as two-way opaque constructs, that we encounter in surreptitious software. To challenge DoSE, we used both known malware, such as Cryptowall, WannaCry, Flame and BitCoinMiner, and obfuscated code samples. Our experimental results show that DoSE is an efficient strategy for detecting code-reuse-based obfuscation transformations, with low false positive and/or false negative rates in practice and up to 63% code reduction on certain types of malware.
In the light of the information revolution and the propagation of big social data, the dissemination of misleading information is certainly difficult to control, owing to the rapid and intensive flow of information through unconfirmed sources spreading propaganda and tendentious rumors. This causes confusion and loss of trust between individuals and groups, and even between governments and their citizens. It necessitates consolidated efforts to stop the penetration of false information by developing theoretical and practical methodologies that aim to measure the credibility of the users of these virtual platforms. This paper presents a domain-based approach to predicting the trustworthiness of users of Online Social Networks (OSNs). By incorporating three machine learning algorithms, the experimental results verify the applicability of the proposed approach for classifying and predicting domain-based trustworthy users of OSNs.
As a valuable source of information, Word Of Mouth has always been valued by consumers and business marketers. The Internet provides a new medium for Word Of Mouth communication: consumers share their views and comments on products, services, brands and enterprises through online platforms, thus forming Internet Word Of Mouth, which is of great importance to B2C enterprises. However, disturbing and even false information, as well as the uncertainties and risks existing in the online communication environment, lead to a crisis of online trust. Accordingly, this study constructs a trust mechanism model of the Internet Word Of Mouth effect, which shows that the professionalism of communicators, online relationship strength, communication channels, and product involvement are key factors significantly affecting the Word Of Mouth effect. This model can provide theoretical guidance for word-of-mouth marketing and the operation of B2C e-commerce enterprises.
Image hiding is an important tool for protecting the ownership rights of digital multimedia content. To reduce the interference effect of the host signal in the popular Spread Spectrum (SS) image hiding algorithm, this paper proposes an Improved Additive Spread Spectrum (IASS) image hiding algorithm. The proposed IASS algorithm retains the simple decoder of the Additive Spread Spectrum (ASS) image hiding algorithm. The paper presents comparative experiments with the ASS and Correlation-and-bit-Aware Spread Spectrum (CASS) image hiding algorithms. For the noise-free scenario, the proposed IASS algorithm yields error-free decoding performance in theory. For the noisy scenario, the experimental results show that the proposed IASS algorithm significantly reduces the host effect in data hiding and remarkably improves the watermark decoding performance.
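For context (a generic formulation in the spirit of improved spread spectrum watermarking, not necessarily IASS's exact rule): classical ASS embeds a bit b \in \{-1, +1\} with strength \alpha into a host vector x using a pseudorandom pattern w, while host-rejecting variants subtract the host's projection onto w before embedding:

    y = x + \alpha b\, w    (ASS)

    y = x + \Big( \alpha b - \lambda\, \frac{\langle x, w \rangle}{\langle w, w \rangle} \Big) w    (host-rejecting variant)

With \lambda = 1 and no channel noise, the simple correlation decoder \hat{b} = \mathrm{sign}\langle y, w \rangle returns \mathrm{sign}(\alpha b \langle w, w \rangle) = b, i.e. error-free decoding, because the host term cancels.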
Hardware information flow analysis detects security vulnerabilities resulting from unintended design flaws, timing channels, and hardware Trojans. These information flow models are typically generated in a general way, which includes a significant amount of redundancy that is irrelevant to the specified security properties. In this work, we propose a property specific approach for information flow security. We create information flow models tailored to the properties to be verified by performing a property specific search to identify security critical paths. This helps find suspicious signals that require closer inspection and quickly eliminates portions of the design that are free of security violations. Our property specific trimming technique reduces the complexity of the security model; this accelerates security verification and restricts potential security violations to a smaller region which helps quickly pinpoint hardware security vulnerabilities.
A new approach based on the formalism of hybrid automata is proposed for analysing conflict processes between an information system and an information-security malefactor. An example of a probability-based assessment of the malefactor's victory is given, together with the possibility of abstracting from a specific form of the probability density function for the residence time of the conflicting parties in their possible states. A model of the distribution of destructive informational influences in the information system is proposed to connect the process of the spread of destructive information with the process of changing the states of the information system's subjects. An example of analysing the spread of destructive information processes is given.
The purpose of this research is to propose a new mathematical model designed to evaluate the security of cryptosystems. The model mixes ideas from two basic mathematical theories: information theory and game theory. The role of information theory is to supply the model with security criteria for cryptosystems; the role of game theory is to produce the value of the game, which represents the outcome of those criteria and ultimately reflects the cryptosystem's security. The proposed model supports an accurate, mathematical way to evaluate the security of cryptosystems by unifying the criteria derived from information theory into a single, reasonable value.
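The game-theoretic part can be made concrete with the standard value of a finite two-player zero-sum game (a generic statement, not the paper's exact construction): if a payoff matrix A is built from information-theoretic criteria such as the key equivocation H(K \mid C), the model's single output is

    v = \max_{x \in \Delta_m} \min_{y \in \Delta_n} x^{\top} A y = \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^{\top} A y

where x and y range over mixed strategies of the defender and the adversary; the minimax theorem guarantees this value exists and is unique.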