Biblio
The Java platform and its third-party libraries provide useful features to facilitate secure coding. However, misusing them can cost developers time and effort, as well as introduce security vulnerabilities in software. We conducted an empirical study on StackOverflow posts, aiming to understand developers' concerns on Java secure coding, their programming obstacles, and insecure coding practices. We observed a wide adoption of the authentication and authorization features provided by Spring Security - a third-party framework designed to secure enterprise applications. We found that programming challenges are usually related to APIs or libraries, including the complicated cross-language data handling of cryptography APIs, and the complex Java-based or XML-based approaches to configure Spring Security. In addition, we reported multiple security vulnerabilities in the suggested code of accepted answers on the StackOverflow forum. The vulnerabilities included disabling the default protection against Cross-Site Request Forgery (CSRF) attacks, breaking SSL/TLS security through bypassing certificate validation, and using insecure cryptographic hash functions. Our findings reveal the insufficiency of secure coding assistance and documentation, as well as the huge gap between security theory and coding practices.
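As a small illustration of the last vulnerability class mentioned above (insecure cryptographic hash functions), the sketch below computes an integrity digest with SHA-256 instead of a broken function such as MD5. It is written in Python rather than Java purely for brevity; the helper names `integrity_digest` and `digests_match` are hypothetical and do not come from the study, and a plain digest like this is not a substitute for a dedicated password-hashing scheme.

```python
import hashlib
import hmac


def integrity_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest of data (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()


def digests_match(expected_hex: str, data: bytes) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(expected_hex, integrity_digest(data))


reference = integrity_digest(b"abc")
ok = digests_match(reference, b"abc")
tampered = digests_match(reference, b"abd")
```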
Ensuring software security is essential for developing reliable software. Software can suffer from security problems due to weaknesses in code constructs introduced during development. Our goal is to relate software security to different code constructs so that developers become aware, very early, of coding weaknesses that might be related to a software vulnerability. In this study, we chose Java nano-patterns as code constructs; these are method-level patterns defined on the attributes of Java methods. This study aims to find the correlation between software vulnerability and these method-level structural code constructs. For our first case study, we identified the vulnerable methods in 39 versions of three major releases of Apache Tomcat and extracted nano-patterns from the affected methods of these releases. We also extracted nano-patterns from the non-vulnerable methods of Apache Tomcat; for this, we selected the last version of each of the three major releases (6.0.45 for release 6, 7.0.69 for release 7 and 8.0.33 for release 8) as the non-vulnerable versions. Then, we compared the nano-pattern distributions in vulnerable versus non-vulnerable methods. In our second case study, we extracted nano-patterns from the affected methods of three vulnerable J2EE web applications: Blueblog 1.0, Personalblog 1.2.6 and Roller 0.9.9, all of which were deliberately made vulnerable for testing purposes. We found that some nano-patterns, such as objCreator, staticFieldReader, typeManipulator, looper, exceptions, localWriter and arrReader, are more prevalent in affected methods, whereas others, such as straightLine, are more prevalent in non-affected methods. We conclude that nano-patterns can be used as an indicator of the vulnerability-proneness of code.
Researchers have proposed several security requirements identification methods for up-front requirements engineering (RE). However, in open source software (OSS) projects, developers use lightweight representations and refine requirements frequently by writing comments. They also tend to discuss security aspects in comments by providing code snippets, attachments, and external resource links. Since most security requirements identification methods in up-front RE are based on textual information retrieval techniques, these methods are not suitable for OSS projects or just-in-time RE. In our study, we propose a new model based on logistic regression to identify security requirements in OSS projects. We used five metrics to build security requirements identification models and tested the performance of these metrics by applying those models to three OSS projects. Our results show that four out of five metrics achieved high performance in intra-project testing.
The area of secure compilation aims to design compilers which produce hardened code that can withstand attacks from low-level co-linked components. So far, there is no formal correctness criterion for secure compilers that comes with a clear understanding of what security properties the criterion actually provides. Ideally, we would like a criterion that, if fulfilled by a compiler, guarantees that large classes of security properties of source language programs continue to hold in the compiled program, even as the compiled program is run against adversaries with low-level attack capabilities. This paper provides such a novel correctness criterion for secure compilers, called trace-preserving compilation (TPC). We show that TPC preserves a large class of security properties, namely all safety hyperproperties. Further, we show that TPC preserves more properties than full abstraction, the de-facto criterion used for secure compilation. Then, we show that several fully abstract compilers described in literature satisfy an additional, common property, which implies that they also satisfy TPC. As an illustration, we prove that a fully abstract compiler from a typed source language to an untyped target language satisfies TPC.
We consider the problem of designing repair efficient distributed storage systems, which are information-theoretically secure against a passive eavesdropper that can gain access to a limited number of storage nodes. We present a framework that enables design of a broad range of secure storage codes through a joint construction of inner and outer codes. As case studies, we focus on two specific families of storage codes: (i) minimum storage regenerating (MSR) codes, and (ii) maximally recoverable (MR) codes, which are a class of locally repairable codes (LRCs). The main idea of this framework is to utilize the existing constructions of storage codes to jointly design an outer coset code and inner storage code. Finally, we present a construction of an outer coset code over small field size to secure locally repairable codes presented by Tamo and Barg for the special case of an eavesdropper that can observe any subset of nodes of maximum possible size.
Distributed storage systems and caching systems are becoming widespread, and this motivates the increasing interest in assessing their achievable performance in terms of reliability for legitimate users and security against malicious users. While the assessment of reliability benefits from the availability of well-established metrics and tools, assessing security is more challenging. The classical cryptographic approach aims at estimating the computational effort for an attacker to break the system, and ensuring that it is far above any feasible amount. This has the limitation of depending on attack algorithms and advances in computing power. The information-theoretic approach instead exploits capacity measures to achieve unconditional security against attackers, but often does not provide practical recipes to reach such a condition. We propose a mixed cryptographic/information-theoretic approach with a twofold goal: estimating the levels of information-theoretic security and defining a practical scheme able to achieve them. In order to find optimal choices of the parameters of the proposed scheme, we exploit an effective probabilistic model checker, which allows us to overcome several limitations of more conventional methods.
Software metrics are widely used to measure the quality of software and to give an early indication of the efficiency of the development process in industry. There are many well-established frameworks for measuring the quality of source code through metrics, but limited attention has been paid to the quality of software models. In this article, we evaluate the quality of state machine models specified using the Analytical Software Design (ASD) tooling. We discuss how we applied a number of metrics to ASD models in an industrial setting and report about results and lessons learned while collecting these metrics. Furthermore, we recommend some quality limits for each metric and validate them on models developed in a number of industrial projects.
Industrial Control Systems (ICS) are found in critical infrastructure such as power generation and water treatment. When security requirements are incorporated into an ICS, one needs to test the additional code and devices added to improve the prevention and detection of cyber attacks. Conducting such tests in legacy systems is a challenge due to the high availability requirement. An approach using Timed Automata (TA) is proposed to overcome this challenge. This approach enables assessment of the effectiveness of an attack detection method based on process invariants. The approach has been demonstrated in a case study on one stage of a 6-stage operational water treatment plant. The model constructed captured the interactions among components in the selected stage. In addition, a set of attacks, attack detection mechanisms, and security specifications were also modeled using TA. These TA models were conjoined into a network and implemented in UPPAAL. The models so implemented were found effective in detecting the attacks considered. The study suggests the use of TA as an effective tool to model an ICS and study its attack detection mechanisms, as a complement to doing so in a real plant, whether operational or under design.
The principal mission of Multi-Source Multicast (MSM) is to disseminate all messages from all sources in a network to all destinations. MSM is utilized in numerous applications. In many of them, securing the messages disseminated is critical. A common secure model is to consider a network where there is an eavesdropper which is able to observe a subset of the network links, and seek a code which keeps the eavesdropper ignorant regarding all the messages. While this is solved when all messages are located at a single source, Secure MSM (SMSM) is an open problem, and the rates required are hard to characterize in general. In this paper, we consider Individual Security, which promises that the eavesdropper has zero mutual information with each message individually. We completely characterize the rate region for SMSM under individual security, and show that such a security level is achievable at the full capacity of the network, that is, the cut-set bound is the matching converse, similar to non-secure MSM. Moreover, we show that the field size is similar to non-secure MSM and does not have to be larger due to the security constraint.
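The notion of individual security described above can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes two independent, uniformly random one-bit messages and an eavesdropper who observes only their XOR, and it computes the mutual information by brute-force enumeration. The helper `mutual_information` and all variable names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(pairs):
    """I(X;Y) in bits, treating each (x, y) in pairs as equally likely."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# All equally likely message pairs (M1, M2), assumed independent and uniform.
messages = list(product([0, 1], repeat=2))
# Toy assumption: the eavesdropper observes a single link carrying M1 XOR M2.
observed = [m1 ^ m2 for m1, m2 in messages]

leak_m1 = mutual_information([(m[0], z) for m, z in zip(messages, observed)])
leak_m2 = mutual_information([(m[1], z) for m, z in zip(messages, observed)])
leak_joint = mutual_information(list(zip(messages, observed)))

print(leak_m1, leak_m2, leak_joint)
```

In this toy setting the observation reveals nothing about either message on its own, yet it reveals one bit about the pair, which is the distinction between individual security and security for all messages jointly.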
Over the past few years we have articulated theory that describes ‘encrypted computing’, in which data remains in encrypted form while being worked on inside a processor, by virtue of a modified arithmetic. The last two years have seen research and development on a standards-compliant processor that shows that near-conventional speeds are attainable via this approach. Benchmark performance with the US AES-128 flagship encryption and a 1GHz clock is now equivalent to a 433MHz classic Pentium, and most block encryptions fit in AES's place. This summary article details how user data is protected by a system based on the processor from being read or interfered with by the computer operator, for those computing paradigms that entail trust in data-oriented computation in remote locations where it may be accessible to powerful and dishonest insiders. We combine: (i) the processor that runs encrypted; (ii) a slightly modified conventional machine code instruction set architecture with which security is achievable; (iii) an ‘obfuscating’ compiler that takes advantage of its possibilities, forming a three-point system that provably provides cryptographic "semantic security" for user data against the operator and system insiders.
Detecting software security vulnerabilities and distinguishing vulnerable from non-vulnerable code is anything but simple. Most of the time, vulnerabilities remain undisclosed until they are exposed, for instance, by an attack during the software operational phase. Software metrics are widely-used indicators of software quality, but the question is whether they can be used to distinguish vulnerable software units from the non-vulnerable ones during development. In this paper, we perform an exploratory study on software metrics, their interdependency, and their relation with security vulnerabilities. We aim at understanding: i) the correlation between software architectural characteristics, represented in the form of software metrics, and the number of vulnerabilities; and ii) which are the most informative and discriminative metrics that allow identifying vulnerable units of code. To achieve these goals, we use, respectively, correlation coefficients and heuristic search techniques. Our analysis is carried out on a dataset that includes software metrics and reported security vulnerabilities, exposed by security attacks, for all functions, classes, and files of five widely used projects. Results show: i) a strong correlation between several project-level metrics and the number of vulnerabilities, ii) the possibility of using a group of metrics, at both file and function levels, to distinguish vulnerable and non-vulnerable code with a high level of accuracy.
This paper concerns the role of human errors in early risk assessment in software project management. Researchers have recently begun to focus on human errors in early risk assessment in large software projects; statistics show them to be a major component of software problems, with over 80% of economic losses attributed to them. There has been comparatively little experimental research on the role of human errors in this context, particularly at the organizational level, largely because of reluctance to share information and statistics on security issues in online software applications. Grounded theory has been employed as the research methodology to investigate the root causes of human errors in online security risks. An open-ended question was asked of 103 information security experts around the globe, and the responses were used to develop a list of causes of human errors by open coding. The paper represents a contribution to our understanding of the causes of human errors in information security contexts. It is also one of the first information security research studies to use Strauss's and Glaser's grounded theory approaches together during the data collection phases to achieve the required number of participants' responses, and is a significant contribution to the field.
We study coding schemes for multiparty interactive communication over synchronous networks that suffer from stochastic noise, where each bit is independently flipped with probability ε. We analyze the minimal overhead that must be added by the coding scheme in order to succeed in performing the computation despite the noise. Our main result is a lower bound on the communication of any noise-resilient protocol over a synchronous star network with n-parties (where all parties communicate in every round). Specifically, we show a task that can be solved by communicating T bits over the noise-free network, but for which any protocol with success probability of 1-o(1) must communicate at least Ω(T log n / log log n) bits when the channels are noisy. By a 1994 result of Rajagopalan and Schulman, the slowdown we prove is the highest one can obtain on any topology, up to a log log n factor. We complete our lower bound with a matching coding scheme that achieves the same overhead; thus, the capacity of (synchronous) star networks is Θ(log log n / log n). Our bounds prove that, despite several previous coding schemes with rate Ω(1) for certain topologies, no coding scheme with constant rate Ω(1) exists for arbitrary n-party noisy networks.
To help establish a more scientific basis for security science, which will enable the development of fundamental theories and move the field from being primarily reactive to primarily proactive, it is important for research results to be reported in a scientifically rigorous manner. Such reporting will allow for the standard pillars of science, namely replication, meta-analysis, and theory building. In this paper we aim to establish a baseline of the state of scientific work in security through the analysis of indicators of scientific research as reported in the papers from the 2015 IEEE Symposium on Security and Privacy. To conduct this analysis, we developed a series of rubrics to determine the completeness of the papers relative to the type of evaluation used (e.g. case study, experiment, proof). Our findings showed that while papers are generally easy to read, they often do not explicitly document some key information like the research objectives, the process for choosing the cases to include in the studies, and the threats to validity. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.
Game theory is appropriate for studying cyber conflict because it allows for an intelligent and goal-driven adversary. Applications of game theory have led to a number of results regarding optimal attack and defense strategies. However, the overwhelming majority of applications explore overly simplistic games, often ones in which each participant's actions are visible to every other participant. These simplifications strip away the fundamental properties of real cyber conflicts: probabilistic alerting, hidden actions, unknown opponent capabilities. In this paper, we demonstrate that it is possible to analyze a more realistic game, one in which different resources have different weaknesses, players have different exploits, and moves occur in secrecy, but they can be detected. Certainly, more advanced and complex games are possible, but the game presented here is more realistic than any other game we know of in the scientific literature. While optimal strategies can be found for simpler games using calculus, case-by-case analysis, or, for stochastic games, Q-learning, our more complex game is more naturally analyzed using the same methods used to study other complex games, such as checkers and chess. We define a simple evaluation function and employ multi-step searches to create strategies. We show that such scenarios can be analyzed, and find that in cases of extreme uncertainty, it is often better to ignore one's opponent's possible moves. Furthermore, we show that a simple evaluation function in a complex game can lead to interesting and nuanced strategies that follow tactics that tend to select moves that are well tuned to the details of the situation and the relative probabilities of success.
Proofs of Data Possession/Retrievability (PoDP/PoR) schemes are essential to cloud storage services, since they can increase clients' confidence in the integrity and availability of their data. The majority of PoDP/PoR schemes are constructed from homomorphic linear authentication (HLA) schemes, which reduce the communication cost between the client and the server. In this paper, a new subclass of authentication codes, named ε-authentication codes, is proposed, and a modular construction of HLA schemes from ε-authentication codes is presented. We prove that the security notions of HLA schemes are closely related to the size of the authenticator/tag space and the success probability of impersonation attacks (with non-zero source states) on the underlying ε-authentication codes. We show that most of the HLA schemes used for PoDP/PoR schemes are instantiations of our modular construction from some ε-authentication codes. Following this line, an algebraic-curves-based ε-authentication code yields a new HLA scheme.
Blind signatures can be deployed to preserve user anonymity and are widely used in digital cash and e-voting. As interactive protocols, blind signature schemes require high efficiency. In this paper, we propose a code-based blind signature scheme with high efficiency: unlike existing code-based signature schemes, it can produce a valid signature without many loops. We then prove the security of our scheme in the random oracle model and analyze its efficiency. Since code-based signature schemes belong to post-quantum cryptography, the scheme is also able to resist quantum attacks.
It is a fundamental problem to decide how many copies of an unknown mixed quantum state are necessary and sufficient to determine the state. This is the quantum analogue of the problem of estimating a probability distribution given some number of samples. Previously, it was known only that estimating states to error $\epsilon$ in trace distance required $O(dr^2/\epsilon^2)$ copies for a $d$-dimensional density matrix of rank $r$. Here, we give a measurement scheme (POVM) that uses $O((dr/\delta)\ln(d/\delta))$ copies to estimate $\rho$ to error $\delta$ in infidelity. This implies $O((dr/\epsilon^2)\cdot\ln(d/\epsilon))$ copies suffice to achieve error $\epsilon$ in trace distance. For fixed $d$, our measurement can be implemented on a quantum computer in time polynomial in $n$. We also use the Holevo bound from quantum information theory to prove a lower bound of $\Omega(dr/\epsilon^2)/\log(d/r\epsilon)$ copies needed to achieve error $\epsilon$ in trace distance. This implies a lower bound $\Omega(dr/\delta)/\log(d/r\delta)$ for the estimation error $\delta$ in infidelity. These match our upper bounds up to log factors. Our techniques can also show an $\Omega(r^2 d/\delta)$ lower bound for measurement strategies in which each copy is measured individually and then the outcomes are classically post-processed to produce an estimate. This matches the known achievability results and proves for the first time that such "product" measurements have asymptotically suboptimal scaling with $d$ and $r$.
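The step from the infidelity bound to the trace-distance bound can be sketched as follows. The abstract does not state its conventions, so this derivation assumes the squared fidelity $F(\rho,\sigma) = \big(\mathrm{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\big)^2$, infidelity $1 - F$, and trace distance $T(\rho,\sigma) = \tfrac12\lVert\rho-\sigma\rVert_1$, and it relies on the standard Fuchs-van de Graaf inequality.

```latex
% Assumed conventions (not stated in the abstract):
%   F = squared fidelity, infidelity = 1 - F, T = (1/2) * trace norm of the difference.
\begin{align*}
T(\rho,\hat\rho) &\le \sqrt{1 - F(\rho,\hat\rho)} \le \sqrt{\delta}
  && \text{(Fuchs--van de Graaf, infidelity at most } \delta\text{)} \\
\delta = \epsilon^2 &\;\Longrightarrow\; T(\rho,\hat\rho) \le \epsilon \\
O\!\left(\frac{dr}{\delta}\,\ln\frac{d}{\delta}\right)
  &= O\!\left(\frac{dr}{\epsilon^2}\,\ln\frac{d}{\epsilon^2}\right)
   = O\!\left(\frac{dr}{\epsilon^2}\,\ln\frac{d}{\epsilon}\right)
  && \text{since } \ln\frac{d}{\epsilon^2} \le 2\ln\frac{d}{\epsilon} \text{ for } d \ge 1 .
\end{align*}
```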
Physical layer security can ensure secure communication over noisy channels in the presence of an eavesdropper with unlimited computational power. We adopt an information-theoretic variant of semantic security (SS), a cryptographic gold standard, as our secrecy metric and study the open problem of the type II wiretap channel (WTC II) with a noisy main channel, whose secrecy-capacity is unknown even under looser metrics than SS. Herein the secrecy-capacity is derived and shown to be equal to its SS capacity. In this setting, the legitimate users communicate via a discrete-memoryless (DM) channel in the presence of an eavesdropper that has perfect access to a subset of its choosing of the transmitted symbols, constrained to a fixed fraction of the block length. The secrecy criterion is achieved simultaneously for all possible eavesdropper subset choices. On top of that, SS requires negligible mutual information between the message and the eavesdropper's observations even when maximized over all message distributions. A key tool for the achievability proof is a novel and stronger version of Wyner's soft-covering lemma. Specifically, the lemma shows that a random codebook achieves the soft-covering phenomenon with high probability. The probability of failure is doubly-exponentially small in the block length. Since the combined number of messages and subsets grows only exponentially with the block length, SS for the WTC II is established by using the union bound and invoking the stronger soft-covering lemma. The direct proof shows that rates up to the weak-secrecy capacity of the classic WTC with a DM erasure channel (EC) to the eavesdropper are achievable. The converse follows by establishing the capacity of this DM wiretap EC as an upper bound for the WTC II.
From a broader perspective, the stronger soft-covering lemma constitutes a tool for showing the existence of codebooks that satisfy exponentially many constraints, a beneficial ability for many other applications in information theoretic security.