Biblio
The analysis of applied tasks and methods of entropy signal processing are carried out in this article. The theoretical comments about the specific schemes of special processors for the determination of probability and correlation activity are given. The perspective of the influence of probabilistic entropy of C. Shannon as cipher signal receivers is reviewed. Examples of entropy-manipulated signals and system characteristics of the proposed special processors are given.
A systematic study of technologies and concepts used for the design and construction of distributed fail-safe web systems has been conducted. The general principles of the design of distributed web-systems and information technologies that are used in the design of web-systems are considered. As a result of scientific research, it became clear that data backup is a determining attribute of most web systems serving. Thus, the main role in building modern web systems is to scaling them. Scaling in distributed systems is used when performing a particular operation requires a large amount of computing resources. There are two scaling options, namely vertical and horizontal. Vertical scaling is to increase the performance of existing components in order to increase overall productivity. However, for the construction of distributed systems, use horizontal scaling. Horizontal scaling is that the system is split into small components and placed on various physical computers. This approach allows the addition of new nodes to increase the productivity of the web system as a whole.
Correct compilers perform program transformations preserving input/output behaviours of programs. Yet, correctness does not prevent program optimisations from introducing information-flow leaks that would make the target program more vulnerable to side-channel attacks than the source program. To tackle this problem, we propose a notion of Information-Flow Preserving (IFP) program transformation which ensures that a target program is no more vulnerable to passive side-channel attacks than a source program. To protect against a wide range of attacks, we model an attacker who is granted arbitrary memory accesses for a pre-defined set of observation points. We propose a compositional proof principle for proving that a transformation is IFP. Using this principle, we show how a translation validation technique can be used to automatically verify and even close information-flow leaks introduced by standard compiler passes such as dead-store elimination and register allocation. The technique has been experimentally validated on the CompCert C compiler.
Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing and side-channel attacks. These attacks represent a serious threat to actual systems since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.
This paper presents a multilayer protection approach to guard programs against Return-Oriented Programming (ROP) attacks. Upper layers validate most of a program's control flow at a low computational cost; thus, not compromising runtime. Lower layers provide strong enforcement guarantees to handle more suspicious flows; thus, enhancing security. Our multilayer system combines techniques already described in the literature with verifications that we introduce in this paper. We argue that modern versions of x86 processors already provide the microarchitectural units necessary to implement our technique. We demonstrate the effectiveness of our multilayer protection on a extensive suite of benchmarks, which includes: SPEC CPU2006; the three most popular web browsers; 209 benchmarks distributed with LLVM and four well-known systems shown to be vulnerable to ROP exploits. Our experiments indicate that we can protect programs with almost no overhead in practice, allying the good performance of lightweight security techniques with the high dependability of heavyweight approaches.
Over the past decade, the reliance on Unmanned Aerial Systems (UAS) to carry out critical missions has grown drastically. With an increased reliance on UAS as mission assets and the dependency of UAS on cyber resources, cyber security of UAS must be improved by adopting sound security principles and relevant technologies from the computing community. On the other hand, the traditional avionics community, being aware of the importance of cyber security, is looking at new architecture and designs that can accommodate both the traditional safety oriented principles as well as the cyber security principles and techniques. It is with the effective and timely convergence of these domains that a holistic approach and co-design can meet the unique requirements of modern systems and operations. In this paper, authors from both the cyber security and avionics domains describe our joint effort and insights obtained during the course of designing secure and resilient embedded avionics systems.
Unlike traditional processors, embedded Internet of Things (IoT) devices lack resources to incorporate protection against modern sophisticated attacks resulting in critical consequences. Remote attestation (RA) is a security service to establish trust in the integrity of a remote device. While conventional RA is static and limited to detecting malicious modification to software binaries at load-time, recent research has made progress towards runtime attestation, such as attesting the control flow of an executing program. However, existing control-flow attestation schemes are inefficient and vulnerable to sophisticated data-oriented programming (DOP) attacks subvert these schemes and keep the control flow of the code intact. In this paper, we present LiteHAX, an efficient hardware-assisted remote attestation scheme for RISC-based embedded devices that enables detecting both control-flow attacks as well as DOP attacks. LiteHAX continuously tracks both the control-flow and data-flow events of a program executing on a remote device and reports them to a trusted verifying party. We implemented and evaluated LiteHAX on a RISC-V System-on-Chip (SoC) and show that it has minimal performance and area overhead.
This paper describes the work done to design a SoC platform for real-time on-line pattern search in TCP packets for Deep Packet Inspection (DPI) applications. The platform is based on a Xilinx Zynq programmable SoC and includes an accelerator that implements a pattern search engine that extends the original Boyer-Moore algorithm with timing and logical rules, that produces a very complex set of rules. Also, the platform implements different modes of operation, including SIMD and MISD parallelism, which can be configured on-line. The platform is scalable depending of the analysis requirement up to 8 Gbps. High-Level synthesis and platform based design methodologies have been used to reduce the time to market of the completed system.
This paper presents a 28nm SoC with a programmable FC-DNN accelerator design that demonstrates: (1) HW support to exploit data sparsity by eliding unnecessary computations (4× energy reduction); (2) improved algorithmic error tolerance using sign-magnitude number format for weights and datapath computation; (3) improved circuit-level timing violation tolerance in datapath logic via timeborrowing; (4) combined circuit and algorithmic resilience with Razor timing violation detection to reduce energy via VDD scaling or increase throughput via FCLK scaling; and (5) high classification accuracy (98.36% for MNIST test set) while tolerating aggregate timing violation rates \textbackslashtextgreater10-1. The accelerator achieves a minimum energy of 0.36μJ/pred at 667MHz, maximum throughput at 1.2GHz and 0.57μJ/pred, or a 10%-margined operating point at 1GHz and 0.58μJ/pred.
Cooperation of software and hardware with hybrid architectures, such as Xilinx Zynq SoC combining ARM CPU and FPGA fabric, is a high-performance and low-power platform for accelerating RSA Algorithm. This paper adopts the none-subtraction Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on Zynq SoC to achieve extremely high scalability and throughput of RSA computing. In this design, we use the ARM to implement node-to-node communication with the Message Passing Interface (MPI) while use the FPGA to handle complex calculation. Finally, the experimental results show that the overall performance is linear with the number of nodes. And the cluster achieves 6× 9× speedup against a multi-core desktop (Intel i7-3770) and comparable performance to a many-core server (288-core). In addition, we gain up to 2.5× energy efficiency compared to these two traditional platforms.
The emergence of general-purpose system-on-chip (SoC) architectures has given rise to a number of significant security challenges. The current trend in SoC design is system-level integration of heterogeneous technologies consisting of a large number of processing elements such as programmable RISC cores, memory, DSPs, and accelerator function units/ASIC. These processing elements may come from different providers, and application executable code may have varying levels of trust. Some of the pressing architecture design questions are: (1) how to implement multi-level user-defined security; (2) how to optimally and securely share resources and data among processing elements. In this work, we develop a secure multicore architecture, named Hermes. It represents a new architectural framework that integrates multiple processing elements (called tenants) of secure and non-secure cores into the same chip design while (a) maintaining individual tenant security, (b) preventing data leakage and corruption, and (c) promoting collaboration among the tenants. The Hermes architecture is based on a programmable secure router interface and a trust-aware routing algorithm. With 17% hardware overhead, it enables the implementation of processing-element-oblivious secure multicore systems with a programmable distributed group key management scheme.
In many-core systems, the processing elements are interconnected using Networks-on-Chip. An example of on-chip network is SoCIN, a low-cost interconnect architecture whose original design did not take into account security aspects. This network is vulnerable to eavesdropping and spoofing attacks, what limits its use in systems that require security. This work addresses this issue and aims to ensure the security properties of confidentiality and authenticity of SoCIN-based systems. For this, we propose the use of security mechanisms based on symmetric encryption at the network level using the AES (Advanced Encryption Standard) model. A reference multi-core platform was implemented and prototyped in programmable logic aiming at performing experiments to evaluate the implemented mechanisms. Results demonstrate the effectiveness of the proposed solution in protecting the system against the target attacks. The impact on the network performance is acceptable and the silicon overhead is equivalent to other solutions found in the literature.
Emerging nonvolatile memory (NVM) devices are not limited to build nonvolatile memory macros. They can also be used in developing nonvolatile logics (nvLogics) for nonvolatile processors, security circuits for the internet of things (IoT), and computing-in-memory (CIM) for artificial intelligence (AI) chips. This paper explores the challenges in circuit designs of emerging memory devices for application in nonvolatile logics, security circuits, and CIM for deep neural networks (DNN). Several silicon-verified examples of these circuits are reviewed in this paper.
The area of secure compilation aims to design compilers which produce hardened code that can withstand attacks from low-level co-linked components. So far, there is no formal correctness criterion for secure compilers that comes with a clear understanding of what security properties the criterion actually provides. Ideally, we would like a criterion that, if fulfilled by a compiler, guarantees that large classes of security properties of source language programs continue to hold in the compiled program, even as the compiled program is run against adversaries with low-level attack capabilities. This paper provides such a novel correctness criterion for secure compilers, called trace-preserving compilation (TPC). We show that TPC preserves a large class of security properties, namely all safety hyperproperties. Further, we show that TPC preserves more properties than full abstraction, the de-facto criterion used for secure compilation. Then, we show that several fully abstract compilers described in literature satisfy an additional, common property, which implies that they also satisfy TPC. As an illustration, we prove that a fully abstract compiler from a typed source language to an untyped target language satisfies TPC.
Embedded systems are becoming increasingly complex as designers integrate different functionalities into a single application for execution on heterogeneous hardware platforms. In this work we propose a system-level security approach in order to provide isolation of tasks without the need to trust a central authority at run-time. We discuss security requirements that can be found in complex embedded systems that use heterogeneous execution platforms, and by regulating memory access we create mechanisms that allow safe use of shared IP with direct memory access, as well as shared libraries. We also present a prototype Isolation Unit that checks memory transactions and allows for dynamic configuration of permissions.
This paper briefly presents a position that hardware-based roots of trust, integrated in silicon with System-on-Chip (SoC) solutions, represent the most current stage in a progression of technologies aimed at realizing the most foundational computer security concepts. A brief look at this historical progression from a personal perspective is followed by an overview of more recent developments, with particular focus on a root of trust for cryptographic key provisioning and SoC feature management aimed at achieving supply chain assurances and serves as a basis for trust that is linked to properties enforced in hardware. The author assumes no prior knowledge of these concepts and developments by the reader.
Hashing algorithms are used extensively in information security and digital forensics applications. This paper presents an efficient parallel algorithm hash computation. It's a modification of the SHA-1 algorithm for faster parallel implementation in applications such as the digital signature and data preservation in digital forensics. The algorithm implements recursive hash to break the chain dependencies of the standard hash function. We discuss the theoretical foundation for the work including the collision probability and the performance implications. The algorithm is implemented using the OpenMP API and experiments performed using machines with multicore processors. The results show a performance gain by more than a factor of 3 when running on the 8-core configuration of the machine.
Hashing algorithms are used extensively in information security and digital forensics applications. This paper presents an efficient parallel algorithm hash computation. It's a modification of the SHA-1 algorithm for faster parallel implementation in applications such as the digital signature and data preservation in digital forensics. The algorithm implements recursive hash to break the chain dependencies of the standard hash function. We discuss the theoretical foundation for the work including the collision probability and the performance implications. The algorithm is implemented using the OpenMP API and experiments performed using machines with multicore processors. The results show a performance gain by more than a factor of 3 when running on the 8-core configuration of the machine.
Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance measurement units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems.