Biblio
Two-phase I/O is a well-known strategy for implementing collective MPI-IO functions. It redistributes I/O requests among the calling processes into a form that minimizes the file access costs. As modern parallel computers continue to grow into the exascale era, the communication cost of such request redistribution can quickly overwhelm collective I/O performance. This effect has been observed from parallel jobs that run on multiple compute nodes with a high count of MPI processes on each node. To reduce the communication cost, we present a new design for collective I/O by adding an extra communication layer that performs request aggregation among processes within the same compute nodes. This approach can significantly reduce inter-node communication contention when redistributing the I/O requests. We evaluate the performance and compare it with the original two-phase I/O on Cray XC40 parallel computers (Theta and Cori) with Intel KNL and Haswell processors. Using I/O patterns from two large-scale production applications and an I/O benchmark, we show our proposed method effectively reduces the communication cost and hence maintains the scalability for a large number of processes.
{Unikernel is smaller in size than existing operating systems and can be started and shut down much more quickly and safely, resulting in greater flexibility and security. Since unikernel does not include large modules like the file system in its library to reduce its size, it is common to choose offloading to handle file IO. However, the processing of IO offload of unikernel transfers the file IO command to the proxy of the file server and copies the file IO result of the proxy. This can result in a trade-off of rapid processing, an advantage of unikernel. In this paper, we propose a method to offload file IO and to perform file IO with direct copy from file server to unikernel}.
Cyber-physical systems (CPSs) are implemented in many industrial and embedded control applications. Where these systems are safety-critical, correct and safe behavior is of paramount importance. Malicious attacks on such CPSs can have far-reaching repercussions. For instance, if elements of a power grid behave erratically, physical damage and loss of life could occur. Currently, there is a trend toward increased complexity and connectivity of CPS. However, as this occurs, the potential attack vectors for these systems grow in number, increasing the risk that a given controller might become compromised. In this article, we examine how the dangers of compromised controllers can be mitigated. We propose a novel application of runtime enforcement that can secure the safety of real-world physical systems. Here, we synthesize enforcers to a new hardware architecture within programmable logic controller I/O modules to act as an effective line of defence between the cyber and the physical domains. Our enforcers prevent the physical damage that a compromised control system might be able to perform. To demonstrate the efficacy of our approach, we present several benchmarks, and show that the overhead for each system is extremely minimal.
The Rowhammer bug is a novel micro-architectural security threat, enabling powerful privilege-escalation attacks on various mainstream platforms. It works by actively flipping bits in Dynamic Random Access Memory (DRAM) cells with unprivileged instructions. In order to set up Rowhammer against binaries in the Linux page cache, the Waylaying algorithm has previously been proposed. The Waylaying method stealthily relocates binaries onto exploitable physical addresses without exhausting system memory. However, the proof-of-concept Waylaying algorithm can be easily detected during page cache eviction because of its high disk I/O overhead and long running time. This paper proposes the more advanced Memway algorithm, which improves on Waylaying in terms of both I/O overhead and speed. Running time and disk I/O overhead are reduced by 90% by utilizing Linux tmpfs and inmemory swapping to manage eviction files. Furthermore, by combining Memway with the unprivileged posix fadvise API, the binary relocation step is made 100 times faster. Equipped with our Memway+fadvise relocation scheme, we demonstrate practical Rowhammer attacks that take only 15-200 minutes to covertly relocate a victim binary, and less than 3 seconds to flip the target instruction bit.
Although RAID is a well-known technique to protect data against disk errors, it is vulnerable to silent data corruptions that cannot be detected by disk drives. Existing integrity protection schemes designed for RAID arrays often introduce high I/O overhead. Our key insight is that by properly designing an integrity protection scheme that adapts to the read/write characteristics of storage workloads, the I/O overhead can be significantly mitigated. In view of this, this paper presents a systematic study on I/O-efficient integrity protection against silent data corruptions in RAID arrays. We formalize an integrity checking model, and justify that a large proportion of disk reads can be checked with simpler and more I/O-efficient integrity checking mechanisms. Based on this integrity checking model, we construct two integrity protection schemes that provide complementary performance advantages for storage workloads with different user write sizes. We further propose a quantitative method for choosing between the two schemes in real-world scenarios. Our trace-driven simulation results show that with the appropriate integrity protection scheme, we can reduce the I/O overhead to below 15%.