Biblio

Filters: Keyword is resilient computing
2021-12-20
Silva, Douglas Simões, Graczyk, Rafal, Decouchant, Jérémie, Völp, Marcus, Esteves-Verissimo, Paulo.  2021.  Threat Adaptive Byzantine Fault Tolerant State-Machine Replication. 2021 40th International Symposium on Reliable Distributed Systems (SRDS). :78–87.
Critical infrastructures have to withstand advanced and persistent threats, which can be addressed using Byzantine fault tolerant state-machine replication (BFT-SMR). In practice, unattended cyberdefense systems rely on threat level detectors that synchronously inform them of changing threat levels. However, to have BFT-SMR protocols operate unattended, the state of the art is still to configure them to withstand the highest possible number of faulty replicas $f$ they might encounter, which limits their performance, or to make the strong assumption that a trusted external reconfiguration service is available, which introduces a single point of failure. In this work, we present ThreatAdaptive, the first BFT-SMR protocol that is automatically strengthened or optimized by its replicas in reaction to threat level changes. We first determine under which conditions replicas can safely reconfigure a BFT-SMR system, i.e., adapt the number of replicas $n$ and the fault threshold $f$ so as to outpace an adversary. Since replicas typically communicate with each other using an asynchronous network, they cannot rely on consensus to decide how the system should be reconfigured. ThreatAdaptive avoids this pitfall by proactively preparing the reconfiguration that may be triggered by an increasing threat when it optimizes its performance. Our evaluation shows that ThreatAdaptive can meet the latency and throughput of BFT baselines configured statically for a particular level of threat, and adapt 30% faster than previous methods, which make stronger assumptions to provide safety.
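As a rough illustration of why configuring statically for the highest fault threshold is costly, the sketch below assumes the classic BFT bound of n >= 3f + 1 replicas with quorums of 2f + 1 (as in PBFT-style protocols). The abstract does not state which bound ThreatAdaptive uses, and the function names here are hypothetical.

```python
def min_replicas(f: int) -> int:
    """Smallest n that tolerates f Byzantine replicas, ASSUMING n >= 3f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def quorum_size(f: int) -> int:
    """Quorum size 2f + 1 for n = 3f + 1 replicas (assumed, PBFT-style)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


# Each extra tolerated fault adds three replicas and two quorum members,
# so a higher static threshold means more messages per agreement round.
for f in range(1, 4):
    print(f"f={f}: n={min_replicas(f)}, quorum={quorum_size(f)}")
```

Under this assumption, lowering the fault threshold when the threat level drops shrinks both the replica set and the quorums, which is the kind of optimization the abstract describes.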
2020-10-06
Drozd, Oleksandr, Kharchenko, Vyacheslav, Rucinski, Andrzej, Kochanski, Thaddeus, Garbos, Raymond, Maevsky, Dmitry.  2019.  Development of Models in Resilient Computing. 2019 10th International Conference on Dependable Systems, Services and Technologies (DESSERT). :1–6.

The article analyzes the concept of "Resilience" in relation to the development of computing. The strategy for reacting to perturbations in this process can be based either on "harsh Resistance" or "smarter Elasticity." Our "Models" are descriptive in defining the path of evolutionary development as structuring under the perturbations of the natural order and enable the analysis of the relationship among models, structures and factors of evolution. Among those, two features are critical: parallelism and "fuzziness", which to a large extent determine the rate of change of computing development, especially in critical applications. Both reversible and irreversible development processes related to elastic and resistant methods of problem solving are discussed. The sources of perturbations are located in the vicinity of the resource boundaries, related to growing problem size with progress combined with the lack of computational "checkability" of resources, i.e., data with inadequate models, methodologies and means. As a case study, the problem of hidden faults caused by the growth, the deficit of resources, and the checkability of digital circuits in critical applications is analyzed.

2019-03-06
Fargo, F., Sury, S..  2018.  Autonomic Secure HPC Fabric Architecture. 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA). :1–4.

Cloud computing is the major paradigm in today's IT world, with the capabilities of security management, high performance, flexibility, and scalability. Customers valuing these features can benefit more if they use a cloud environment built using HPC fabric architecture. However, security is still a major concern, not only on the software side but also on the hardware side. Multiple studies show that malicious users can affect regular customers through the hardware if they are co-located on the same physical system. Therefore, solving possible security concerns on the HPC fabric architecture will clearly make the fabric an industry leader in this area. In this paper, we propose an autonomic HPC fabric architecture that leverages both resilient computing capabilities and adaptive anomaly analysis for further security.