Biblio

Filters: Keyword is Fault tolerant systems
2023-05-26
Basan, Elena, Mikhailova, Vasilisa, Shulika, Maria.  2022.  Exploring Security Testing Methods for Cyber-Physical Systems. 2022 International Siberian Conference on Control and Communications (SIBCON). :1–7.
A methodology for studying the level of security for various types of CPS through the analysis of the consequences was developed during the research process. An analysis of the architecture of cyber-physical systems was carried out, vulnerabilities and threats of specific devices were identified, a list of possible information attacks and their consequences after the exploitation of vulnerabilities was identified. The object of research is models of cyber-physical systems, including IoT devices, microcomputers, various sensors that function through communication channels, organized by cyber-physical objects. The main subjects of this investigation are methods and means of security testing of cyber-physical systems (CPS). The main objective of this investigation is to update the problem of security in cyber-physical systems, to analyze the security of these systems. In practice, the testing methodology for the cyber-physical system “Smart Factory” was implemented, which simulates the operation of a real CPS, with different types of links and protocols used.
2023-04-28
Mohammadi, Neda, Rasoolzadegan, Abbas.  2022.  A Pattern-aware Design and Implementation Guideline for Microservice-based Systems. 2022 27th International Computer Conference, Computer Society of Iran (CSICC). :1–6.
Nowadays, microservice architecture is known as a successful and promising architecture for smart city applications. Applying microservices in the design and implementation of systems has many advantages, such as autonomy, loose coupling, composability, scalability, and fault tolerance. However, the complexity of calls between microservices leads to problems in security, accessibility, and data management during system execution. To address these challenges, in recent years various researchers and developers have focused on the use of microservice patterns in the implementation of microservice-based systems. Microservice patterns are the result of developers’ successful experiences in addressing common challenges in microservice-based systems. However, no guideline has yet been provided for an in-depth understanding of microservice patterns and how to apply them to real systems. The purpose of this paper is to investigate in detail the most widely used and important microservice patterns in order to analyze the function of each pattern, extract the behavioral signatures, and construct a service dependency graph for them, so that researchers and enthusiasts can use the provided guideline to create a microservice-based system equipped with design patterns. To construct the proposed guideline, five real open source projects have been carefully investigated and analyzed, and the results obtained have been used in the process of making the guideline.
2023-03-31
Yuan, Dandan, Cui, Shujie, Russello, Giovanni.  2022.  We Can Make Mistakes: Fault-tolerant Forward Private Verifiable Dynamic Searchable Symmetric Encryption. 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P). :587–605.
Verifiable Dynamic Searchable Symmetric Encryption (VDSSE) enables users to securely outsource databases (document sets) to cloud servers and perform searches and updates. The verifiability property prevents users from accepting incorrect search results returned by a malicious server. However, we discover that the community currently focuses only on preventing malicious behavior from the server and ignores incorrect updates from the client, which are very likely to happen since there is no record on the client to check. Indeed, most existing VDSSE schemes cannot tolerate incorrect updates from the client. For instance, deleting a nonexistent keyword-identifier pair can break their correctness and soundness. In this paper, we demonstrate the vulnerabilities of a type of existing VDSSE scheme that cause it to lose its correctness and soundness properties on incorrect updates. We propose an efficient fault-tolerant solution that can treat any DSSE scheme as a black box and turn it into a fault-tolerant VDSSE in the malicious model. Forward privacy is an important property of DSSE that prevents the server from linking an update operation to previous search queries. Our approach can also turn any forward secure DSSE scheme into a fault-tolerant VDSSE without breaking the forward security guarantee. In this work, we take FAST [1] (TDSC 2020), a forward secure DSSE, as an example, implement a prototype of our solution, and evaluate its performance. Even when compared with the previous fastest forward private construction that does not support fault tolerance, the experiments show that our construction saves 9× client storage and has better search and update efficiency.
2023-02-03
Wang, Yingsen, Li, Yixiao, Zhao, Juanjuan, Wang, Guibin, Jiao, Weihan, Qiang, Yan, Li, Keqin.  2022.  A Fast and Secured Peer-to-Peer Energy Trading Using Blockchain Consensus. 2022 IEEE Industry Applications Society Annual Meeting (IAS). :1–8.
The architecture and functioning of the electricity markets are rapidly evolving in favour of solutions based on real-time data sharing and decentralised, distributed, renewable energy generation. Peer-to-peer (P2P) energy markets allow two individuals to transact with one another without the need of intermediaries, reducing the load on the power grid during peak hours. However, such a P2P energy market is prone to various cyber attacks. Blockchain technology has been proposed to implement P2P energy trading to support this change. One of the most crucial components of blockchain technology in energy trading is the consensus mechanism. It determines the effectiveness and security of the blockchain for energy trading. However, most of the consensus used in energy trading today are traditional consensus such as Proof-of-Work (PoW) and Practical Byzantine Fault Tolerance (PBFT). These traditional mechanisms cannot be directly adopted in P2P energy trading due to their huge computational power, low throughput, and high latency. Therefore, we propose the Block Alliance Consensus (BAC) mechanism based on Hashgraph. In a massive P2P energy trading network, BAC can keep Hashgraph's throughput while resisting Sybil attacks and supporting the addition and deletion of energy participants. The high efficiency and security of BAC and the blockchain-based energy trading platform are verified through experiments: our improved BAC has an average throughput that is 2.56 times more than regular BFT, 5 times greater than PoW, and 30% greater than the original BAC. The improved BAC has an average latency that is 41% less than BAC and 81% less than original BFT. Our energy trading blockchain (ETB)'s READ performance can achieve the most outstanding throughput of 1192 tps at a workload of 1200 tps, while WRITE can achieve 682 tps at a workload of 800 tps with a success rate of 95% and 0.18 seconds of latency.
ISSN: 2576-702X
2023-01-20
Boni, Mounika, Ch, Tharakeswari, Alamanda, Swathi, Arasada, Bhaskara Venkata Sai Gayath, Maria, Azees.  2022.  An Efficient and Secure Anonymous Authentication Scheme for V2G Networks. 2022 6th International Conference on Devices, Circuits and Systems (ICDCS). :432–436.

The vehicle-to-grid (V2G) network has a clear advantage in terms of economic benefits, and it has attracted the interest of power grid and electric vehicle (EV) consumers. Many current V2G techniques, for example, use bilinear pairing to execute the authentication scheme, which results in significant computational costs. Furthermore, in the existing V2G techniques, the system master key is issued independently by third parties, so it is vulnerable to leaking if the third party is compromised by an attacker. To overcome this issue, this paper presents an efficient and secure anonymous authentication scheme for V2G networks that uses a lightweight authentication system for electric vehicles and smart grids. In the proposed technique, the keys are generated by the trusted authority after the successful registration of EVs in the trusted authority and the dispatching center. The suggested scheme not only enhances the verification performance of V2G networks but also protects against inbuilt hackers.

Himdi, Tarik, Ishaque, Mohammed, Ikram, Muhammed Jawad.  2022.  Cyber Security Challenges in Distributed Energy Resources for Smart Cities. 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom). :788–792.

With the proliferation of data in Internet-related applications, incidences of cyber security have increased manyfold. Energy management, which is one of the smart city layers, has also been experiencing cyberattacks. Furthermore, the Distributed Energy Resources (DER), which depend on different controllers to provide energy to the main physical smart grid of a smart city, is prone to cyberattacks. The increased cyber-attacks on DER systems are mainly because of its dependency on digital communication and controls as there is an increase in the number of devices owned and controlled by consumers and third parties. This paper analyzes the major cyber security and privacy challenges that might inflict, damage or compromise the DER and related controllers in smart cities. These challenges highlight that the security and privacy on the Internet of Things (IoT), big data, artificial intelligence, and smart grid, which are the building blocks of a smart city, must be addressed in the DER sector. It is observed that the security and privacy challenges in smart cities can be solved through the distributed framework, by identifying and classifying stakeholders, using appropriate model, and by incorporating fault-tolerance techniques.

2023-01-06
Alkoudsi, Mohammad Ibrahim, Fohler, Gerhard, Völp, Marcus.  2022.  Tolerating Resource Exhaustion Attacks in the Time-Triggered Architecture. 2022 XII Brazilian Symposium on Computing Systems Engineering (SBESC). :1–8.
The Time-Triggered Architecture (TTA) presents a blueprint for building safe and real-time constrained distributed systems, based on a set of orthogonal concepts that make extensive use of the availability of a globally consistent notion of time and a priori knowledge of events. Although the TTA tolerates arbitrary failures of any of its nodes by architectural means (active node replication, a membership service, and bus guardians), the design of these means considers only accidental faults. However, distributed safety- and real-time critical systems have been emerging into more open and interconnected systems, operating autonomously for prolonged times and interfacing with other possibly non-real-time systems. Therefore, the existence of vulnerabilities that adversaries may exploit to compromise system safety cannot be ruled out. In this paper, we discuss potential targeted attacks capable of bypassing TTA's fault-tolerance mechanisms and demonstrate how two well-known recovery techniques - proactive and reactive rejuvenation - can be incorporated into TTA to reduce the window of vulnerability for attacks without introducing extensive and costly changes.
Bogatyrev, Vladimir A., Bogatyrev, Stanislav V., Bogatyrev, Anatoly V..  2022.  Reliability and Timeliness of Servicing Requests in Infocommunication Systems, Taking into Account the Physical and Information Recovery of Redundant Storage Devices. 2022 International Conference on Information, Control, and Communication Technologies (ICCT). :1–4.
Markov models of reliability of fault-tolerant computer systems are proposed, taking into account two stages of recovery of redundant memory devices. At the first stage, the physical recovery of memory devices is implemented, and at the second, the informational one consists in entering the data necessary to perform the required functions. Memory redundancy is carried out to increase the stability of the system to the loss of unique data generated during the operation of the system. Data replication is implemented in all functional memory devices. Information recovery is carried out using replicas of data stored in working memory devices. The model takes into account the criticality of the system to the timeliness of calculations in real time and to the impossibility of restoring information after multiple memory failures, leading to the loss of all stored replicas of unique data. The system readiness coefficient and the probability of its transition to a non-recoverable state are determined. The readiness of the system for the timely execution of requests is evaluated, taking into account the influence of the shares of the distribution of the performance of the computer allocated for the maintenance of requests and for the entry of information into memory after its physical recovery.
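As a rough illustration of the two-stage recovery idea in the abstract above, the sketch below computes the steady-state availability of a single device that cycles through three states: working, physical recovery, and information recovery. It is a simplified stand-in and not the authors' model: the redundancy, replica loss, and timeliness aspects are left out, and the function name and rate parameters are hypothetical.

```python
def steady_state_availability(failure_rate, physical_repair_rate, info_restore_rate):
    """Availability of a cyclic 3-state Markov chain (assumed, simplified model).

    States: working -> physical recovery -> information recovery -> working.
    In a cyclic continuous-time chain, the steady-state probability of each
    state is proportional to its mean holding time (1 / exit rate).
    """
    for rate in (failure_rate, physical_repair_rate, info_restore_rate):
        if rate <= 0:
            raise ValueError("all rates must be positive")
    mean_up = 1.0 / failure_rate                # mean time in the working state
    mean_physical = 1.0 / physical_repair_rate  # mean physical recovery time
    mean_info = 1.0 / info_restore_rate         # mean information recovery time
    return mean_up / (mean_up + mean_physical + mean_info)


if __name__ == "__main__":
    # Hypothetical rates per hour: one failure per 1000 h, 10 h physical
    # recovery, 2 h information recovery.
    print(steady_state_availability(0.001, 0.1, 0.5))
```

The second recovery stage lowers availability even when physical repair is fast, which is why the abstract treats information recovery as a separate stage.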
2022-12-09
Liu, Chun, Shi, Yue.  2022.  Anti-attack Fault-tolerant Control of Multi-agent Systems with Complicated Actuator Faults and Cyber Attacks. 2022 5th International Symposium on Autonomous Systems (ISAS). :1–5.
This study addresses the coordination issue of multi-agent systems under complicated actuator faults and cyber attacks. A distributed fault-tolerant design is developed with the estimated and output neighboring information in a decentralized estimation observer. Criteria for reaching the exponential coordination of multi-agent systems with cyber attacks are obtained with the average dwelling time and chattering bound method. Simulations validate the efficiency of the anti-attack fault-tolerant design.
2022-12-06
Han, May Pyone, Htet, Soe Ye, Wuttisttikulkij, Lunchakorn.  2022.  Hybrid GNS3 and Mininet-WiFi Emulator for SDN Backbone Network Supporting Wireless IoT Traffic. 2022 37th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC). :768–771.

In the IoT (Internet of Things) domain, it is still a challenge to modify the routing behavior of IoT traffic at the decentralized backbone network. In this paper, centralized and flexible software-defined networking (SDN) is utilized to route the IoT traffic. The management of IoT data transmission through the SDN core network gives the chance to choose the path with the lowest delay, minimum packet loss, or hops. Therefore, fault-tolerant delay awareness routing is proposed for the emulated SDN-based backbone network to handle delay-sensitive IoT traffic. Besides, the hybrid form of GNS3 and Mininet-WiFi emulation is introduced to collaborate the SDN-based backbone network in GNS3 and the 6LoWPAN (IPv6 over Low Power Personal Area Network) sensor network in Mininet-WiFi.
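The routing idea in the abstract above (pick the lowest-delay path and route around failures) can be sketched with Dijkstra's algorithm over per-link delays. This is only an assumed illustration of what an SDN controller might compute; the function name, the link-table format, and the switch names are hypothetical and do not come from the paper.

```python
import heapq


def lowest_delay_path(links, source, target, failed=frozenset()):
    """Return (path, total_delay) for the lowest-delay route, skipping failed links.

    links: dict mapping (u, v) -> delay, treated as undirected (assumed format).
    failed: set of (u, v) pairs currently considered down.
    Returns (None, inf) when the target is unreachable.
    """
    graph = {}
    for (u, v), delay in links.items():
        if (u, v) in failed or (v, u) in failed:
            continue  # fault tolerance: ignore links reported as failed
        graph.setdefault(u, []).append((v, delay))
        graph.setdefault(v, []).append((u, delay))

    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, delay in graph.get(node, []):
            candidate = dist + delay
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]


if __name__ == "__main__":
    # Hypothetical four-switch backbone with delays in milliseconds.
    links = {("s1", "s2"): 2, ("s2", "s4"): 2, ("s1", "s3"): 5, ("s3", "s4"): 1}
    print(lowest_delay_path(links, "s1", "s4"))
    print(lowest_delay_path(links, "s1", "s4", failed={("s2", "s4")}))
```

Recomputing the path with the failed link excluded is the simplest form of rerouting; a real controller would also need link-state monitoring and flow-rule updates, which are outside this sketch.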

2022-11-18
Spyrou, Theofilos, El-Sayed, Sarah A., Afacan, Engin, Camuñas-Mesa, Luis A., Linares-Barranco, Bernabé, Stratigopoulos, Haralampos-G..  2021.  Neuron Fault Tolerance in Spiking Neural Networks. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE). :743–748.
The error-resiliency of Artificial Intelligence (AI) hardware accelerators is a major concern, especially when they are deployed in mission-critical and safety-critical applications. In this paper, we propose a neuron fault tolerance strategy for Spiking Neural Networks (SNNs). It is optimized for low area and power overhead by leveraging observations made from a large-scale fault injection experiment that pinpoints the critical fault types and locations. We describe the fault modeling approach, the fault injection framework, the results of the fault injection experiment, the fault-tolerance strategy, and the fault-tolerant SNN architecture. The idea is demonstrated on two SNNs that we designed for two SNN-oriented datasets, namely the N-MNIST and IBM's DVS128 gesture datasets.
2022-07-01
Owoade, Ayoade Akeem, Osunmakinde, Isaac Olusegun.  2021.  Fault-tolerance to Cascaded Link Failures of Video Traffic on Attacked Wireless Networks. 2021 IST-Africa Conference (IST-Africa). :1–11.
Research has been conducted on wireless network single link failures. However, cascaded link failures due to fraudulent attacks have not received enough attention, whereas this requires solutions. This research developed an enhanced genetic algorithm (EGA) focused on capacity efficiency and fast restoration to rapidly resolve link-link failures. On complex nodes network, this fault-tolerant model was tested for such failures. Optimal alternative routes and the bandwidth required for quick rerouting of video traffic were generated by the proposed model. Increasing cascaded link failures increases bandwidth usage and causes transmission delay, which slows down video traffic routing. The proposed model outperformed popular Dijkstra models, in terms of time complexity. The survived solution paths demonstrate that the proposed model works well in maintaining connectivity despite cascaded link failures and would therefore be extremely useful in pandemic periods on emergency matters. The proposed technology is feasible for current business applications that require high-speed broadband networks.
2022-06-09
Yu, Siyu, Chen, Ningjiang, Liang, Birui.  2021.  Predicting gray fault based on context graph in container-based cloud. 2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). :224–234.
Distributed container-based cloud systems have the advantages of rapid deployment, efficient virtualization, simplified configuration, and good scalability. However, good scalability may slow down a container-based cloud because it makes the cloud more vulnerable to gray faults. As a new fault model similar to fail-slow and limping, gray fault has so many root causes that current studies, which focus only on a certain type of fault, are not sufficient. And unlike in a traditional cloud, a container is a black box provided by service providers, making it difficult for traditional API intrusion-based diagnosis methods to be applied. A better approach should shield low-level causes from high-level processing. A Gray Fault Prediction Strategy based on Context Graph is proposed according to the correlation between gray faults and application scenarios. From historical data, the performance metrics related to how the above context evolves into fault scenarios are established, and scenarios represented by the corresponding data are stored in a graph. A scenario is predicted to be a fault scenario if an isomorphic scenario is found in the graph. The experimental results show that the success rate of prediction is stable at more than 90%, and it is verified that the overhead is well optimized.
Trestioreanu, Lucian, Nita-Rotaru, Cristina, Malhotra, Aanchal, State, Radu.  2021.  SPON: Enabling Resilient Inter-Ledgers Payments with an Intrusion-Tolerant Overlay. 2021 IEEE Conference on Communications and Network Security (CNS). :92–100.
Payment systems are a critical component of everyday life in our society. While in many situations payments are still slow, opaque, siloed, expensive or even fail, users expect them to be fast, transparent, cheap, reliable and global. Recent technologies such as distributed ledgers create opportunities for near-real-time, cheaper and more transparent payments. However, in order to achieve a global payment system, payments should be possible not only within one ledger, but also across different ledgers and geographies. In this paper we propose Secure Payments with Overlay Networks (SPON), a service that enables global payments across multiple ledgers by combining the transaction exchange provided by the Interledger protocol with an intrusion-tolerant overlay of relay nodes to achieve (1) improved payment latency, (2) fault-tolerance to benign failures such as node failures and network partitions, and (3) resilience to BGP hijacking attacks. We discuss the design goals and present an implementation based on the Interledger protocol and Spines overlay network. We analyze the resilience of SPON and demonstrate through experimental evaluation that it is able to improve payment latency, recover from path outages, withstand network partition attacks, and disseminate payments fairly across multiple ledgers. We also show how SPON can be deployed to make the communication between different ledgers resilient to BGP hijacking attacks.
Khan, Maher, Babay, Amy.  2021.  Toward Intrusion Tolerance as a Service: Confidentiality in Partially Cloud-Based BFT Systems. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :14–25.
Recent work on intrusion-tolerance has shown that resilience to sophisticated network attacks requires system replicas to be deployed across at least three geographically distributed sites. While commodity data centers offer an attractive solution for hosting these sites due to low cost and management overhead, their use raises significant confidentiality concerns: system operators may not want private data or proprietary algorithms exposed to servers outside their direct control. We present a new model for Byzantine Fault Tolerant replicated systems that moves toward “intrusion tolerance as a service”. Under this model, application logic and data are only exposed to servers hosted on the system operator's premises. Additional offsite servers hosted in data centers can support the needed resilience without executing application logic or accessing unencrypted state. We have implemented this approach in the open-source Spire system, and our evaluation shows that the performance overhead of providing confidentiality can be less than 4% in terms of latency.
2022-05-24
Liu, Yizhong, Xia, Yu, Liu, Jianwei, Hei, Yiming.  2021.  A Secure and Decentralized Reconfiguration Protocol For Sharding Blockchains. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :111–116.
Most present reconfiguration methods in sharding blockchains rely on a secure randomness, whose generation might be complicated. Besides, a reference committee is usually in charge of the reconfiguration, making the process not decentralized. To address the above issues, this paper proposes a secure and decentralized shard reconfiguration protocol, which allows each shard to complete the selection and confirmation of its own shard members in turn. The PoW mining puzzle is calculated using the public key hash value in the member list confirmed by the last shard. Through the mining and shard member list commitment process, each shard can update its members safely and efficiently once in a while. Furthermore, it is proved that our protocol satisfies the safety, consistency, liveness, and decentralization properties. The honest member proportion in each confirmed shard member list is guaranteed to exceed a certain safety threshold, and all honest nodes have an identical view on the list. The reconfiguration is ensured to make progress, and each node has the same right to participate in the process. Our secure and decentralized shard reconfiguration protocol could be applied to all committee-based sharding blockchains.
2022-05-19
Wu, Peiyan, Chen, Wenbin, Wu, Hualin, Qi, Ke, Liu, Miao.  2021.  Enhanced Game Theoretical Spectrum Sharing Method Based on Blockchain Consensus. 2021 IEEE 94th Vehicular Technology Conference (VTC2021-Fall). :1–7.
The limited spectrum resources need to provide safe and efficient spectrum service for the intensive users. Malicious spectrum work nodes will affect the normal operation of the entire system. Using the blockchain model, the consensus algorithm Praft, based on an optimized Raft, solves the consensus problem in a Byzantine environment. Message digital signatures give the spectrum node some fault tolerance and tamper resistance. Spectrum sharing among spectrum nodes is carried out in combination with game theory. The existing game theoretical algorithm does not consider the influence of spectrum occupancy of primary users and cognitive users on primary users' utility and enthusiasm at the same time. We elicit a reinforcement factor and analyze its effect on strategy performance. This scheme optimizes the previous strategy so that the profits of spectrum nodes are improved and a good Nash equilibrium is shown, while Praft solves the Byzantine problem left by Raft.
2022-03-01
Wang, Weidong, Zheng, Yufu, Bao, Yeling, Shui, Shengkun, Jiang, Tao.  2021.  Modulated Signal Recognition Based on Feature-Multiplexed Convolutional Neural Networks. 2021 IEEE 2nd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA). 2:621–624.
Modulated signal identification plays a crucial role in both military reconnaissance and civilian signal regulation. Traditionally, modulated signal identification is based on high-order statistics, but this approach has many drawbacks. With the development of deep learning, its advantages are fully exploited by combining it with modulated signals to avoid the complex process of computing a priori knowledge while having good fault tolerance. In this paper, ten digital modulated signals are classified and recognized, and improvements are made on the basis of convolutional neural networks, using feature reuse to increase the depth of the convolutional layer and extract signal features with better results. After experimental analysis, the recognition accuracy increases with the rise of the signal-to-noise ratio, and can reach 90% and above when the signal-to-noise ratio is 30dB.
2021-12-20
Silva, Douglas Simões, Graczyk, Rafal, Decouchant, Jérémie, Völp, Marcus, Esteves-Verissimo, Paulo.  2021.  Threat Adaptive Byzantine Fault Tolerant State-Machine Replication. 2021 40th International Symposium on Reliable Distributed Systems (SRDS). :78–87.
Critical infrastructures have to withstand advanced and persistent threats, which can be addressed using Byzantine fault tolerant state-machine replication (BFT-SMR). In practice, unattended cyberdefense systems rely on threat level detectors that synchronously inform them of changing threat levels. However, to have a BFT-SMR protocol operate unattended, the state-of-the-art is still to configure them to withstand the highest possible number of faulty replicas $f$ they might encounter, which limits their performance, or to make the strong assumption that a trusted external reconfiguration service is available, which introduces a single point of failure. In this work, we present ThreatAdaptive, the first BFT-SMR protocol that is automatically strengthened or optimized by its replicas in reaction to threat level changes. We first determine under which conditions replicas can safely reconfigure a BFT-SMR system, i.e., adapt the number of replicas $n$ and the fault threshold $f$ so as to outpace an adversary. Since replicas typically communicate with each other using an asynchronous network, they cannot rely on consensus to decide how the system should be reconfigured. ThreatAdaptive avoids this pitfall by proactively preparing the reconfiguration that may be triggered by an increasing threat when it optimizes its performance. Our evaluation shows that ThreatAdaptive can meet the latency and throughput of BFT baselines configured statically for a particular level of threat, and adapt 30% faster than previous methods, which make stronger assumptions to provide safety.
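For readers unfamiliar with how the number of replicas n and the fault threshold f relate, the sketch below encodes the classical sizing rule for PBFT-style protocols, n ≥ 3f + 1, together with the quorum size that makes any two quorums overlap in at least one correct replica. This is general background and an assumption on our part; it does not reproduce ThreatAdaptive's reconfiguration conditions, and the function names are hypothetical.

```python
def min_replicas(f):
    """Smallest n that tolerates f Byzantine replicas under the classical bound n >= 3f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_tolerated_faults(n):
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def quorum_size(n, f):
    """Smallest quorum q with 2q - n >= f + 1, i.e. q = ceil((n + f + 1) / 2).

    Any two quorums of this size then share at least f + 1 replicas,
    so at least one shared replica is correct.
    """
    if n < min_replicas(f):
        raise ValueError("n is too small to tolerate f Byzantine faults")
    return (n + f + 2) // 2


if __name__ == "__main__":
    # Raising the threat level from f = 1 to f = 2 grows the system from 4 to 7 replicas.
    for f in (1, 2):
        n = min_replicas(f)
        print(f, n, quorum_size(n, f))
```

The jump in replica count and quorum size as f grows is the performance cost that the abstract refers to when a system is statically configured for the highest threat level.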
Najafi, Maryam, Khoukhi, Lyes, Lemercier, Marc.  2021.  A Multidimensional Trust Model for Vehicular Ad-Hoc Networks. 2021 IEEE 46th Conference on Local Computer Networks (LCN). :419–422.
In this paper, we propose a multidimensional trust model for vehicular networks. Our model evaluates the trustworthiness of each vehicle using two main modes: 1) Direct Trust Computation (DTC), related to a direct connection between source and target nodes, and 2) Indirect Trust Computation (ITC), related to indirect communication between source and target nodes. The principal characteristics of this model are flexibility and high fault tolerance, thanks to an automatic trust score assessment. In our extensive simulations, we use Total Cost Rate to affirm the performance of the proposed trust model.
2021-09-01
Wang, Zizhong, Wang, Haixia, Shao, Airan, Wang, Dongsheng.  2020.  An Adaptive Erasure-Coded Storage Scheme with an Efficient Code-Switching Algorithm. 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). :1177–1178.
Using erasure codes increases consumption of network traffic and disk I/O tremendously when systems recover data, resulting in high latency of degraded reads. In order to mitigate this problem, we present an adaptive storage scheme based on data access skew, a fact that most data accesses are applied in a small fraction of data. In this scheme, we use both Local Reconstruction Code (LRC), whose recovery cost is low, to store frequently accessed data, and Hitchhiker (HH) code, which guarantees minimum storage cost, to store infrequently accessed data. Besides, an efficient switching algorithm between LRC and HH code with low network and computation costs is provided. The whole system will benefit from low degraded read latency while keeping a low storage overhead, and code-switching will not become a bottleneck.
2021-08-17
Jaiswal, Ayshwarya, Dwivedi, Vijay Kumar, Yadav, Om Prakash.  2020.  Big Data and its Analyzing Tools : A Perspective. 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS). :560–565.
Data are generated and stored in databases at a very high speed, and hence they need to be handled and analyzed properly. Nowadays industries are extensively using Hadoop and Spark to analyze datasets. Both frameworks are used to increase processing speed when computing on huge, complex datasets. Many researchers are comparing the two. The big questions that arise are: Is Spark a substitute for Hadoop? Is Hadoop going to be replaced by Spark in the near future? Spark is “built on top of” Hadoop, and it extends the model to support more types of computation, including stream processing and interactive queries. No doubt, Spark's execution speed is much faster than Hadoop's, but in terms of fault tolerance, Hadoop is slightly more fault tolerant than Spark. In this article, various big data analytics tools are compared, and Hadoop and Spark are discussed in detail. The article further gives an overview of big data, Spark, and Hadoop issues. In this survey paper, the approaches to resolving the issues of Spark and Hadoop are discussed in detail.
2021-08-11
Hossain, Md. Sajjad, Bushra Islam, Fabliha, Ifeanyi Nwakanma, Cosmas, Min Lee, Jae, Kim, Dong-Seong.  2020.  Decentralized Latency-aware Edge Node Grouping with Fault Tolerance for Internet of Battlefield Things. 2020 International Conference on Information and Communication Technology Convergence (ICTC). :420–423.
In this paper, we focus on the recent trend in the military domain of bringing the Internet of Things (IoT) to the battlefield to improve effectiveness, known as the Internet of Battlefield Things (IoBT). Due to the requirements of high computing capability and minimum response time with minimum fault tolerance, this paper proposes a decentralized IoBT architecture. The proposed method can increase reliability in the battlefield environment by searching for the reliable nodes among all the edge nodes in the environment, and adding fault tolerance to the edge nodes increases the effectiveness of the overall battlefield scenario. The suggested fault tolerance approach is suited to decentralized mode, handling latency requirements while maintaining the task reliability of the battlefield. Our experimental results confirm the effectiveness of the proposed approach and show that it meets the requirements of the latency-aware military field while ensuring the overall reliability of the network.
2021-07-27
Loreti, Daniela, Artioli, Marcello, Ciampolini, Anna.  2020.  Solving Linear Systems on High Performance Hardware with Resilience to Multiple Hard Faults. 2020 International Symposium on Reliable Distributed Systems (SRDS). :266–275.
As large-scale linear equation systems are pervasive in many scientific fields, great efforts have been made over the last decade to realize efficient techniques for solving such systems, possibly relying on High Performance Computing (HPC) infrastructures to boost performance. In this framework, the ever-growing scale of supercomputers inevitably increases the frequency of faults, making fault tolerance a crucial issue in HPC application development. A previous study [1] investigated the possibility of enhancing the Inhibition Method (IMe), a linear systems solver for dense unstructured matrices, with fault tolerance to single hard errors, i.e. failures causing one computing processor to stop. This article extends [1] by proposing an efficient technique to obtain fault tolerance to multiple hard errors, which may occur concurrently on different processors belonging to the same or different machines. An improved parallel implementation is also proposed, which is particularly suitable for HPC environments and moves in the direction of complete decentralization. The theoretical analysis suggests that the technique (which requires neither checkpointing nor rollback) is able to provide fault tolerance to multiple faults at the price of a small overhead and a limited number of additional processors to store the checksums. Experimental results on an HPC architecture validate the theoretical study, showing promising performance improvements w.r.t. a popular fault-tolerant solving technique.
2021-06-24
Jang, Dongsoo, Shin, Michael, Pathirage, Don.  2020.  Security Fault Tolerance for Access Control. 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion (ACSOS-C). :212–217.
This paper describes an approach to the security fault tolerance of access control in which the security breaches of an access control are tolerated by means of a security fault tolerant (SFT) access control. Though an access control is securely designed and implemented, it can contain faults in development or be contaminated in operation. The threats to an access control are analyzed to identify possible security breaches. To tolerate the security breaches, an SFT access control is made to be semantically identical to an access control. Our approach is described using role-based access control (RBAC) and extended access control list (EACL). A healthcare system is used to demonstrate our approach.