RAT: A Lightweight System-Level Soft Error Mitigation Technique

Title: RAT: A Lightweight System-Level Soft Error Mitigation Technique
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Gava, Jonas; Reis, Ricardo; Ost, Luciano
Conference Name: 2020 IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC)
Date Published: October
Keywords: Fault tolerance, Metrics, mitigation technique, multicore, multicore computing security, Multicore processing, pubcrawl, Redundancy, Registers, reliability, resilience, Resiliency, Safety, Scalability, soft error reliability, Tools, Very large scale integration
Abstract: To achieve a substantial reliability and safety level, it is imperative to provide electronic computing systems with appropriate mechanisms to tackle soft errors. This paper proposes a low-cost system-level soft error mitigation technique, which allocates the critical application function to a pool of specific general-purpose processor registers. Both the critical function and the register pool are automatically selected by a developed profiling tool. The proposed technique was validated through more than 320K fault injections considering a Linux kernel, different benchmarks and two multicore ARM processors. Results show that our technique significantly reduces the code size and performance overheads while providing reliability improvement, w.r.t. the Triple Modular Redundancy (TMR) technique.
DOI: 10.1109/VLSI-SOC46417.2020.9344080
Citation Key: gava_rat_2020