Investigating the Impact of High-Level Software Design on Low-Level Hardware Fault Resilience
- Resource Type
- Conference
- Authors
- Zhang, Bohan; Yang, Lishan; Li, Guanpeng; Xu, Hui
- Source
- 2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S), pp. 163-167, Jun. 2023
- Subject
- Aerospace
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
Robotics and Control Systems
Transportation
Software design
Software algorithms
Writing
Reliability engineering
Software
Software reliability
Safety
Silent Data Corruption
Error Resilience
Fault Injection
SDC
Program Analysis
Software Testing
- Language
- ISSN
- 2833-292X
Silent Data Corruptions (SDCs) have become an insurmountable issue that threatens system reliability. General strategies for protecting programs from SDCs, such as dual modular redundancy, incur intolerable overheads. Another strategy, Algorithm-Based Fault Tolerance, is tightly bound to a specific algorithm and hard to generalize. In this study, we find that different implementations of the same algorithm may lead to very different SDC probabilities. We conduct a characterization study to quantify the differences and investigate their root causes. The insights we derive could help guide developers in the software engineering domain to design programs that are naturally resilient.
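To make the central observation concrete, the following is a minimal illustrative sketch and not the authors' methodology or tooling. It assumes a simple fault model (one bit flip in the loop's state variable, injected exhaustively at every step and bit position) and compares two hypothetical implementations of the same algorithm, finding the maximum of a list: `max_by_value` keeps the running maximum itself, while `max_by_index` keeps the index of the maximum. The names `max_by_value`, `max_by_index`, `campaign`, and `DATA` are invented for illustration. Each faulty run is classified as masked (output unchanged), SDC (wrong output, no error), or crash (exception raised).

```python
from collections import Counter

# Hypothetical input: 16 values that each fit in 8 bits.
DATA = [23, 87, 5, 142, 66, 201, 14, 98, 37, 176, 59, 120, 8, 233, 71, 44]


def max_by_value(data, fault=None):
    """Linear scan whose state is the running maximum value.

    fault=(step, bit) flips one bit of the state just before `step`.
    """
    best = data[0]
    for i in range(1, len(data)):
        if fault is not None and fault[0] == i:
            best ^= 1 << fault[1]          # injected bit flip (assumed fault model)
        if data[i] > best:
            best = data[i]
    return best


def max_by_index(data, fault=None):
    """Same algorithm, but the state is the index of the running maximum."""
    best_idx = 0
    for i in range(1, len(data)):
        if fault is not None and fault[0] == i:
            best_idx ^= 1 << fault[1]      # injected bit flip (assumed fault model)
        if data[i] > data[best_idx]:       # raises IndexError if the index left the list
            best_idx = i
    return data[best_idx]


def campaign(impl, data, bits=8):
    """Inject every (step, bit) fault once and count the outcomes."""
    golden = impl(data)                    # fault-free reference output
    outcomes = Counter()
    for step in range(1, len(data)):
        for bit in range(bits):
            try:
                result = impl(data, fault=(step, bit))
            except IndexError:
                outcomes["crash"] += 1     # detected failure, not silent
                continue
            outcomes["masked" if result == golden else "sdc"] += 1
    return outcomes


if __name__ == "__main__":
    for impl in (max_by_value, max_by_index):
        counts = campaign(impl, DATA)
        total = sum(counts.values())
        print(impl.__name__, dict(counts), f"SDC rate = {counts['sdc'] / total:.2%}")
```

Both functions return the same answer when no fault is injected, yet the same bit flip can be masked, silent, or turned into a crash depending on what the state variable represents. The exact counts depend on the input data and on the assumed fault model, so they should not be read as results from the paper.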