ARETE: Accurate Error Assessment via Machine Learning-Guided Dynamic-Timing Analysis

IEEE Transactions on Computers (2023)

Nanometer circuits are increasingly prone to timing errors, escalating the need for fault-injection frameworks that accurately evaluate the impact of such errors on applications. In this paper, we propose ARETE, a novel cross-layer fault-injection framework that combines dynamic-binary instrumentation with machine learning-guided dynamic-timing analysis. ARETE enables accurate fault injection into any application by estimating the location of the injected errors via dynamic-timing analysis. To accelerate fault injection, we develop a novel, data-aware, machine learning-based mechanism that dynamically pre-selects the error-prone instructions and applies the costly dynamic-timing analysis only to them. To evaluate ARETE's accuracy, our fully automated toolflow is configured to support fault injection based on detailed post-layout gate-level simulations as well as on existing workload-agnostic error models. Our results for various workloads, including an autonomous-driving library, show that the location and time of the errors injected by ARETE are 89.9% consistent with fault injection based on full gate-level simulation. On average, ARETE executes 84.6x faster than gate-level simulation, at the cost of a 3.4% loss in the accuracy of program output quality estimation. Compared to existing statistical fault-injection tools that are based on workload-agnostic error models, ARETE improves the accuracy of the fault-injection rate and of the output quality estimation by 143.9% and 40.4% on average, respectively.
Circuit faults, Integrated circuit modeling, Delays, Logic gates, Computational modeling, Pipelines, Microarchitecture, Cross-layer fault injection, dynamic binary instrumentation, dynamic timing analysis, fault injection, machine learning, timing error evaluation
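
The pre-selection idea in the abstract (a cheap, data-aware predictor decides which instructions get the costly dynamic-timing analysis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not ARETE's actual implementation: the `Instr` record, the threshold-based `cheap_predictor` (a stand-in for the learned model), and the toy `costly_dta` rule (a stand-in for real dynamic-timing analysis) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical record for one dynamic instruction and its operand values."""
    opcode: str
    a: int
    b: int


def cheap_predictor(ins: Instr) -> bool:
    """Stand-in for the learned, data-aware pre-selector (assumption).

    Flags arithmetic instructions whose larger operand is wider than 16 bits.
    """
    return ins.opcode in {"add", "mul"} and max(ins.a, ins.b).bit_length() > 16


def costly_dta(ins: Instr) -> bool:
    """Stand-in for the expensive dynamic-timing analysis (assumption).

    Toy rule: an add whose result is wider than 24 bits is treated as a timing error.
    """
    return ins.opcode == "add" and (ins.a + ins.b).bit_length() > 24


def select_and_analyze(
    trace: Sequence[Instr],
    predictor: Callable[[Instr], bool],
    dta: Callable[[Instr], bool],
) -> Tuple[List[int], int]:
    """Run the costly analysis only on instructions the predictor flags.

    Returns the trace indices judged faulty and the number of costly calls made.
    """
    faulty: List[int] = []
    analyzed = 0
    for idx, ins in enumerate(trace):
        if not predictor(ins):
            continue  # skipped: predicted not error-prone, no costly analysis
        analyzed += 1
        if dta(ins):
            faulty.append(idx)
    return faulty, analyzed


trace = [
    Instr("add", 1, 2),
    Instr("add", 1 << 20, 1 << 20),
    Instr("mul", 1 << 23, 3),
    Instr("add", 1 << 24, 1),
    Instr("xor", 1 << 30, 1),
]
faulty, analyzed = select_and_analyze(trace, cheap_predictor, costly_dta)
```

In this toy trace the costly analysis runs on three of the five instructions. The trade-off mirrors the one reported in the abstract: a predictor that skips an instruction that is in fact error-prone saves time but loses accuracy.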