14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications
Title | 14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Whatmough, P. N., Lee, S. K., Lee, H., Rama, S., Brooks, D., Wei, G. Y. |
Conference Name | 2017 IEEE International Solid-State Circuits Conference (ISSCC) |
Keywords | 1.2GHz 568nJ/prediction sparse deep-neural-network engine, 28nm SoC, aggregate timing violation rates, algorithmic error tolerance, algorithmic resilience, circuit resilience, circuit-level timing violation tolerance, data sparsity, datapath logic, Energy efficiency, Engines, Error analysis, FCLK scaling, frequency 1 GHz, frequency 1.2 GHz, frequency 667 MHz, Internet of Things, IoT applications, neural nets, Neural Network Resilience, Program processors, programmable FC-DNN accelerator design, pubcrawl, Razor timing violation detection, resilience, Resiliency, sign-magnitude number format, system-on-chip, Throughput, Timing, timing error rate tolerance, VDD scaling |
Abstract | This paper presents a 28nm SoC with a programmable FC-DNN accelerator design that demonstrates: (1) HW support to exploit data sparsity by eliding unnecessary computations (4x energy reduction); (2) improved algorithmic error tolerance using sign-magnitude number format for weights and datapath computation; (3) improved circuit-level timing violation tolerance in datapath logic via time-borrowing; (4) combined circuit and algorithmic resilience with Razor timing violation detection to reduce energy via VDD scaling or increase throughput via FCLK scaling; and (5) high classification accuracy (98.36% for MNIST test set) while tolerating aggregate timing violation rates >10^-1. The accelerator achieves a minimum energy of 0.36µJ/pred at 667MHz, maximum throughput at 1.2GHz and 0.57µJ/pred, or a 10%-margined operating point at 1GHz and 0.58µJ/pred. |
URL | https://ieeexplore.ieee.org/document/7870351/ |
DOI | 10.1109/ISSCC.2017.7870351 |
Citation Key | whatmough_14.3_2017 |
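As an illustration of point (1) in the abstract, the idea of eliding computations on zero-valued data can be sketched in software. This is only a minimal sketch under stated assumptions, not the accelerator's actual hardware datapath: the function name `fc_layer_skip_zeros`, the `threshold` parameter, and the use of ReLU are hypothetical choices made for this example.

```python
def fc_layer_skip_zeros(activations, weights, threshold=0.0):
    """Evaluate one fully connected layer, skipping near-zero inputs.

    Hypothetical illustration: weights[j][i] is the weight from input i
    to output j. Returns (outputs, macs), where macs counts the
    multiply-accumulate operations actually performed.
    """
    # Collect only the inputs whose magnitude exceeds the threshold;
    # every other input contributes nothing and is never multiplied.
    active = [(i, a) for i, a in enumerate(activations) if abs(a) > threshold]

    outputs = []
    macs = 0
    for row in weights:
        acc = 0.0
        for i, a in active:
            acc += row[i] * a
            macs += 1
        # ReLU (assumed here) clamps negatives to zero, which is one
        # source of zeros in the input to the next layer.
        outputs.append(max(acc, 0.0))
    return outputs, macs


if __name__ == "__main__":
    acts = [0.0, 2.0, 0.0, 1.0]
    w = [[1.0, 1.0, 1.0, 1.0],
         [0.5, -1.0, 2.0, -3.0]]
    out, macs = fc_layer_skip_zeros(acts, w)
    dense_macs = len(acts) * len(w)
    print(out, macs, dense_macs)
```

In this toy input, half of the activations are zero, so the sketch performs 4 multiply-accumulates instead of the 8 a dense evaluation would need. The 4x energy reduction quoted in the abstract is a measured hardware result and is not derived from this example.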