14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications

Title: 14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Whatmough, P. N., Lee, S. K., Lee, H., Rama, S., Brooks, D., Wei, G. Y.
Conference Name: 2017 IEEE International Solid-State Circuits Conference (ISSCC)
Keywords: 1.2GHz 568nJ/prediction sparse deep-neural-network engine, 28nm SoC, aggregate timing violation rates, algorithmic error tolerance, algorithmic resilience, circuit resilience, circuit-level timing violation tolerance, data sparsity, datapath logic, Energy efficiency, Engines, Error analysis, FCLK scaling, frequency 1 GHz, frequency 1.2 GHz, frequency 667 MHz, Internet of Things, IoT applications, neural nets, Neural Network Resilience, Program processors, programmable FC-DNN accelerator design, pubcrawl, Razor timing violation detection, resilience, Resiliency, sign-magnitude number format, system-on-chip, Throughput, Timing, timing error rate tolerance, VDD scaling
Abstract

This paper presents a 28nm SoC with a programmable FC-DNN accelerator design that demonstrates: (1) HW support to exploit data sparsity by eliding unnecessary computations (4× energy reduction); (2) improved algorithmic error tolerance using sign-magnitude number format for weights and datapath computation; (3) improved circuit-level timing violation tolerance in datapath logic via time-borrowing; (4) combined circuit and algorithmic resilience with Razor timing violation detection to reduce energy via VDD scaling or increase throughput via FCLK scaling; and (5) high classification accuracy (98.36% for MNIST test set) while tolerating aggregate timing violation rates >10^-1. The accelerator achieves a minimum energy of 0.36μJ/pred at 667MHz, maximum throughput at 1.2GHz and 0.57μJ/pred, or a 10%-margined operating point at 1GHz and 0.58μJ/pred.
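
As a rough software illustration of items (1) and (2) in the abstract, the sketch below skips multiply-accumulates for zero-valued activations in a fully connected layer and contrasts sign-magnitude with two's-complement encodings of a small negative value. It is not the paper's hardware design: the function names, the 8-bit width, and the list-based data layout are all assumptions chosen for illustration.

```python
def fc_layer_skip_zeros(activations, weights):
    """Fully connected layer that elides work for zero activations.

    weights[i][j] is the (hypothetical) weight from input i to output j.
    Returns (outputs, macs_performed).
    """
    n_out = len(weights[0])
    outputs = [0.0] * n_out
    macs = 0
    for i, a in enumerate(activations):
        if a == 0:
            continue  # skip the whole row of multiply-accumulates
        for j in range(n_out):
            outputs[j] += a * weights[i][j]
            macs += 1
    return outputs, macs


def to_sign_magnitude(value, bits=8):
    """Encode an integer as one sign bit followed by its magnitude."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("value does not fit in the given width")
    sign = 1 if value < 0 else 0
    return (sign << (bits - 1)) | magnitude


def to_twos_complement(value, bits=8):
    """Encode an integer in two's complement at the given width."""
    return value & ((1 << bits) - 1)


if __name__ == "__main__":
    acts = [0, 2, 0, 1]  # half of the inputs are zero
    w = [[5, 5], [1, 2], [7, 7], [3, 4]]
    out, macs = fc_layer_skip_zeros(acts, w)
    print(out, macs, "of", len(acts) * len(w[0]), "dense MACs")
    # A small negative value keeps its upper magnitude bits at zero in
    # sign-magnitude, while two's complement sets every upper bit.
    print(format(to_sign_magnitude(-1), "08b"))
    print(format(to_twos_complement(-1), "08b"))
```

In this toy case the sparse layer performs 4 of the 8 dense multiply-accumulates. The energy saving reported in the abstract depends on the actual hardware and workload and is not modeled here.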

URL: https://ieeexplore.ieee.org/document/7870351/
DOI: 10.1109/ISSCC.2017.7870351
Citation Key: whatmough_14.3_2017