Visible to the public Biblio

Filters: Keyword is Hardware accelerator  [Clear All Filters]
2022-03-01
Huang, Shanshi, Peng, Xiaochen, Jiang, Hongwu, Luo, Yandong, Yu, Shimeng.  2021.  Exploiting Process Variations to Protect Machine Learning Inference Engine from Chip Cloning. 2021 IEEE International Symposium on Circuits and Systems (ISCAS). :1–5.
Machine learning inference engine is of great interest to smart edge computing. Compute-in-memory (CIM) architecture has shown significant improvements in throughput and energy efficiency for hardware acceleration. Emerging nonvolatile memory (eNVM) technologies offer great potentials for instant on and off by dynamic power gating. Inference engine is typically pre-trained by the cloud and then being deployed to the field. There is a new security concern on cloning of the weights stored on eNVM-based CIM chip. In this paper, we propose a countermeasure to the weight cloning attack by exploiting the process variations of the periphery circuitry. In particular, we use weight fine-tuning to compensate the analog-to-digital converter (ADC) offset for a specific chip instance while inducing significant accuracy drop for cloned chip instances. We evaluate our proposed scheme on a CIFAR-10 classification task using a VGG- 8 network. Our results show that with precisely chosen transistor size on the employed SAR-ADC, we could maintain 88% 90% accuracy for the fine-tuned chip while the same set of weights cloned on other chips will only have 20 40% accuracy on average. The weight fine-tune could be completed within one epoch of 250 iterations. On average only 0.02%, 0.025%, 0.142% of cells are updated for 2-bit, 4-bit, 8-bit weight precisions in each iteration.
2022-02-07
Yang, Chen, Yang, Zepeng, Hou, Jia, Su, Yang.  2021.  A Lightweight Full Homomorphic Encryption Scheme on Fully-connected Layer for CNN Hardware Accelerator achieving Security Inference. 2021 28th IEEE International Conference on Electronics, Circuits, and Systems (ICECS). :1–4.
The inference results of neural network accelerators often involve personal privacy or business secrets in intelligent systems. It is important for the safety of convolutional neural network (CNN) accelerator to prevent the key data and inference result from being leaked. The latest CNN models have started to combine with fully homomorphic encryption (FHE), ensuring the data security. However, the computational complexity, data storage overhead, inference time are significantly increased compared with the traditional neural network models. This paper proposed a lightweight FHE scheme on fully-connected layer for CNN hardware accelerator to achieve security inference, which not only protects the privacy of inference results, but also avoids excessive hardware overhead and great performance degradation. Compared with state-of-the-art works, this work reduces computational complexity by approximately 90% and decreases ciphertext size by 87%∼95%.
2020-10-06
Yousefzadeh, Saba, Basharkhah, Katayoon, Nosrati, Nooshin, Sadeghi, Rezgar, Raik, Jaan, Jenihhin, Maksim, Navabi, Zainalabedin.  2019.  An Accelerator-based Architecture Utilizing an Efficient Memory Link for Modern Computational Requirements. 2019 IEEE East-West Design Test Symposium (EWDTS). :1—6.

Hardware implementation of many of today's applications such as those in automotive, telecommunication, bio, and security, require heavy repeated computations, and concurrency in the execution of these computations. These requirements are not easily satisfied by existing embedded systems. This paper proposes an embedded system architecture that is enhanced by an array of accelerators, and a bussing system that enables concurrency in operation of accelerators. This architecture is statically configurable to configure it for performing a specific application. The embedded system architecture and architecture of the configurable accelerators are discussed in this paper. A case study examines an automotive application running on our proposed system.

2018-09-28
Kung, Jaeha, Long, Yun, Kim, Duckhwan, Mukhopadhyay, Saibal.  2017.  A Programmable Hardware Accelerator for Simulating Dynamical Systems. Proceedings of the 44th Annual International Symposium on Computer Architecture. :403–415.
The fast and energy-efficient simulation of dynamical systems defined by coupled ordinary/partial differential equations has emerged as an important problem. The accelerated simulation of coupled ODE/PDE is critical for analysis of physical systems as well as computing with dynamical systems. This paper presents a fast and programmable accelerator for simulating dynamical systems. The computing model of the proposed platform is based on multilayer cellular nonlinear network (CeNN) augmented with nonlinear function evaluation engines. The platform can be programmed to accelerate wide classes of ODEs/PDEs by modulating the connectivity within the multilayer CeNN engine. An innovative hardware architecture including data reuse, memory hierarchy, and near-memory processing is designed to accelerate the augmented multilayer CeNN. A dataflow model is presented which is supported by optimized memory hierarchy for efficient function evaluation. The proposed solver is designed and synthesized in 15nm technology for the hardware analysis. The performance is evaluated and compared to GPU nodes when solving wide classes of differential equations and the power consumption is analyzed to show orders of magnitude improvement in energy efficiency.