Low-Power Manycore Accelerator for Personalized Biomedical Applications
Title | Low-Power Manycore Accelerator for Personalized Biomedical Applications |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Page, Adam; Attaran, Nasrin; Shea, Colin; Homayoun, Houman; Mohsenin, Tinoosh |
Conference Name | Proceedings of the 26th Edition on Great Lakes Symposium on VLSI |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4274-2 |
Keywords | Accelerator, biomedical, composability, Digital signal processing, embedded processors, FPGA, low power, machine learning, manycore, Metrics, privacy, pubcrawl, Resiliency, signal processing security |
Abstract | Wearable personal health monitoring systems can offer a cost-effective solution for human healthcare. These systems must provide highly accurate, secure, and fast processing and delivery of vast amounts of data. In addition, wearable biomedical devices are used in inpatient, outpatient, and at-home e-Patient care, where they must monitor the patient's biomedical and physiological signals 24/7. These biomedical applications require sampling and processing multiple streams of physiological signals under strict power and area constraints. The processing typically consists of feature extraction, data fusion, and classification stages that require a large number of digital signal processing and machine learning kernels. In response to these requirements, this paper proposes a low-power, domain-specific manycore accelerator named Power Efficient Nano Clusters (PENC) to map and execute the kernels of these applications. Experimental results show that the manycore is able to reduce energy consumption by up to 80% and 14% for DSP and machine learning kernels, respectively, when optimally parallelized. The performance of the proposed PENC manycore, when acting as a coprocessor to an Intel Atom processor, is compared with existing commercial off-the-shelf embedded processing platforms including Intel Atom, Xilinx Artix-7 FPGA, and NVIDIA TK1 ARM-A15 with GPU SoC. The results show that the PENC manycore architecture reduces energy by as much as 10X while outperforming all off-the-shelf embedded processing platforms across all studied machine learning classifiers. |
URL | http://doi.acm.org/10.1145/2902961.2902986 |
DOI | 10.1145/2902961.2902986 |
Citation Key | page_low-power_2016 |