Programmable Manycore Accelerator for Markov Chain Monte Carlo

Date

2018-01-01

Department

Computer Science and Electrical Engineering

Program

Engineering, Computer

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Markov Chain Monte Carlo (MCMC) methods are a class of algorithms used to sample from a probability distribution function (PDF). MCMC samplers are used in machine learning, image processing, and signal processing applications that are computationally intensive. In such scenarios, high-throughput samplers are of paramount importance. In this thesis, we propose a domain-specific programmable manycore accelerator for MCMC algorithms called MC3A (Markov Chain Monte Carlo ManyCore Accelerator), which effectively generates samples from a provided target distribution. MC3A is built upon an existing manycore named PENC, developed in the EEHPC Lab, by adding dedicated hardware instructions that accelerate MCMC. These instructions compute exponential functions (EXP), uniform random numbers (RAN), and Gaussian random numbers (GNG), reducing the number of clock cycles required to implement the corresponding functions by one to two orders of magnitude. The Metropolis-Hastings algorithm, a class of MCMC methods, is used in the experiments to show how the proposed hardware can significantly increase throughput. In addition to MC3A, we propose a new cognitive computing approach to real-time, online seizure detection with minimal power consumption and latency, suitable for wearable devices. We use a Metropolis-Hastings sampler to sample from a PDF that adapts its parameters to a patient's real-time signals in order to detect the occurrence of seizures in the brain. We use a Gaussian Mixture Model (GMM) to model the likelihood PDF, whose parameters are tuned to the patient's real-time signals and over which the Metropolis-Hastings sampler generates samples with respect to time. The generated samples are compared with the actual online signals to detect the occurrence of seizures in the brain.
Using this approach, we achieved an average seizure detection accuracy of 81.47%, an average sensitivity of 90%, and an onset sensitivity of 100%, which outperform those of traditional machine learning algorithms such as the Support Vector Machine (SVM). The hardware of the complete system has been implemented on an Artix-7 FPGA at 200 MHz, minimizing energy and power requirements, with a logic utilization of 1198 slices and a dynamic power of 3.37 uW, which outperform those of the SVM implementation on a similar platform by 3 times and 39 times, respectively. A 64-cluster architecture of MC3A is fully placed and routed in 65 nm TSMC CMOS technology, where the VLSI layout of each cluster occupies an area of 0.528 mm^2 and consumes 319 mW of power at a 1 GHz clock frequency. The proposed MC3A achieves 6x higher throughput than its equivalent predecessor (PENC) and consumes 4x lower energy per sample. When the application is scaled to 10,000 samples for 100 channels, MC3A consumes 5.2x lower energy than PENC while the throughput increases by 8x. Compared to off-the-shelf platforms such as the Jetson TX1 and TX2 SoCs, MC3A achieves 195x and 191x higher throughput and consumes 3379x and 3037x lower energy per generated sample, respectively.
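
The Metropolis-Hastings sampling loop referred to in the abstract can be sketched in software as follows. This is a minimal illustration in Python, not the thesis implementation: the function names (gmm_pdf, metropolis_hastings), the one-dimensional two-component mixture, the random-walk Gaussian proposal, and all parameter values are assumptions made for the example. The comments mark where an exponential, a uniform random number, and a Gaussian random number are needed in each iteration, which are the operations the abstract associates with the EXP, RAN, and GNG instructions.

```python
import math
import random


def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture at x (illustrative target PDF)."""
    total = 0.0
    for w, mu, sd in zip(weights, means, stds):
        z = (x - mu) / sd
        # Exponential evaluation: the kind of operation EXP would accelerate.
        total += w * math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    return total


def metropolis_hastings(pdf, n_samples, x0=0.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    Returns the list of samples and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x = x0
    px = pdf(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Gaussian random number: the kind of operation GNG would accelerate.
        candidate = x + rng.gauss(0.0, step)
        pc = pdf(candidate)
        # Uniform random number: the kind of operation RAN would accelerate.
        u = rng.random()
        # Accept with probability min(1, pc / px); written without division.
        if u * px < pc:
            x, px = candidate, pc
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


# Hypothetical target: equal-weight mixture with modes at -2 and +2.
def target(x):
    return gmm_pdf(x, (0.5, 0.5), (-2.0, 2.0), (1.0, 1.0))


samples, rate = metropolis_hastings(target, n_samples=20000, step=2.0, seed=1)
print(f"acceptance rate: {rate:.2f}")
print(f"sample mean: {sum(samples) / len(samples):.3f}")
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of target densities, so each iteration costs one Gaussian draw, one uniform draw, and one exponential per mixture component. In a seizure-detection setting such as the one described above, the mixture parameters would be re-estimated from the incoming signal rather than fixed as they are here.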