Design Space Exploration of Data-centric Architectures
dc.contributor.advisor | Halem, Milton | |
dc.contributor.author | Prathapan, Smriti | |
dc.contributor.department | Computer Science and Electrical Engineering | |
dc.contributor.program | Computer Science | |
dc.date.accessioned | 2021-09-01T13:55:21Z | |
dc.date.available | 2021-09-01T13:55:21Z | |
dc.date.issued | 2020-01-20 | |
dc.description.abstract | The era of "big data" is leading to changes in the compute paradigm, in particular to the notion of moving computation to data, known as Near Data Processing (NDP). Technological advancements have enabled the application of NDP at many levels of the memory hierarchy, from cache to DRAM and from non-volatile storage-class memory to processors embedded in storage devices. This dissertation explores the effectiveness of data-centric compute architectures using Active Storage, Processing-in-Memory and Coherent Accelerator Processor Interface (CAPI) accelerated Flash storage. We developed a compute framework, Active In-Storage (AiSTOR), that enables scalable distributed Big Data processing by performing the computations directly on active storage devices. AiSTOR has three major advantages: (i) active storage utilizes the processor capabilities on the storage devices, thereby significantly reducing the bandwidth requirement of the network; (ii) the computations can take advantage of the inherent map/reduce parallelism by using the array of distributed storage processors available on the active data storage devices, thereby aggregating the processing power of a cluster of active devices; and (iii) it can perform coherent processing of streaming data as it arrives on the storage devices. We define a generic NDP architecture that is well suited to memory-bound computations and implement the software kernels for NDP-based algorithmic mapping. We show that, for a modest-sized NDP system, the AiSTOR architecture framework employing distributed processing algorithms can yield efficient and accurate computational processing performance. In comparison with Hadoop-based MapReduce, compute times on AiSTOR show performance benefits of up to 18%, while providing very competitive results compared to Spark-based in-memory processing.
The effectiveness of the NDP architecture is demonstrated by evaluating the row-buffer management policies (open-page and closed-page) with the controller modifications and with a generic unmodified architecture. The PIM open-page policy has 52% higher operation throughput than the host, and 3.7% higher throughput and 50% fewer DRAM activations than the PIM closed-page policy. Further, this dissertation explores the potential impact of hidden CPU usage in handling IO requests in heterogeneous storage systems when using the CAPIFlash library, and identifies the optimal IO/s and OP/s for heterogeneous storage memory devices such as NVM, SSD and RAM. FS900 with CAPI, when using the RAM metadata cache, performed 2x as many read OP/s in synchronous mode and 3x in asynchronous mode. SSD and NVM had 66% higher IO/s than RAM in asynchronous mode. | |
dc.format | application:pdf | |
dc.genre | dissertations | |
dc.identifier | doi:10.13016/m2ctv2-wote | |
dc.identifier.other | 12276 | |
dc.identifier.uri | http://hdl.handle.net/11603/22824 | |
dc.language | en | |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Theses and Dissertations Collection | |
dc.relation.ispartof | UMBC Graduate School Collection | |
dc.relation.ispartof | UMBC Student Collection | |
dc.source | Original File Name: Prathapan_umbc_0434D_12276.pdf | |
dc.subject | Active Storage | |
dc.subject | Coherent Accelerator Processor Interface | |
dc.subject | Data-centric computing | |
dc.subject | Near Data Computing | |
dc.subject | Processing in Memory | |
dc.title | Design Space Exploration of Data-centric Architectures | |
dc.type | Text | |
dcterms.accessRights | Distribution Rights granted to UMBC by the author. | |
dcterms.accessRights | This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu |