Parallel Sorting Of Biological Sequences Using The Intel® Concurrent Collections

Nembhard, Fitzroy

dc.contributor.advisor	Lupton, William
dc.contributor.advisor	Stojkovic, Vojislav
dc.contributor.author	Nembhard, Fitzroy
dc.contributor.department	Computer Science and Bioinformatics Program	en_US
dc.contributor.program	Master of Science	en_US
dc.date.accessioned	2018-04-27T15:38:53Z
dc.date.available	2018-04-27T15:38:53Z
dc.date.issued	2012
dc.description.abstract	Analyzing and computing with biological sequence data, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), requires a great deal of processing time and memory when sequential algorithms are used. Programmers and scientists have developed and tested a few models for parallelizing and optimizing bioinformatics algorithms. However, some of these approaches do not make efficient use of multi-core systems or computers with many processors. The Intel® Concurrent Collections is a software tool and library for transforming serial programs into semantically equivalent parallel programs. The Intel® Concurrent Collections approach is a new technique for designing parallel programs. It overcomes the over-constrained nature of serial languages by providing a conclusive programming concept and allows programs to run efficiently on multi-core systems and computers with many processors. The main goals of this research are: to design a serial C/C++ program for sorting biological sequences based on the Divide and Conquer methodology; to transform the serial C/C++ program into a semantically equivalent parallel C/C++ program using the Intel® Concurrent Collections; to compare and analyze the execution times of the serial and parallel programs and draw appropriate conclusions on the suitability of the Divide and Conquer methodology for parallelization; to provide suggestions on the suitability of the Intel® Concurrent Collections technology for parallelizing serial algorithms; and to show the importance of parallelizing bioinformatics algorithms.
The main results of this thesis research are: successful parallel sorting of biological sequences using the merge sort Divide and Conquer algorithm; experiments successfully conducted on the Intel® Many-core Testing Lab, which runs the Red Hat Enterprise Linux operating system and comprises 32 processors and 265GB RAM; proof that the Intel® Concurrent Collections programming model can, through parallelization, improve the efficiency and speed of algorithms used in bioinformatics and computational biology; and the conclusion that the prerelease version of the platform has some limitations.
dc.genre	theses
dc.identifier	doi:10.13016/M27W67830
dc.identifier.uri	http://hdl.handle.net/11603/10408
dc.language.iso	en
dc.relation.isAvailableAt	Morgan State University
dc.rights	This item is made available by Morgan State University for personal, educational, and research purposes in accordance with Title 17 of the U.S. Copyright Law. Other uses may require permission from the copyright owner.
dc.subject	Parallel algorithms	en_US
dc.subject	High performance computing	en_US
dc.subject	Bioinformatics	en_US
dc.subject	Biology	en_US
dc.subject	Computer science	en_US
dc.subject	Parallel processing (Electronic computers)	en_US
dc.title	Parallel Sorting Of Biological Sequences Using The Intel® Concurrent Collections
dc.type	Text

Collections

MSU Student Research Collection
