Parallel Sorting Of Biological Sequences Using The Intel® Concurrent Collections

dc.contributor.advisor: Lupton, William
dc.contributor.advisor: Stojkovic, Vojislav
dc.contributor.author: Nembhard, Fitzroy
dc.contributor.department: Computer Science and Bioinformatics Program (en_US)
dc.contributor.program: Master of Science (en_US)
dc.date.accessioned: 2018-04-27T15:38:53Z
dc.date.available: 2018-04-27T15:38:53Z
dc.date.issued: 2012
dc.description.abstract: Analyzing and computing with biological sequence data, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), requires a great deal of processing time and memory when sequential algorithms are used. Programmers and scientists have developed and tested a few models for parallelizing and optimizing algorithms to improve results in bioinformatics. However, some of these approaches have not made efficient use of multi-core systems or computers with many processors. The Intel® Concurrent Collections is a software tool and library for transforming serial programs into semantically equivalent parallel programs. The Intel® Concurrent Collections approach is a new and unique technique for designing parallel programs. It overcomes the over-constrained nature of serial languages by providing a conclusive programming concept, and it allows programs to run efficiently on multi-core systems and computers with many processors. The main goals of this research are: to design a serial C/C++ program for sorting biological sequences based on the Divide and Conquer methodology; to transform the serial C/C++ program into a semantically equivalent parallel C/C++ program using the Intel® Concurrent Collections; to compare and analyze the execution times of the serial and parallel programs and to draw appropriate conclusions on the suitability of the Divide and Conquer methodology for parallelization; to provide suggestions on the suitability of the Intel® Concurrent Collections technology for parallelization of serial algorithms; and to show the importance of parallelization of bioinformatics algorithms.
The main results and achievements of this thesis research are: successful parallel sorting of biological sequences using the merge sort Divide and Conquer algorithm; experiments successfully conducted on the Intel® Many-core Testing Lab, which runs the Red Hat Enterprise Linux operating system and comprises 32 processors and 265 GB of RAM; proof that the Intel® Concurrent Collections programming model can, through parallelization, improve the efficiency and speed of algorithms used in bioinformatics and computational biology; and a conclusion that the prerelease version of the platform has some limitations.
dc.genre: theses
dc.identifier: doi:10.13016/M27W67830
dc.identifier.uri: http://hdl.handle.net/11603/10408
dc.language.iso: en
dc.relation.isAvailableAt: Morgan State University
dc.rights: This item is made available by Morgan State University for personal, educational, and research purposes in accordance with Title 17 of the U.S. Copyright Law. Other uses may require permission from the copyright owner.
dc.subject: Parallel algorithms (en_US)
dc.subject: High performance computing (en_US)
dc.subject: Bioinformatics (en_US)
dc.subject: Biology (en_US)
dc.subject: Computer science (en_US)
dc.subject: Parallel processing (Electronic computers) (en_US)
dc.title: Parallel Sorting Of Biological Sequences Using The Intel® Concurrent Collections
dc.type: Text
