Distributed Search Of Biological Databases Using Hadoop/Mapreduce

Fashola, Babatunde Olaide

dc.contributor.advisor	Stojkovic, Vojislav
dc.contributor.author	Fashola, Babatunde Olaide
dc.contributor.department	Computer Science and Bioinformatics Program	en
dc.contributor.program	Master of Science	en
dc.date.accessioned	2018-04-27T15:07:01Z
dc.date.available	2018-04-27T15:07:01Z
dc.date.issued	2015
dc.description.abstract	The main goals of this thesis research were to: (1) build a computational platform/environment for the research; (2) develop a MapReduce search algorithm that uses the scalability of a Hadoop cluster and MapReduce functionality to speed up searches of a biological database; (3) implement the MapReduce search algorithm in the Java programming language and run the resulting Java application on a multi-node Hadoop cluster in the cloud; and (4) compare the execution time of the MapReduce search program with those of two serial search programs, one based on the Boyer-Moore algorithm and one on the Knuth-Morris-Pratt algorithm. 13 GB of downloadable GenBank data was processed with the Hadoop framework installed on a 12-node cluster of Amazon EC2 t2.micro instances. The distributed search program ran 46% faster than the serial programs. Hence, the search algorithms currently used to access biological databases can incorporate the MapReduce programming model to improve their performance.
dc.genre	theses
dc.identifier	doi:10.13016/M23N20H2K
dc.identifier.uri	http://hdl.handle.net/11603/9937
dc.language.iso	en
dc.relation.isAvailableAt	Morgan State University
dc.rights	This item is made available by Morgan State University for personal, educational, and research purposes in accordance with Title 17 of the U.S. Copyright Law. Other uses may require permission from the copyright owner.
dc.subject	Bioinformatics	en
dc.subject	Amazon Web Services (Firm)	en
dc.subject	Computer science	en
dc.subject	Cloud computing	en
dc.title	Distributed Search Of Biological Databases Using Hadoop/Mapreduce
dc.type	Text
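
For readers unfamiliar with the serial baselines named in the abstract, the following is a minimal sketch of Knuth-Morris-Pratt substring search in plain Java. It is not the thesis's code: the class name KmpSearch, the method names, and the sample sequence are hypothetical and chosen only for illustration, and no Hadoop API is used.

```java
import java.util.ArrayList;
import java.util.List;

public class KmpSearch {

    // Build the failure table: fail[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of it.
    static int[] buildFailureTable(String pattern) {
        int[] fail = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Return the start index of every (possibly overlapping) match.
    static List<Integer> findAll(String text, String pattern) {
        List<Integer> hits = new ArrayList<>();
        if (pattern.isEmpty()) {
            return hits; // treat an empty pattern as "no matches"
        }
        int[] fail = buildFailureTable(pattern);
        int k = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                hits.add(i - k + 1);
                k = fail[k - 1]; // keep going to find overlapping matches
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical nucleotide string, not GenBank data.
        System.out.println(findAll("GATTACAGATTACA", "ATTA")); // prints [1, 8]
    }
}
```

One plausible way to distribute such a search, offered here only as an assumption, is for each map task to run a routine like findAll over the records in its input split and emit the matches for a reducer to collect. The abstract does not say how the thesis partitions the work.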

Collections

MSU Dissertations and Theses Collection
