Browsing by Subject "Parallel Computing"
Now showing 1 - 9 of 9
Item Assessment of Simple and Alternative Bayesian Ranking Methods Utilizing Parallel Computing (2011)
Raim, Andrew M.; Liu, Minglei; Neerchal, Nagaraj K.; Morel, Jorge G.; Allen, Samantha; Kirlew, Dorothy; Obetz, Neil T.; Wade, Derek; Albertine, April C.; Klein, Martin
The U.S. Census Bureau (USCB) assists the federal government in distributing approximately $400 billion of aid by providing a complete ranking of the states according to certain criteria, such as average poverty level. It is imperative that this ranking be as accurate as possible to ensure that funds are allocated fairly. Currently, the USCB ranks states based on point estimates of their true poverty levels. Dr. Klein and Dr. Wright of the USCB have compared the performance of this method against more sophisticated procedures in simulation trials, but found that the alternatives do not consistently outperform the existing method. We investigate this phenomenon by revisiting some of these procedures, and we expand on this work to produce new ranking algorithms. We utilize parallel programming to expedite Dr. Klein's procedures. In addition, we specify two new prior distributions on the population means, using previous years' census data as well as regression. We discuss the results of our methods in conjunction with Klein and Wright's corresponding simulation results. In our final report, we compare the performance of our techniques to that of the USCB's current method and show the resulting state ranks for each procedure.

Item Dimensionality Reduction Using Sliced Inverse Regression in Modeling Large Climate Data (2016)
Allison, Ross Flieger; Miller, Lois; Sykes, Danielle; Valle, Pablo; Popuri, Sai K.; Wijekoon, Nadeesri; Neerchal, Nagaraj K.; Mehta, Amita
Prediction of precipitation using simulations of various climate variables provided by Global Climate Models (GCMs) as covariates is often required for regional hydrological assessment studies.
We use a sufficient dimension reduction method to analyze monthly precipitation data over the Missouri River Basin (MRB). At each location, effective reduced sets of monthly historical simulated data from a neighborhood provided by MIROC5, a Global Climate Model, are first obtained via a semi-continuous adaptation of Sliced Inverse Regression, a sufficient dimension reduction approach. These reduced sets are subsequently used in a modified Nadaraya-Watson method for prediction. We implement the method on a computing cluster and demonstrate that it is scalable. We observe a significant speedup in the runtime when implemented in parallel.

Item Efficient Scientific Big Data Aggregation through Parallelization and Subsampling (2019-01-01)
Kay, Savio Sebastian; Wang, Jianwu; Information Systems
Master of Science thesis, Information Systems, 2019, directed by Jianwu Wang, Assistant Professor, Department of Information Systems, University of Maryland, Baltimore County.
In scientific research, experiments involving atmospheric physics and satellite data administration, processing, and manipulation can take considerable time and resources, depending on the size of the project. Because of the tremendous amount of data involved even in a basic use case, computation is slow; this is due to the many variables contained in the large satellite-specific datasets. One approach scientific researchers and developers take is to devote more resources to data ingestion and manipulation, together with process parallelization such as file-level or day-level parallelization, which drastically reduces the time taken to process data.
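The file-level parallelization and subsampling strategies described in the aggregation thesis above can be illustrated with a small sketch. This is a simplified stand-in, not the thesis code: NumPy arrays play the role of HDF granules, a thread pool plays the role of the multi-node Dask cluster, and the aggregation (a plain mean) and the stride value are illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def aggregate_one(granule):
    """Aggregate a single 'granule' (stand-in for one HDF file):
    here simply the mean over all pixels."""
    return granule.mean()

def aggregate_serial(granules):
    """Baseline: process granules one after another."""
    return [aggregate_one(g) for g in granules]

def aggregate_parallel(granules, workers=4):
    """File-level parallelization: each granule is independent of the
    others, so they can be aggregated concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(aggregate_one, granules))

def aggregate_subsampled(granules, stride=3):
    """Subsampling: read only every `stride`-th pixel along each axis
    before aggregating, trading a little accuracy for less I/O."""
    return [g[::stride, ::stride].mean() for g in granules]
```

On a real cluster the thread pool would be replaced by Dask distributed workers operating on Xarray datasets, as in the thesis; the point here is only that per-file independence is what makes the problem embarrassingly parallel.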
However, subsampling is known to shorten the processing period even further, which suits many scientific studies and experiments. In this thesis, subsampling is tested and proposed as an approach to radically decrease processing time. Experimental results show that the Xarray Python package, a modern Python framework, provides enough support to process large volumes of data in a short period, which is suitable for scientific research. We process one month of satellite data, about 1.154 TB (terabytes) in total: 8928 HDF files of MYD03 (357.23 GB) and 8928 HDF files of MYD06_L2 (797.71 GB), both MODIS satellite datasets. We evaluate the cloud property variable by aggregating Level 2 data to Level 3 format, which we achieve via two primary approaches: subsampling and parallel processing. Our research and experiments show that, combined with parallel computing on multiple compute nodes through Xarray and Dask, the subsampling technique can reduce system execution time dramatically with little to no data loss in the final computed information. The code for the research and study can be found on the GitHub account 'saviokay' in the repository 'masters-theses': https://github.com/saviokay/masters-theses

Item An Implementation of Binomial Method of Option Pricing using Parallel Computing
Popuri, Sai K.; Raim, Andrew M.; Neerchal, Nagaraj K.; Gobbert, Matthias K.
The Binomial method of option pricing is based on iterating over discounted option payoffs in a recursive fashion to calculate the present value of an option. Implementing the Binomial method to exploit the resources of a parallel computing cluster is non-trivial, as the method is not easily parallelizable. We propose a procedure to transform the method into an "embarrassingly parallel" problem by mapping Binomial probabilities to Bernoulli paths.
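The parallelization idea in the option-pricing item above can be made concrete with a small sketch. For a European call under the standard Cox-Ross-Rubinstein (CRR) parameterization, the backward-induction recursion collapses to a discounted expectation over terminal nodes, and each term of that sum is independent, hence trivially parallelizable. This is an illustration of that structure, not the authors' Rmpi implementation; the parameter values in the comments are assumptions.

```python
import math

def crr_params(S, r, sigma, T, n):
    """Cox-Ross-Rubinstein tree parameters for n steps."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    return dt, u, d, p

def price_backward(S, K, r, sigma, T, n):
    """Standard serial backward induction over the binomial tree."""
    dt, u, d, p = crr_params(S, r, sigma, T, n)
    disc = math.exp(-r * dt)
    # Payoffs at maturity; node k = number of up-moves.
    v = [max(S * u**k * d**(n - k) - K, 0.0) for k in range(n + 1)]
    for step in range(n, 0, -1):
        v = [disc * (p * v[k + 1] + (1 - p) * v[k]) for k in range(step)]
    return v[0]

def price_direct(S, K, r, sigma, T, n):
    """European price as a discounted expectation over terminal nodes.
    Each term depends only on k, so the sum can be split across
    processes with no communication (embarrassingly parallel)."""
    dt, u, d, p = crr_params(S, r, sigma, T, n)
    total = 0.0
    for k in range(n + 1):
        prob = math.comb(n, k) * p**k * (1 - p)**(n - k)
        total += prob * max(S * u**k * d**(n - k) - K, 0.0)
    return math.exp(-r * T) * total
```

For European payoffs the two functions agree to floating-point accuracy; the Bernoulli-path construction in the paper generalizes this independence so that the workload can be distributed over a cluster.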
We have used the parallel computing capabilities of R with the Rmpi package to implement the methodology on the cluster tara in the UMBC High Performance Computing Facility, which has 82 compute nodes, each with two quad-core Intel Nehalem processors and 24 GB of memory, on a quad-data-rate InfiniBand interconnect. With high-performance clusters and multi-core desktops becoming increasingly accessible, we believe that our method will have practical appeal to financial trading firms.

Item Investigating the Use of pMatlab to Solve the Poisson Equation on the Cluster maya (2014)
Swatski, Sarah
Many physical phenomena can be described by partial differential equations, which can be discretized to form systems of linear equations. We apply the finite difference method to the Poisson equation with homogeneous Dirichlet boundary conditions, which yields a system of linear equations with a large sparse system matrix. We implement pMatlab code that utilizes the conjugate gradient method to solve this system. We do not recommend the use of pMatlab at this time, as we find it very limited, its implementation highly complex, and its results inconsistent.

Item Parallel Performance Studies for an Elliptic Test Problem on the Cluster maya 2013: Using 1-D and 2-D Domain Subdivisions (2014)
Kalayeh, Kourosh M.
One of the most important aspects of parallel computing is communication between processes, since it has a tremendous impact on overall performance. Consequently, it is important to implement parallel code so that communication between processes takes place as efficiently as possible. In this study we investigate the effect of domain subdivision, 1-D or 2-D, on the performance of parallel computing. To this end, the Poisson equation is solved as a test problem using the finite difference method with both 1-D and 2-D domain subdivisions. Both methods show good speedup.
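The pMatlab study above pairs a finite difference discretization of the Poisson problem with the conjugate gradient method. As a point of reference, here is a minimal matrix-free sketch in plain NumPy rather than pMatlab; the grid size, tolerance, and right-hand side used for verification are illustrative assumptions, not the report's setup.

```python
import numpy as np

def apply_A(u, h):
    """Matrix-free 5-point discretization of -Laplacian with homogeneous
    Dirichlet boundary conditions; u holds interior grid values only."""
    up = np.pad(u, 1)  # zero-pads, enforcing the boundary condition
    return (4 * up[1:-1, 1:-1] - up[:-2, 1:-1] - up[2:, 1:-1]
            - up[1:-1, :-2] - up[1:-1, 2:]) / h**2

def conjugate_gradient(f, h, tol=1e-8, maxit=2000):
    """Unpreconditioned CG for the SPD system A u = f."""
    u = np.zeros_like(f)
    r = f - apply_A(u, h)
    p = r.copy()
    rr = float(np.sum(r * r))
    for _ in range(maxit):
        Ap = apply_A(p, h)
        alpha = rr / float(np.sum(p * Ap))
        u += alpha * p
        r -= alpha * Ap
        rr_new = float(np.sum(r * r))
        if rr_new < tol**2:  # stop when the residual norm is below tol
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return u
```

Against the manufactured solution u(x, y) = sin(pi x) sin(pi y) with f = 2 pi^2 sin(pi x) sin(pi y), the computed solution matches to the expected O(h^2) discretization error. The operations inside the loop (stencil application and reductions) are exactly the ones a pMatlab or MPI version must distribute.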
Although in most cases the grid-structured communication shows slightly better performance, the overall performance of the 2-D domain subdivision does not indicate the superiority of this method.

Item Parameter Estimation for the Dirichlet-Multinomial Distribution (2011-05-20)
Peterson, Amanda
In the 1998 paper entitled "Large Cluster Results for Two Parametric Multinomial Extra Variation Models," Nagaraj K. Neerchal and Jorge G. Morel developed an approximation to the Fisher information matrix used in the Fisher Scoring algorithm for finding the maximum likelihood estimates of the parameters of the Dirichlet-multinomial distribution. They performed simulation studies comparing the results of the approximation to those of the usual Fisher Scoring algorithm for varying dimensions of the parameter vector. In this study, parallel computing in R is utilized to extend the previous simulation studies to larger dimensions. Additionally, the Fisher Scoring algorithm and direct numerical maximization of the likelihood are compared.

Item Spatio-temporal analysis of precipitation data via a sufficient dimension reduction in parallel (American Statistical Association, 2016)
Popuri, Sai K.; Allison, Ross Flieger; Miller, Lois; Sykes, Danielle; Valle, Pablo; Neerchal, Nagaraj K.; Adragni, Kofi P.; Mehta, Amita; Gobbert, Matthias K.
Prediction of precipitation using simulations of various climate variables provided by Global Climate Models (GCMs) as covariates is often required for regional hydrological assessment studies. In this paper, we use a sufficient dimension reduction method to analyze monthly precipitation data over the Missouri River Basin (MRB). At each location, effective reduced sets of monthly historical simulated data from a neighborhood provided by MIROC5, a Global Climate Model, are first obtained via a semi-continuous adaptation of Sliced Inverse Regression, a sufficient dimension reduction approach.
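The 1-D versus 2-D comparison in the domain-subdivision study above is usually motivated by a halo (ghost-cell) count: how many boundary points each process must exchange per iteration. A small sketch of that back-of-the-envelope model follows; the counting is a deliberate simplification that assumes an N x N grid, a 5-point stencil (no corner exchanges), interior processes, and a square process count for the 2-D case.

```python
import math

def halo_points_1d(N, p):
    """Interior process in a 1-D (strip) decomposition of an N x N grid:
    it exchanges one full row of N points with each of its 2 neighbors,
    independent of the process count p."""
    return 2 * N

def halo_points_2d(N, p):
    """Interior process in a 2-D (block) decomposition on a
    sqrt(p) x sqrt(p) process grid: it exchanges N/sqrt(p) points with
    each of 4 neighbors under a 5-point stencil."""
    q = int(math.isqrt(p))
    assert q * q == p, "this sketch assumes a square process count"
    return 4 * (N // q)
```

For N = 1024 and p = 64 this gives 2048 points per exchange in 1-D versus 512 in 2-D: the 2-D halo shrinks like 1/sqrt(p) while the 1-D halo is constant, which is consistent with the observation above that grid-structured communication often performs slightly better even when the overall advantage is modest.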
These reduced sets are subsequently used in a modified Nadaraya-Watson method for prediction. We implement the method on a computing cluster, and demonstrate that it is scalable. We observe a significant speedup in the runtime when implemented in parallel. This is an attractive alternative to the traditional spatio-temporal analysis of the entire region, given the large number of locations and temporal instances.

Item Use of Operator Upscaling for Seismic Inversion: Computationally Feasible Forward and Adjoint Calculations (2008-10-28)
Griffith, Sean M.L.; Minkoff, Susan E.; Mathematics and Statistics
To solve seismic inverse problems via the adjoint state method, we must be able to repeatedly solve both the wave equation and its adjoint efficiently. Operator upscaling applied to the wave equation imparts fine-scale information to the coarse scale without requiring that we solve the full fine-scale problem. We apply the algorithm to the stress-free form of the 3D elastic wave equation. This algorithm has two stages: first, we solve independent subgrid problems on the fine scale; second, we use these subgrid solutions to solve the coarse problem. Because the subgrid problems are independent, they can be solved via an embarrassingly parallel algorithm. Surprisingly, the most expensive part of the coarse-grid solve is not assembling the mass matrix (which is time independent) but calculating the load vector (which is time dependent). We therefore parallelize the load vector calculation for the coarse problem, as it dominates the time step. The most expensive parts of the algorithm (the subgrid solve and the coarse load vector calculation) exhibit near-linear speedup. In the second half of the thesis we discuss using the adjoint state method to solve the seismic inverse problem. As the acoustic wave operator is self-adjoint, we chose to differentiate and then discretize the problem.
The result is that the adjoint problem can be solved by the same upscaling method as the standard acoustic wave equation. The forward and adjoint upscaling algorithms differ only in the source terms and in the time-stepping order.
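Two of the precipitation items above use a modified Nadaraya-Watson method for prediction. For context, here is a minimal sketch of the classic, unmodified Nadaraya-Watson kernel regression estimator with a Gaussian kernel; the papers' modification and their choice of kernel and bandwidth are not reproduced here, and the function name and parameters are illustrative.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Classic Nadaraya-Watson estimator:
    m(x) = sum_i K((x - x_i)/h) * y_i / sum_i K((x - x_i)/h),
    here with a Gaussian kernel K."""
    x_query = np.atleast_1d(x_query)
    # Pairwise scaled differences: shape (n_query, n_train).
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * diffs**2)  # Gaussian kernel weights
    # Weighted average of the training responses at each query point.
    return (w @ y_train) / w.sum(axis=1)
```

Because each query location is an independent weighted average, predictions at the many MRB grid locations can be computed in parallel with no communication, which is what makes the cluster implementations in these items scale well.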