UMBC Mathematics and Statistics Department

Recent Submissions

Now showing 1 - 20 of 483
  • Item
    A New Class of Optimal Designs in the Presence of a Quantitative Covariate
    (Univ. of Rajshahi, Bangladesh, 2015-04-12) Sinha, Bikas K.; Rao, P.S.S.N.V. P.; Mathew, Thomas; Rao, S. B.
    We propose to discuss at length the problem of placement of one controllable covariate in the context of an experiment involving several ‘treatments’. We do this while extracting maximum information on the unknown parameter attached to the covariate’s values in the mean model for the observations. The experimental set-up is a bit different, and this calls for an interesting non-trivial study on optimality in the context of a single-covariate linear regression model.
  • Item
    Soil moisture conditions alter behavior of entomopathogenic nematodes
    (Wiley, 2024-01-22) Frankenstein, Dana; Luu, Macawan S; Luna-Ayala, Jennifer; Willett, Denis S; Filgueiras, Camila S
    A variety of environmental factors can disrupt biotic interactions between plants, insects and soil microorganisms with consequences for agricultural management and production. Many of these belowground interactions are mediated by volatile organic compounds (VOCs) which can be used for communication under appropriate environmental conditions. Behavioral responses to these compounds may likewise be dependent on varying soil conditions which are influenced by a changing climate. To determine how changing environmental conditions may affect VOC-mediated biotic interactions, we used a belowground system where entomopathogenic nematodes (EPNs) – tiny roundworm parasitoids of soil-borne insects – respond to VOCs by moving through the soil pore matrix. Specifically, we used two genera of EPNs – Heterorhabditis and Steinernema – that are known to respond to four specific terpenes – α-pinene, linalool, D-limonene and pregeijerene – released by the roots of plants in the presence of herbivores. We assessed the response of these nematodes to these terpenes under three moisture regimes to determine whether drier conditions or inundated conditions may influence the response behavior of these nematodes. RESULTS: Our results illustrate that the recovery rate of EPNs is positively associated with soil moisture concentration. As soil moisture concentration increases from 6% to 18%, substantially more nematodes are recovered from bioassays. In addition, we find that soil moisture influences EPN preference for VOCs, as illustrated in the variable response rates. Certain compounds shifted from acting as a repellent to acting as an attractant and vice versa depending on the soil moisture concentration. CONCLUSION: On a broad scale, we demonstrate that soil moisture has a significant effect on EPN host-seeking behavior. EPN efficacy as biological control agents could be affected by climate change projections that predict varying soil moisture concentrations.
We recommend that maintaining nematodes as biological control agents is essential for sustainable agriculture development, as they significantly contribute not only to soil health but also to efficient pest management. © 2024 The Authors. Journal of The Science of Food and Agriculture published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.
  • Item
    Performance Benchmarking of Data Augmentation and Deep Learning for Tornado Prediction
    (IEEE, 2020-02-24) Barajas, Carlos A.; Gobbert, Matthias; Wang, Jianwu
    Predicting violent storms and dangerous weather conditions with current models can take a long time due to the immense complexity associated with weather simulation. Machine learning has the potential to classify tornadic weather patterns much more rapidly, thus allowing for more timely alerts to the public. To deal with class imbalance challenges in machine learning, different data augmentation approaches have been proposed. In this work, we examine the wall time difference between live data augmentation methods versus the use of preaugmented data when they are used in a convolutional neural network based training for tornado prediction. We also compare CPU and GPU based training over varying sizes of augmented data sets. Additionally we examine what impact varying the number of GPUs used for training will produce given a convolutional neural network.
  • Item
    Weak and Strong Solutions for A Fluid-Poroelastic-Structure Interaction via A Semigroup Approach
    (2024-01-08) Avalos, George; Gurvich, Elena; Webster, Justin
    A filtration system, comprising a Biot poroelastic solid coupled to an incompressible Stokes free-flow, is considered in 3D. Across the flat 2D interface, the Beavers-Joseph-Saffman coupling conditions are taken. In the inertial, linear, and non-degenerate case, the hyperbolic-parabolic coupled problem is posed through a dynamics operator on an appropriate energy space, adapted from Stokes-Lamé coupled dynamics. A semigroup approach is utilized to circumvent issues associated with mismatched trace regularities at the interface. C0-semigroup generation for the dynamics operator is obtained with a non-standard maximality argument. The latter employs a mixed-variational formulation in order to invoke the Babuška-Brezzi theorem. The Lumer-Phillips theorem yields semigroup generation, and thereby, strong and generalized solutions are obtained. As the dynamics are linear, a standard argument by density obtains weak solutions; we extend this argument to the case where the Biot compressibility of constituents degenerates. Thus, for the inertial Biot-Stokes filtration, we provide a clear elucidation of strong and weak solutions, as well as their regularity through associated estimates.
  • Item
    Convergence of the mini-batch SIHT algorithm
    (Springer, 2024-01-10) Damadi, Saeed; Shen, Jinglai
    The Iterative Hard Thresholding (IHT) algorithm has been considered extensively as an effective deterministic algorithm for solving sparse optimization problems. The IHT algorithm benefits from the information of the batch (full) gradient at each point, and this information is key to the convergence analysis of the generated sequence. However, this strength becomes a weakness when it comes to machine learning and high dimensional statistical applications because calculating the batch gradient at each iteration is computationally expensive or impractical. Fortunately, in these applications the objective function has a summation structure that can be taken advantage of to approximate the batch gradient by the stochastic mini-batch gradient. In this paper, we study the mini-batch Stochastic IHT (SIHT) algorithm for solving sparse optimization problems. As opposed to previous works where an increasing and variable mini-batch size is necessary for the derivation, we fix the mini-batch size according to a lower bound that we derive. To prove stochastic convergence of the objective function value, we first establish a critical sparse stochastic gradient descent property. Using this property, we show that the sequence generated by the mini-batch SIHT is a supermartingale sequence and converges with probability one. Unlike previous work, we do not assume the function to be restricted strongly convex. To the best of our knowledge, in the regime of sparse optimization, this is the first time in the literature that the sequence of stochastic function values has been shown to converge with probability one with a mini-batch size fixed for all steps.
  • Item
    The Holistic Prioritized SATCOM Throughput Requirements (HPSTR) Stochastic Model
    (2023-12) Wesloh, Matthew; Douglas, Noelle; White, Brianne; Shallcross, Nicholas
    The U.S. Army's command and control modernization efforts rely upon an expeditionary, mobile, hardened, and resilient network. Dispersed network access and data availability are central to increasing the operational speed required for effective command and control. The Army must define its satellite communication (SATCOM) requirements to support network modernization. This paper proposes the Holistic Prioritized SATCOM Throughput Requirements (HPSTR) simulation that prioritizes and adjudicates SATCOM throughput requirements for operational military units. Additionally, the simulation evaluates the impact of a contested, degraded, and operationally limited (CDO) communication environment on force effectiveness. HPSTR addresses knowledge gaps concerning U.S. Army SATCOM activities in a large-scale combat operation (LSCO) to inform modernization decisions.
  • Item
    Inference about a Common Mean Problem in the Context of Univariate and Multivariate Settings
    (2023-01-01) Moluh, Alain; Yehenew, Kifle; Bimal, Sinha; Mathematics and Statistics; Statistics
    This Ph.D. thesis addresses the challenge of validating research hypotheses using multiple datasets and focuses on developing efficient testing methods with maximum discriminatory power to distinguish between true and false hypotheses. This task is particularly complex when hypotheses are closely positioned, presenting a demanding scenario. The thesis extensively explores strategies for amalgamating results from k independent studies that share a common goal, considering both univariate and multivariate common mean problems with limited information about dispersion parameters. Local powers of different synthesis methods are compared to ascertain their effectiveness. For the univariate common mean problem, explicit expressions for local power are derived for several exact tests, facilitating a comprehensive comparison. The investigation reveals that a uniform comparison of these tests, irrespective of unknown variances, is possible for equal sample sizes. Remarkably, the Inverse Normal p-value-based exact test emerges as the most effective, and the Modified-F exact test exhibits a notable advantage among modified distribution-based tests. The study extends to the multivariate common mean problem, encompassing inference about a common mean vector from independent multivariate normal populations with unknown, possibly unequal dispersion matrices. An unbiased estimate of the common mean vector, along with its asymptotic estimated variance, is proposed for hypothesis testing and confidence ellipsoid construction, applicable to large samples. Multiple exact test procedures and confidence set construction techniques are meticulously explored, accompanied by a comparative analysis based on local power. The results reveal that, in the special case of equal sample sizes, local powers are directly comparable regardless of the unknown dispersion matrices, with Inverse Normal and Jordan-Kris methods exhibiting superior performance.
The thesis also addresses scenarios where studies contain either univariate or bivariate features, requiring distinct statistical meta-analysis approaches. Methods for hypothesis testing in the common mean problem are demonstrated, considering various strategies for estimating between-studies variability parameters in cases where homogeneity assumptions are violated. The presented methods are illustrated using simulated and real datasets. In summary, this Ph.D. thesis contributes a comprehensive exploration of synthesis methods for validating research hypotheses in the context of univariate and multivariate common mean problems. The work showcases the effectiveness of specific exact tests and synthesis approaches, providing valuable insights for conducting statistical meta-analysis across diverse scenarios.
  • Item
    Flexible Joint Models For Screening Studies
    (2023-01-01) Roy, Siddharth; Liu, Danping; Roy, Anindya; Mathematics and Statistics; Statistics
    In this dissertation, we study and develop statistical methods to jointly analyze longitudinal biomarkers with time-to-event outcomes motivated by risk assessment in cancer screening. Cancer screening studies collect longitudinal biopsies to allow the identification of additional longitudinal biomarker measurements for risk stratification. However, these studies present several challenges for current joint modeling approaches. In Chapter 2, we develop a dynamic risk prediction approach that links both continuous and binary biomarkers to the interval-censored precancer outcome with shared high dimensional random effects. A cancer screening dataset shows improved risk stratification compared to univariate joint models. In Chapter 3, we develop a latent health model for at-risk patients that can both deteriorate to case status and improve to low-risk status. We link the change in the health process to a longitudinal biomarker whose trajectory can change based on the event. We see that treating individuals who become risk-free as right censored and ignoring the event's impact on the biomarker trajectory can result in significantly biased risk estimates. In Chapter 4, we compare three common approaches to identify longitudinal biomarkers associated with survival outcomes: joint models, conditional models, and time-dependent Cox models. We use simulations to evaluate how well the methods identify and distinguish biomarkers that are useful for long-term risk assessment and early detection.
  • Item
    (2023-01-01) Nguyen, Luan; Kang, Hye-Won; Mathematics and Statistics; Mathematics, Applied
    Chemical reaction networks are used to describe many biological processes including metabolic pathways. Due to interactions between different chemical species, we can see interesting dynamics of the chemical systems such as oscillations, spatial patterns, and self-assembly. In this dissertation, we investigate several chemical reaction networks in glucose metabolism and explore their interesting dynamic behaviors. First, we study the well-mixed glycolytic pathway involving two chemical species. The ODE model shows limit cycle behavior for some parameter values. We enlarge the glycolytic pathway so that we can control the limit cycle behavior. In addition, we also look at the stochastic dynamics of the enlarged network while varying certain parameter values. Next, we consider the spatially-distributed glycolytic pathway. The stochastic model for the glycolytic pathway shows interesting spatial patterns when the corresponding deterministic model exhibits the Turing instability. The compartment size in the stochastic model affects spatial pattern formation. Thus, we estimate the appropriate compartment size using the mean lifetime of chemical species. Last, we develop a stochastic model to describe the PFKL condensate formation using the Langevin dynamics. We find several key parameter values using numerical simulations of the stochastic model via LAMMPS.
  • Item
    (2023-01-01) Hore, Gaurab; Roy, Anindya; McElroy, Tucker S; Mathematics and Statistics; Statistics
    This dissertation addresses the critical issue of ensuring privacy while preserving data utility in the context of univariate and multivariate time series data. The conventional noise-addition privacy mechanism can distort autocorrelation patterns of time series, thus compromising utility. To overcome this limitation, in a recent paper by McElroy et al. (2023), a mechanism called FLIP has been introduced that uses all-pass filtering to obtain a privatized time series having second-order utility. FLIP has limitations in two key aspects: first, obtaining an all-pass filter becomes significantly more complex in the context of multivariate time series data; and second, it relies on estimation of the spectral density of the time series. To overcome the limitations of FLIP, this dissertation presents two novel approaches. First, we propose a multivariate all-pass (MAP) filtering method, employing an optimization algorithm to achieve the best balance between data utility and privacy protection. A model-agnostic data release mechanism is proposed in the next chapter, which ensures privacy by employing filtering with random coefficients. To preserve data utility, the method releases multiple copies of the time series perturbed using independent random filters. This approach eliminates the need for estimation of the spectral density, making it a more versatile solution for data-producing agencies. The practical performance of the proposed methods is rigorously assessed through numerical studies, including both simulated data and real data sourced from the U.S. Census Bureau’s Quarterly Workforce Indicator (QWI) dataset. By combining the insights from both chapters, this dissertation contributes to the evolving field of privacy mechanisms and data privatization for time series data, offering innovative solutions to the ever-growing challenges faced by data producers and curators.
  • Item
    Enhancing Real-Time Imaging for Radiotherapy: Leveraging Hyperparameter Tuning with PyTorch
    (2023) Baird, Kaelen; Kadel, Sam; Kaufmann, Brandt; Obe, Ruth; Soltani, Yasmin; Cham, Mostafa; Gobbert, Matthias; Barajas, Carlos A.; Jiang, Zhuoran; Sharma, Vijay R.; Ren, Lei; Peterson, Stephen W.; Polf, Jerimy C.
    Proton beam therapy is an advanced form of cancer radiotherapy that uses high-energy proton beams to deliver precise and targeted radiation to tumors, mitigating unnecessary radiation exposure to surrounding healthy tissues. Utilizing real-time imaging of prompt gamma rays can enhance the effectiveness of this therapy. Compton cameras are proposed for this purpose, capturing prompt gamma rays emitted by proton beams as they traverse a patient’s body. However, the Compton camera’s non-zero time resolution results in simultaneous recording of interactions, causing reconstructed images to be noisy and lacking the necessary level of detail to effectively assess proton delivery for the patient. In an effort to address the challenges posed by the Compton camera’s resolution and its impact on image quality, machine learning techniques, such as recurrent neural networks, are employed to classify and refine the generated data. These advanced algorithms can effectively distinguish various interaction types and enhance the captured information, leading to more precise evaluations of proton delivery during the patient’s treatment. To achieve the objectives of enhancing data captured by the Compton camera, a PyTorch model was specifically designed. This decision was driven by PyTorch’s flexibility, powerful capabilities in handling sequential data, and enhanced GPU usage, accelerating the model’s computations and further optimizing the processing of large-scale data. The model successfully demonstrated faster training performance compared to previous approaches and achieved fair overall accuracy with only limited hyperparameter tuning so far, highlighting its effectiveness in advancing real-time imaging of prompt gamma rays for enhanced evaluation of proton delivery in cancer therapy.
  • Item
    Exploring the Learning Potential of ELM from Finite Difference Solutions for Heat Equation
    (IEEE, 2023-11-27) Ahmad, Muhammad Jalil; Gunel, Korhan
    The study compares two methods, the finite difference method and the extreme learning machine (ELM), for solving the one-dimensional heat equation. The finite difference method is a classical numerical method, while ELM is a machine learning-based approach that does not use a trial function to represent the solution. The results show that ELM can learn the finite difference solution. However, it should be noted that ELM faces challenges when directly solving the one-dimensional heat equation itself, as shown in the results. Despite the limitations observed in directly solving the one-dimensional heat equation with ELM, the study suggests that ELM still holds promise as a potentially viable alternative to classical numerical methods for solving partial differential equations (PDEs). Further research could explore incorporating optimization methods or employing a two-phase neural network, as proposed in our future work, to improve the accuracy of ELM's predictions for PDEs.
  • Item
    False Discovery Rate Controlling Procedures with BLOSUM62 substitution matrix and their application to HIV Data
    (2023-11-25) Kim, Kyurhi; Park, Junyong; Park, Dohwan; Giraldo, Mileiy; Aldunate, Muriel; Spouge, John L.; Tachedjian, Gilda
    Identifying significant sites in sequence data and analogous data is of fundamental importance in many biological fields. Fisher's exact test is a popular technique; however, this approach is not appropriate for sparse count data because it leads to conservative decisions. Since count data in HIV data are typically very sparse, it is crucial to incorporate additional information into statistical models to improve testing power. In order to develop new approaches to incorporate biological information in the false discovery controlling procedure, we propose two models: one based on the empirical Bayes model under independence of amino acids, and the other using pairwise associations of amino acids based on a Markov random field with the BLOSUM62 substitution matrix. We apply the proposed methods to HIV data and identify significant sites by incorporating the BLOSUM62 matrix, while the traditional method based on Fisher's test does not discover any site. These newly developed methods have the potential to handle many biological problems in the studies of vaccine and drug trials and phenotype studies.
  • Item
    Accelerating Real-Time Imaging for Radiotherapy: Leveraging Multi-GPU Training with PyTorch
    (2023-10-02) Obe, Ruth; Kaufmann, Brandt; Baird, Kaelen; Kadel, Sam; Soltani, Yasmin; Cham, Mostafa; Gobbert, Matthias; Barajas, Carlos A.; Jiang, Zhuoran; Sharma, Vijay R.; Ren, Lei; Peterson, Stephen W.; Polf, Jerimy C.
    Proton beam therapy is an advanced form of cancer radiotherapy that uses high-energy proton beams to deliver precise and targeted radiation to tumors. This helps to mitigate unnecessary radiation exposure in healthy tissues. Real-time imaging of prompt gamma rays with Compton cameras has been suggested to improve therapy efficacy. However, the camera’s non-zero time resolution leads to incorrect interaction classifications and noisy images that are insufficient for accurately assessing proton delivery in patients. To address the challenges posed by the Compton camera’s image quality, machine learning techniques are employed to classify and refine the generated data. These machine-learning techniques include recurrent and feedforward neural networks. A PyTorch model was designed to improve the data captured by the Compton camera. This decision was driven by PyTorch’s flexibility, powerful capabilities in handling sequential data, and enhanced GPU usage, which accelerates the model’s computations on large-scale radiotherapy data. Through hyperparameter tuning, the validation accuracy of our PyTorch model has been improved from an initial 7% to over 60%. Moreover, the PyTorch Distributed Data Parallelism strategy was used to train the RNN models on multiple GPUs, which significantly reduced the training time with a minor impact on model accuracy.
  • Item
    Completely mixed linear games corresponding to Z-transformations over self-dual cones
    (2023-10-20) Gowda, M. Seetharama
    In the setting of a self-dual cone in a finite dimensional inner product space, we consider (zero-sum) linear games. In our previous work, we showed that a Z-transformation with positive value is completely mixed. In the present paper, we consider the case when the value is zero. Motivated by the result (in the classical setting) that a Z-matrix with value zero is completely mixed if and only if it is irreducible, we formulate our general results based on the concepts of cone-irreducibility and space-irreducibility. In the setting of a symmetric cone (in a Euclidean Jordan algebra), we show that the space-irreducibility condition is necessary for a Z-transformation with value zero to be completely mixed and that it is sufficient when the Z-transformation is the difference of a Lyapunov-like transformation and a positive transformation. Additionally, we show that cone-irreducibility and space-irreducibility are equivalent for a positive transformation on a symmetric cone.
  • Item
    Mode switching in organisms for solving explore-versus-exploit problems
    (Nature, 2023-10-26) Biswas, Debojyoti; Lamperski, Andrew; Yang, Yu; Hoffman, Kathleen; Guckenheimer, John; Fortune, Eric S.; Cowan, Noah J.
    Trade-offs between producing costly movements for gathering information (‘explore’) and using previously acquired information to achieve a goal (‘exploit’) arise in a wide variety of problems, including foraging, reinforcement learning and sensorimotor control. Determining the optimal balance between exploration and exploitation is computationally intractable, necessitating heuristic solutions. Here we show that the electric fish Eigenmannia virescens uses a salience-dependent mode-switching strategy to solve the explore–exploit conflict during a refuge-tracking task in which the same category of movement (fore-aft swimming) is used for both gathering information and achieving task goals. The fish produced distinctive non-Gaussian distributions of movement velocities characterized by sharp peaks for slower, task-oriented ‘exploit’ movements and broad shoulders for faster ‘explore’ movements. The measures of non-normality increased with increased sensory salience, corresponding to a decrease in the prevalence of fast explore movements. We found the same sensory salience-dependent mode-switching behaviour across ten phylogenetically diverse organisms, from amoebae to humans, performing tasks such as postural balance and target tracking. We propose a state-uncertainty-based mode-switching heuristic that reproduces the distinctive velocity distribution, rationalizes modulation by sensory salience and outperforms the classic persistent excitation approach while using less energy. This mode-switching heuristic provides insights into purposeful exploratory behaviours in organisms, as well as a framework for more efficient state estimation and control of robots.
  • Item
    State and Parameter Estimation in Stochastic Dynamical Systems
    (2023-01-01) Yu, Mingkai; Rathinam, Muruhan; Mathematics and Statistics; Mathematics, Applied
    Stochastic dynamical systems arise in the modeling of intracellular biological processes driven by diffusion and reaction of molecules. While several fluorescence based techniques exist for partial measurements of these systems, it remains a challenge to estimate the full state and parameters of such systems from the partially observed data. In this thesis, we study two problems: inferring the diffusivity from raw Fluorescence Correlation Spectroscopy (FCS) data, and estimating state and parameters in partially observed chemical reaction networks. Fluorescence intensity could be modeled as a stochastic process governed by the motion of particles. The autocorrelation of the intensity has an analytical form as a function of time lag and diffusion coefficient, hence the diffusion coefficient could be fitted. We present a mathematical derivation of the autocorrelation function. We also derive a formula for the variance of the time average of the autocorrelation function in integral form, and give a closed-form upper bound. We examine several different approaches to fit the diffusivity via Monte Carlo simulations. Furthermore, we analyze the sensitivity of the diffusivity $D$ obtained via the process of nonlinear least squares fit. We also study chemical reaction networks modeled by a discrete-state continuous-time Markov process, where the state vector represents the copy number of the species. We consider two scenarios, one in which exact observations of some species are made continuously in time over a window and the other in which observations are made in snapshots of time. For the continuous in time observation problem, we derive equations for the conditional probability distribution of the unobserved states. We also provide a novel particle filter to compute the conditional probability distribution of the unobserved species. We also adapt our algorithm for the estimation of parameters and for a past state value based on observations up to a future time. 
For the filtering of stochastic reaction networks when some states are observed noiselessly in snapshots of time, we propose a targeting algorithm that ensures the filter process reaches the target state. We present a rigorous proof of our algorithm, discuss the choices we could make within the implementation of the algorithm, and use numerical examples to illustrate our algorithm.
  • Item
    Statistical Meta-Analysis: Air Pollution & Children’s Health
    (University of Rajshahi, 2011) Stanwyck, Elizabeth; Wei, Rong
    There have been numerous studies seeking to establish an association between air pollution and children’s adverse health outcomes, and the ultimate findings are often varied. Some studies find a statistically significant association between an increase in a specific pollutant and an adverse health effect among children, while others find a non-significant association between the same pair of variables. These conflicting results undermine confidence in the final conclusions, and this leads naturally to a novel application of statistical meta-analysis, whose primary objective is to integrate or synthesize the findings from independent and comparable studies. In this paper we first review a recent statistical meta-analysis paper by Weinmayr et al. (2010) dealing with studies on the effects of NO₂ and PM₁₀ on some aspects of children’s health. In the second part of this paper, we conduct our own meta-analysis focusing on the association between children’s (binary) health outcomes (such as cough and respiratory symptoms) and four pollutants: PM₁₀, NO₂, SO₂, and O₃. While we find a statistically significant association with every pollutant, it turns out that for PM₁₀, NO₂, and SO₂, there is significant heterogeneity among the estimated effect sizes (odds ratios). Finally, we explore the techniques of meta-regression by incorporating distinct study features to meaningfully explain the heterogeneity.
  • Item
    CYTOSKELETON - Microtubule Dynamics in the Cell Cycle
    (QUBES, 2023-10-06) Pie, Hannah; Hoffman, Kathleen
    This module contains exercises designed to help upper-level cell biology students understand the dynamics of microtubule polymerization and depolymerization within the cell cycle and how cancer treatments influence this process through interpretation of graphical information, use of dimensional analysis, and comparison of rates of change.
  • Item
    Hitting a prime in 2.43 dice rolls (on average)
    (Enumerative Combinatorics and Applications, 2023-09-05) Malinovsky, Yaakov; Alon, Noga