Problems in group testing estimation and design

Author/Creator

Author/Creator ORCID

Date

2018-01-01

Department

Mathematics and Statistics

Program

Statistics

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Subjects

Abstract

Group testing, which includes any procedure in which units are tested in pools rather than individually, has been an active area of research in the statistical literature for over 70 years. Much of this research has focused on the problem of estimation, in which random variables representing some binary trait are pooled together to estimate the underlying Bernoulli parameter. The effective use of such procedures has been shown to lead to large reductions in mean squared error (MSE), yielding more accurate estimates or allowing fewer tests to be carried out while maintaining a fixed level of MSE. However, group testing estimation problems are equivalent to estimating a nonlinear function of the underlying parameter, so previous work has relied heavily on large-sample methods to establish results. In practice, group testing problems usually involve very small sample sizes, so such methods may be inappropriate. In this dissertation we explore several problems related to group testing estimation and design based on small-sample methods. The first problem we consider is the construction of unbiased estimators. While the standard binomial model does not yield an unbiased estimator, we give a construction based on an inverse binomial model, which samples until a fixed number of negative pools are observed. This is extended to include cases where misclassification errors are present, and we show that, while an unbiased estimator can be constructed, it is improper, yielding values outside the parameter space. This result is extended to the entire class of binomial sampling plans when misclassification is present, showing that no proper unbiased estimator exists in this broad class. These ideas are extended again to the case of multinomial sampling, where we show that under any sampling plan it is impossible to find a proper unbiased estimator, even without misclassification.
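The contrast between the binomial and inverse binomial models can be illustrated numerically. The following is a minimal sketch, not the dissertation's own code: it assumes perfect tests, pools of a common size `k`, and a prevalence `p`, so that a pool is negative with probability `(1 - p)**k`. The function names and the parameters `n` (number of pools) and `c` (number of negative pools at which sampling stops) are hypothetical. The inverse binomial estimator shown is obtained here by matching power-series coefficients of the negative binomial distribution and may differ in form from the one given in the dissertation; it requires `c > 1/k`.

```python
from math import comb, exp, lgamma

def mle(x, n, k):
    """MLE of p from x positive pools among n pools of size k (binomial model)."""
    return 1.0 - (1.0 - x / n) ** (1.0 / k)

def mle_bias(p, n, k):
    """Exact bias of the MLE, summing over all possible counts of positive pools."""
    pi = 1.0 - (1.0 - p) ** k  # probability that a pool tests positive
    mean = sum(comb(n, x) * pi**x * (1.0 - pi) ** (n - x) * mle(x, n, k)
               for x in range(n + 1))
    return mean - p

def inv_binom_estimate(n_pos, c, k):
    """Estimate of p after sampling stops at the c-th negative pool,
    having observed n_pos positive pools. Assumes c > 1/k."""
    a = 1.0 / k
    # Ratio Gamma(n_pos + c - a) * Gamma(c) / (Gamma(c - a) * Gamma(n_pos + c)),
    # computed on the log scale for numerical stability.
    log_g = lgamma(n_pos + c - a) + lgamma(c) - lgamma(c - a) - lgamma(n_pos + c)
    return 1.0 - exp(log_g)

def inv_binom_mean(p, c, k, terms=200):
    """Expected value of the inverse binomial estimate (truncated series)."""
    theta = (1.0 - p) ** k  # probability that a pool tests negative
    return sum(comb(n + c - 1, n) * theta**c * (1.0 - theta) ** n
               * inv_binom_estimate(n, c, k)
               for n in range(terms))

if __name__ == "__main__":
    p, k = 0.05, 5
    print("MLE bias with n = 10 pools:", mle_bias(p, 10, k))
    print("Inverse binomial mean with c = 3:", inv_binom_mean(p, 3, k))
```

Under these assumptions the MLE overestimates `p` on average, because it is a convex function of the proportion of positive pools, while the mean of the inverse binomial estimate agrees with `p` up to truncation and rounding error.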
The next problem we consider is the estimation of two diseases simultaneously using group testing methods. No closed-form MLE exists in this case, and numerical methods are difficult due to a high frequency of boundary estimates. We propose an EM-algorithm-based estimator and provide proofs of convergence, even on the boundary of the parameter space. Several closed-form alternatives are also provided, primarily with the aim of bias reduction. The final problem we consider is that of choosing the group size for experiments when only a small number of tests can be carried out. Previous methods have relied heavily on good prior knowledge of the parameter value to be estimated, with the needed accuracy decreasing with the sample size. We propose simple random-walk-based adaptive procedures which minimize the need for such prior information. These designs are shown numerically to outperform the large-sample-based methods previously found in the literature. These methods are extended to the case when misclassification errors are present, with similar results.