Some Approximate Confidence Intervals and Regions for Inter-laboratory Data Analysis

Date

2016-01-01

Department

Mathematics and Statistics

Program

Statistics

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.

Abstract

The basic problem in inter-laboratory studies is inference on a common mean, also called the consensus mean, when the measurements are obtained by several laboratories. Measurements from different laboratories exhibit different within-laboratory variances in addition to between-laboratory variability. To capture both sources of variability, a natural model for the data is a one-way random model with heteroscedastic error variances. Inter-laboratory data analysis has received considerable attention in the literature, and statistical inference under such a heteroscedastic one-way random model has been investigated in a number of articles. While likelihood-based methods are a natural choice for statistical inference, it is important to look for methodologies that remain accurate when sample sizes are small. Higher order asymptotics deals with modifications of the usual likelihood ratio procedures that achieve accurate small-sample performance. In the thesis, a higher order asymptotic procedure is applied to the interval estimation of the consensus mean in inter-laboratory studies. The interval estimation of the inter-laboratory variance component is also addressed, although this is a problem of secondary interest. Numerical results are reported to show that the proposed solutions are accurate in terms of maintaining the coverage probability. For computational simplicity, the actual likelihood was not used in developing some of the confidence intervals; instead, a simplified version was used in which the within-laboratory variances are replaced by the corresponding sample variances, so that the only unknown parameters in the likelihood are the consensus mean and the between-laboratory variance component. It is also noted that the Wald statistic can be used to obtain accurate confidence intervals for the consensus mean, provided a parametric bootstrap distribution is used instead of the usual normal approximation. The results are all illustrated with examples.
A feature of analytical measurements is that they may exhibit increasing measurement variation with increasing analyte concentrations. This property cannot be captured by a model that is linear in the concentrations. While a log-linear model can capture this feature, such a model fails to explain the near-constant measurement variation seen at low concentration levels. To accommodate both increasing measurement variation at higher analyte concentrations and near-constant measurement variation at low concentrations, a two-component measurement error model was introduced by Rocke and Lorenzato in a 1995 Technometrics article. The model is now referred to as the Rocke-Lorenzato model, and their work has attracted considerable attention. The original Rocke-Lorenzato model is for measurements made at a single laboratory; later researchers have extended the model to multi-laboratory scenarios. In the thesis, the single-laboratory model and its multi-laboratory generalizations are taken up, and accurate statistical inference is developed for various parameters of interest. A heteroscedastic within-laboratory variance scenario is also investigated, a situation that has not previously been studied in the literature in the context of the Rocke-Lorenzato model. An important problem in this context is the calibration problem, i.e., the interval estimation of an unknown analyte concentration after obtaining the corresponding responses. All of the relevant inference problems are addressed in the dissertation using the modified likelihood ratio procedure and also the Wald statistic with a parametric bootstrap. Accuracy of the proposed solutions is assessed using estimated coverage probabilities.
The last topic discussed in the thesis is a multivariate generalization of the heteroscedastic one-way random model. The problems addressed include the computation of a confidence region for the consensus mean vector and the estimation of the inter-laboratory variance component matrix. A full likelihood-based analysis appears to be computationally challenging. For the point estimation of the inter-laboratory variance component matrix, a simple unbiased estimator is first considered, and the estimator is then modified by appropriate shrinkage to obtain an improved estimator in terms of mean squared error. For computing a confidence region for the consensus mean vector, some solutions are obtained by following a "likelihood-type" approach with two modifications: (i) a simplified likelihood function is used in which the within-laboratory variance-covariance matrices are replaced by the corresponding sample counterparts, and (ii) a parametric bootstrap approach is used to obtain the percentile required for the confidence region. With these modifications, accurate and computationally tractable confidence regions can be obtained for the consensus mean vector. Numerical results are reported on the coverage probabilities, and illustrative examples are given based on real data as well as simulated data. Furthermore, plots are given for the confidence regions in the bivariate and trivariate cases. SAS code is available for implementing all the confidence intervals and confidence regions developed in the thesis.
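To make the univariate idea concrete, the following is a minimal Python sketch of a parametric bootstrap Wald interval for the consensus mean under a simplified likelihood of the kind the abstract describes. It is an illustration written for this record, not the thesis's SAS code, and it rests on several assumptions that are not stated in the abstract: laboratory means are modeled as normal with variance equal to the between-laboratory variance plus s_i^2 / n_i; the between-laboratory variance is estimated by maximizing the profile likelihood over a bounded interval; and bootstrap sample variances are drawn from scaled chi-square distributions. The function names `fit_simplified` and `bootstrap_wald_ci` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def fit_simplified(ybar, v):
    """Fit the simplified model ybar_i ~ N(mu, tau2 + v_i), v_i treated as known.

    Returns (mu_hat, tau2_hat, se_hat), where se_hat is the plug-in standard
    error of the weighted mean. Assumption: v_i = s_i^2 / n_i.
    """
    ybar, v = np.asarray(ybar, float), np.asarray(v, float)

    def neg_profile_loglik(tau2):
        # For fixed tau2 the maximizing mu is the inverse-variance weighted mean.
        w = 1.0 / (tau2 + v)
        mu = np.sum(w * ybar) / np.sum(w)
        return 0.5 * np.sum(np.log(tau2 + v) + w * (ybar - mu) ** 2)

    # Heuristic upper bound for the search (an assumption of this sketch).
    # The bounded search returns a point near 0, not exactly 0, when the
    # maximum lies on the boundary, and it may miss the global maximum if
    # the profile likelihood is multimodal.
    upper = 10.0 * np.var(ybar) + np.max(v)
    res = minimize_scalar(neg_profile_loglik, bounds=(0.0, upper), method="bounded")
    tau2 = float(res.x)
    w = 1.0 / (tau2 + v)
    mu = float(np.sum(w * ybar) / np.sum(w))
    se = float(1.0 / np.sqrt(np.sum(w)))
    return mu, tau2, se


def bootstrap_wald_ci(ybar, s2, n, level=0.95, n_boot=2000, seed=0):
    """Equal-tailed interval for mu from the bootstrap distribution of the Wald statistic."""
    ybar, s2, n = (np.asarray(a, float) for a in (ybar, s2, n))
    rng = np.random.default_rng(seed)
    v = s2 / n
    mu_hat, tau2_hat, se_hat = fit_simplified(ybar, v)

    t_star = np.empty(n_boot)
    for b in range(n_boot):
        # Simulate lab means and sample variances from the fitted model.
        y_b = rng.normal(mu_hat, np.sqrt(tau2_hat + v))
        s2_b = s2 * rng.chisquare(n - 1) / (n - 1)
        mu_b, _, se_b = fit_simplified(y_b, s2_b / n)
        t_star[b] = (mu_b - mu_hat) / se_b

    alpha = 1.0 - level
    q_lo, q_hi = np.quantile(t_star, [alpha / 2, 1.0 - alpha / 2])
    # Invert q_lo <= (mu_hat - mu) / se_hat <= q_hi for mu.
    return mu_hat, mu_hat - q_hi * se_hat, mu_hat - q_lo * se_hat
```

Calling `bootstrap_wald_ci(ybar, s2, n)` with arrays of laboratory means, sample variances, and sample sizes (each laboratory needs at least two observations) returns the point estimate followed by the lower and upper limits. Replacing the bootstrap quantiles with standard normal quantiles gives the usual Wald interval, which is the approximation the abstract reports as less accurate for small numbers of laboratories.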