The Simultaneous Assessment of Normality and Homoscedasticity in Some Linear Models

Date

2016-01-01

Department

Mathematics and Statistics

Program

Statistics

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu.
Distribution Rights granted to UMBC by the author.

Abstract

Model-based statistical inference typically relies on several assumptions concerning the underlying model, so the inference may not remain valid if those assumptions are violated. It is therefore important to check whether the data are consistent with the model, and model diagnostics are routinely performed as part of the data analysis. In the analysis of variance (ANOVA) methodology, linear models are typically used, and two of the crucial assumptions are normality and homoscedasticity. The assessment of normality, based on graphical methods or formal tests, is usually carried out under the homoscedasticity assumption. On the other hand, most tests for homoscedasticity are sensitive to the normality assumption. Thus it is highly desirable to have a methodology that can be used for the simultaneous assessment of normality and homoscedasticity. The present work develops such methodologies under some univariate fixed and random effects models, and under a bivariate fixed effects model. This is accomplished by embedding the normal distribution within the class of so-called smooth alternatives. A major advantage of such a formulation is that the class of possible alternatives depends on only a finite number of parameters. One can then develop a likelihood-based test procedure for simultaneously testing normality and homoscedasticity. It turns out that for such a testing problem, the score test is particularly convenient to implement. The specification of the smooth alternative involves orthogonal polynomials up to a specified order. Both Legendre polynomials and Hermite polynomials are used in the thesis to specify the smooth alternative. The order of the smooth alternative can be data-driven, i.e., estimated from the data after an upper bound is specified.
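To make the idea of a Legendre-based smooth alternative concrete, the following is a minimal sketch of the classical Neyman smooth statistic for a fully specified N(0, 1) null. It is an illustration only, not the score statistic derived in the thesis: the function names `normalized_legendre` and `neyman_smooth_statistic`, the default order of 4, and the use of a fully specified null (no estimated means or variances, no group structure) are all assumptions made here for the sake of the example.

```python
import numpy as np
from numpy.polynomial import legendre
from scipy import stats


def normalized_legendre(u, order):
    """Orthonormal shifted Legendre polynomials of degree 1..order on [0, 1].

    Returns an array of shape (order, n). Row j-1 holds
    sqrt(2j + 1) * P_j(2u - 1), which has mean 0 and variance 1
    when u is uniform on [0, 1].
    """
    x = 2.0 * np.asarray(u, dtype=float) - 1.0
    rows = []
    for j in range(1, order + 1):
        coef = np.zeros(j + 1)
        coef[j] = 1.0  # select the degree-j Legendre polynomial
        rows.append(np.sqrt(2 * j + 1) * legendre.legval(x, coef))
    return np.vstack(rows)


def neyman_smooth_statistic(z, order=4):
    """Sum of squared smooth components for H0: z is an i.i.d. N(0, 1) sample.

    Hypothetical helper for illustration. With a fully specified null,
    the statistic is asymptotically chi-square with `order` degrees of
    freedom; that reference distribution does not carry over unchanged
    once parameters are estimated.
    """
    z = np.asarray(z, dtype=float)
    u = stats.norm.cdf(z)  # probability integral transform
    h = normalized_legendre(u, order)
    components = h.sum(axis=1) / np.sqrt(z.size)
    return float(np.sum(components ** 2))
```

A large value of the statistic indicates departure from the null in the direction of one or more of the polynomial components. In this simplified setting, setting every coefficient of the smooth alternative to zero recovers the null density, which is why the alternative is described by finitely many parameters.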
The data are assumed to fall into different groups that may exhibit heteroscedasticity, and smooth alternatives are specified for the development of score tests that can be used for the simultaneous assessment of normality and homoscedasticity. In deriving the tests, two possibilities are considered: a common smooth alternative across all groups, and different smooth alternatives for different groups. The models taken up include a general univariate fixed effects model, a bivariate fixed effects model, and the one-way random model with balanced or unbalanced data. In the case of the one-way random model, the problem addressed is the simultaneous assessment of normality of the random effects, normality of the error terms, and homoscedasticity. The score tests are developed using Legendre polynomial-based and Hermite polynomial-based smooth alternatives, and data-driven choices are developed for determining the order of the polynomials. It turns out that the asymptotic chi-square null distribution associated with the score statistic is not always satisfactory, so the percentiles required to carry out the tests can instead be estimated by simulation. Computational aspects are addressed, and the results are illustrated with examples. Generalization of the smooth test idea to more general mixed or random effects models appears difficult, unless the data are balanced and only a test for normality is desired assuming homoscedasticity. The difficulties are highlighted and illustrated using a two-way random model with interaction and a two-way mixed model with interaction.
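The simulation-based estimation of percentiles mentioned above can be sketched as follows for a one-way fixed effects layout. This is a hedged illustration under stated assumptions, not the procedure of the thesis: the stand-in statistic `smooth_stat_from_groups` (a Legendre-type smooth statistic computed on pooled standardized residuals), the function `simulate_null_percentile`, and the defaults for `order`, `n_sim`, `level`, and `seed` are all hypothetical. The sketch relies on the fact that the standardized residuals do not depend on the group means or the common variance, so the null distribution can be simulated from standard normal data.

```python
import numpy as np
from scipy import stats
from scipy.special import eval_legendre


def smooth_stat_from_groups(groups, order=4):
    """Stand-in smooth statistic on pooled standardized residuals.

    Residuals are centered at group means and scaled by the pooled
    standard deviation, so the result is invariant to group-wise
    shifts and a common rescaling of the data.
    """
    resid = np.concatenate([np.asarray(g, dtype=float) - np.mean(g) for g in groups])
    dof = resid.size - len(groups)
    z = resid / np.sqrt(np.sum(resid ** 2) / dof)
    x = 2.0 * stats.norm.cdf(z) - 1.0  # map to [-1, 1] for Legendre polynomials
    total = 0.0
    for j in range(1, order + 1):
        component = np.sum(np.sqrt(2 * j + 1) * eval_legendre(j, x)) / np.sqrt(resid.size)
        total += component ** 2
    return float(total)


def simulate_null_percentile(group_sizes, order=4, n_sim=2000, level=0.95, seed=0):
    """Estimate a null percentile of the stand-in statistic by Monte Carlo."""
    rng = np.random.default_rng(seed)
    sims = np.empty(n_sim)
    for b in range(n_sim):
        # Under the null, data are normal with a common variance; by
        # invariance, standard normal draws suffice for every group.
        groups = [rng.standard_normal(n) for n in group_sizes]
        sims[b] = smooth_stat_from_groups(groups, order)
    return float(np.quantile(sims, level))
```

For example, `simulate_null_percentile([10, 15, 20])` returns an estimated 95th percentile for three groups of those sizes; an observed statistic exceeding it would lead to rejection at the 5% level in this simplified setting.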