Statistical Analysis of a Case-Control Alzheimer's Disease: A Retrospective Approach with Sufficient Dimension Reduction

Date

2015

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Alzheimer's disease is a neurological disorder, chiefly present in the elderly, that impairs brain functions such as memory and logic and eventually results in death. There is no known cure for Alzheimer's, and evidence points to the possibility of a genetic link. This study analyzes microarray data from patients with Alzheimer's disease and disease-free patients in order to identify differential gene expression patterns between the two groups. The statistical problem posed by these data involves many predictor variables and a small sample size, which renders classical statistical approaches ineffective. We turn to a novel three-step approach: first, we screen the genes to keep only those marginally related to the outcome (presence of Alzheimer's); second, we implement a sparse sufficient dimension reduction to retain only predictors relevant to the outcome; lastly, we apply hierarchical clustering to group genes that exhibit mutual dependence. We adapt this methodology from Adragni et al. and expand on their work by optimizing the existing R code with parallel capabilities to improve speed. Thus, our results reflect both an analysis of the microarray data and a performance study of the modified code.
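As a rough illustration of the first and third steps only, the following sketch screens synthetic genes by a marginal statistic and then clusters the retained genes. It is not the authors' R code. The Welch t-statistic as the screening criterion, the 1 - |r| correlation distance, average linkage, the synthetic data, and all function names are assumptions made for this example. The sparse sufficient dimension reduction step is not reproduced here.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch two-sample t-statistic for two lists of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def screen_genes(expr, labels, keep):
    """Screening stand-in: keep the `keep` genes with the largest |t|.

    expr   -- list of per-gene lists, one value per subject
    labels -- 1 for case, 0 for control, one per subject
    """
    scores = []
    for g, row in enumerate(expr):
        cases = [x for x, y in zip(row, labels) if y == 1]
        controls = [x for x, y in zip(row, labels) if y == 0]
        scores.append((abs(welch_t(cases, controls)), g))
    return sorted(g for _, g in sorted(scores, reverse=True)[:keep])


def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    num = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    den = math.sqrt(sum((x - mu) ** 2 for x in u) * sum((y - mv) ** 2 for y in v))
    return num / den


def cluster_genes(expr, genes, n_clusters):
    """Clustering stand-in: average-linkage agglomeration on 1 - |r|."""
    dist = {(i, j): 1 - abs(correlation(expr[i], expr[j]))
            for i in genes for j in genes if i < j}
    clusters = [[g] for g in genes]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters until n_clusters remain.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]),
        )
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Synthetic data (hypothetical): 50 genes, 10 cases and 10 controls.
random.seed(0)
n_cases, n_controls, n_genes = 10, 10, 50
labels = [1] * n_cases + [0] * n_controls
expr = [[random.gauss(0, 1) for _ in labels] for _ in range(n_genes)]
# Genes 0-3 are shifted upward in cases to mimic differential expression.
for g in range(4):
    for s in range(n_cases):
        expr[g][s] += 5.0

kept = screen_genes(expr, labels, keep=6)
groups = cluster_genes(expr, kept, n_clusters=2)
print("retained genes:", kept)
print("clusters:", groups)
```

Because each gene is scored independently in the screening loop, that step is a natural candidate for the kind of parallel execution the abstract mentions, although how the authors parallelized their R code is not specified here.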