A GROUP SEQUENTIAL MULTIPLE TESTING METHOD AND ITS APPLICATION TO GENOMIC DATA

dc.contributor.advisor: Baek, Seungchul
dc.contributor.advisor: Park, Junyong
dc.contributor.author: Kim, Yewon
dc.contributor.department: Mathematics and Statistics
dc.contributor.program: Statistics
dc.date.accessioned: 2022-09-29T15:38:21Z
dc.date.available: 2022-09-29T15:38:21Z
dc.date.issued: 2022-01-01
dc.description.abstract: In this dissertation, we consider the simultaneous testing of groups and of hypotheses within the groups, a problem that arises in many scientific settings. A group is commonly judged to be significant if at least one hypothesis within the group is significant, which is implemented via a global test of the complete null hypothesis. However, this null hypothesis for group significance is strict, so all groups tend to be rejected, especially when the number of hypotheses within a group is large. To avoid such trivial testing results, we introduce the concept of a margin to multiple testing problems so that the level of significance required of a group can be adjusted. Based on this idea, we propose a group sequential multiple testing method that controls the false discovery rate (FDR) and incorporates the margin for group significance. As a first real-data application, we apply the proposed method to functional groups of single nucleotide polymorphisms (SNPs). We select significantly associated pairs of genome-wide association study (GWAS) summary statistics and linkage disequilibrium (LD) scores. We further investigate additional local associations within haplotype blocks, whereas existing methods such as LD score regression (LDSC) use all SNPs. Our findings offer additional perspectives on the associations between the summary statistics and LD scores, such as Simpson's paradox. In the second real-data application, we consider non-coding GWAS SNPs in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). By partitioning the GWAS SNPs for type 2 diabetes into DHS groups, we apply the proposed method to detect DHS groups that are statistically associated with type 2 diabetes. Each of the 32 DHS groups represents a unique organ. The group related to the pancreas is detected as significant even with a large margin, a finding consistent with intuition and with published articles.
dc.format: application:pdf
dc.genre: dissertations
dc.identifier: doi:10.13016/m2pzbm-mzlk
dc.identifier.other: 12537
dc.identifier.uri: http://hdl.handle.net/11603/26038
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Mathematics and Statistics Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Kim_umbc_0434D_12537.pdf
dc.subject: Combining p-values
dc.subject: False discovery rate
dc.subject: genome-wide association study
dc.subject: linkage disequilibrium score
dc.subject: Testing with margin
dc.title: A GROUP SEQUENTIAL MULTIPLE TESTING METHOD AND ITS APPLICATION TO GENOMIC DATA
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
dcterms.accessRights: Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle
Name: Kim_umbc_0434D_12537.pdf
Size: 2.01 MB
Format: Adobe Portable Document Format
License bundle
Name: Kim-Yewon_Open.pdf
Size: 15.08 MB
Format: Adobe Portable Document Format