Adaptive local false discovery rate procedures for highly spiky data and their application RNA sequencing data of yeast SET4 deletion mutants

Date

2021-07-28

Citation of Original Publication

Ramos, M. L., Park, D., Lim, J., Park, J., Tran, K., Garcia, E. J., & Green, E. (2021). Adaptive local false discovery rate procedures for highly spiky data and their application RNA sequencing data of yeast SET4 deletion mutants. Biometrical Journal, 63, 1729–1744. https://doi.org/10.1002/bimj.202000256

Rights

This is the peer reviewed version of the following article: Ramos, M. L., Park, D., Lim, J., Park, J., Tran, K., Garcia, E. J., & Green, E. (2021). Adaptive local false discovery rate procedures for highly spiky data and their application RNA sequencing data of yeast SET4 deletion mutants. Biometrical Journal, 63, 1729–1744. https://doi.org/10.1002/bimj.202000256, which has been published in final form at https://doi.org/10.1002/bimj.202000256. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. This article may not be enhanced, enriched or otherwise transformed into a derivative work, without express permission from Wiley or by statutory rights under applicable legislation. Copyright notices must not be removed, obscured or modified. The article must be linked to Wiley’s version of record on Wiley Online Library and any embedding, framing or otherwise making available the article or pages thereof by third parties from platforms, services and websites other than Wiley Online Library must be prohibited.

Abstract

Chromatin dynamics are central to the regulation of gene expression and genome stability. To better understand the factors that regulate chromatin dynamics, the genes encoding these factors are deleted and the resulting differential gene expression profiles are determined using approaches such as RNA sequencing. Here, we analyze a gene expression dataset aimed at uncovering the function of the relatively uncharacterized chromatin regulator Set4 in the model system Saccharomyces cerevisiae (budding yeast). This paper focuses on identifying the highly differentially expressed genes in cells deleted for Set4 (referred to as the Set4∆ mutant dataset) compared with wild-type yeast cells. The log fold changes of gene expression in the Set4∆ mutant data have a spiky distribution, and it is reasonable to assume that the log fold changes of genes that are not highly differentially expressed come from a mixture of two normal distributions. We propose an adaptive local false discovery rate (FDR) procedure that estimates the null distribution of the log fold changes empirically. We show numerically that, unlike existing approaches, our proposed method controls the FDR at the target level (0.05) and has competitive power in finding differentially expressed genes. Finally, we apply our procedure to the Set4∆ mutant dataset.
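
As a rough illustration of the idea in the abstract, the sketch below computes local FDR values under a two-component normal mixture null and then applies a step-up rule that rejects the hypotheses with the smallest local FDR while their running average stays at or below the target level. It is not the authors' procedure: the abstract does not give the estimation details, so all parameters here are assumed known rather than estimated empirically. The zero-mean null components, the alternative density, the mixing weights, the toy scores, and every function name are hypothetical choices made for illustration only.

```python
import math


def normal_pdf(z, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def null_density(z, w, sigma_spike, sigma_wide):
    """Assumed 'spiky' null: a mixture of a narrow and a wide zero-mean normal."""
    return w * normal_pdf(z, 0.0, sigma_spike) + (1.0 - w) * normal_pdf(z, 0.0, sigma_wide)


def local_fdr(z, pi0, f0, f1):
    """Local FDR: pi0 * f0(z) / f(z), where f = pi0 * f0 + (1 - pi0) * f1."""
    null_part = pi0 * f0(z)
    return null_part / (null_part + (1.0 - pi0) * f1(z))


def adaptive_reject(lfdr_values, alpha=0.05):
    """Reject the k smallest local FDR values, with k the largest count
    whose running average of sorted local FDR values is at most alpha."""
    order = sorted(range(len(lfdr_values)), key=lambda i: lfdr_values[i])
    running_sum = 0.0
    k = 0
    for rank, i in enumerate(order, start=1):
        running_sum += lfdr_values[i]
        if running_sum / rank <= alpha:
            k = rank
    return sorted(order[:k])


# Hypothetical parameters (assumed, not taken from the paper).
f0 = lambda z: null_density(z, w=0.7, sigma_spike=0.1, sigma_wide=0.5)
f1 = lambda z: 0.5 * normal_pdf(z, 3.0, 1.0) + 0.5 * normal_pdf(z, -3.0, 1.0)

# Toy log fold changes: four near zero, two far in the tails.
z_scores = [0.0, 0.05, -0.3, 0.8, 3.5, -4.0]
lfdr = [local_fdr(z, 0.9, f0, f1) for z in z_scores]
rejected = adaptive_reject(lfdr, alpha=0.05)
print(rejected)
```

With these toy inputs, only the two scores far in the tails have local FDR values small enough to be rejected. In practice the null mixture and the marginal density would have to be estimated from the data, which is the part the paper's procedure addresses.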