This book establishes the theoretical foundations of a general methodology for multiple hypothesis testing and discusses its software implementation in R and SAS. The methods are applied to a range of testing problems in biomedical and genomic research, including the identification of differentially expressed and co-expressed genes in high-throughput gene expression experiments, such as microarray experiments; tests of association between gene expression measures and biological annotation metadata (e.g., Gene Ontology); sequence analysis; and the genetic mapping of complex traits using single nucleotide polymorphisms.
The book is aimed at both statisticians interested in multiple testing theory and applied scientists encountering high-dimensional testing problems in their subject matter area. Specifically, the book proposes resampling-based single-step and stepwise multiple testing procedures for controlling a broad class of Type I error rates, defined as tail probabilities and expected values for arbitrary functions of the numbers of Type I errors and rejected hypotheses (e.g., false discovery rate). Unlike existing approaches, the procedures are based on a joint null distribution for the test statistics and provide Type I error control in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics. The multiple testing results are reported in terms of rejection regions, parameter confidence regions, and adjusted p-values.
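To make the idea of single-step adjusted p-values computed from a joint null distribution concrete, here is a minimal sketch in Python. It is not the book's procedure or its R/SAS implementation. It assumes that draws from a joint null distribution of the test statistics are already available (for example, from a resampling scheme) and applies the standard single-step maxT rule to two-sided statistics. The function names `single_step_maxt_adjusted_pvalues` and `reject` are hypothetical, and the demo data are invented purely for illustration.

```python
import random


def single_step_maxt_adjusted_pvalues(observed, null_draws):
    """Single-step maxT adjusted p-values (illustrative sketch).

    observed   : list of m observed test statistics.
    null_draws : list of B rows, each a list of m statistics drawn from an
                 assumed joint null distribution (e.g. obtained by resampling).
    Returns a list of m adjusted p-values: for hypothesis j, the proportion
    of null draws whose maximum absolute statistic is at least |observed[j]|.
    """
    if not null_draws:
        raise ValueError("null_draws must contain at least one row")
    if any(len(row) != len(observed) for row in null_draws):
        raise ValueError("each null draw must have one entry per hypothesis")
    # Maximum absolute statistic within each joint null draw; taking the
    # maximum over hypotheses is what preserves the dependence structure.
    max_null = [max(abs(z) for z in row) for row in null_draws]
    n_draws = len(max_null)
    return [sum(mx >= abs(t) for mx in max_null) / n_draws for t in observed]


def reject(adjusted_pvalues, alpha=0.05):
    """Indices of hypotheses whose adjusted p-value is at most alpha."""
    return [j for j, p in enumerate(adjusted_pvalues) if p <= alpha]


if __name__ == "__main__":
    # Hypothetical demo: independent standard normal null draws stand in
    # for a resampling-based joint null distribution.
    rng = random.Random(0)
    m, n_draws = 5, 2000
    null_draws = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n_draws)]
    observed = [4.5, 0.3, -0.8, 3.9, 1.1]
    adjusted = single_step_maxt_adjusted_pvalues(observed, null_draws)
    print("adjusted p-values:", adjusted)
    print("rejected at 0.05:", reject(adjusted))
```

Rejecting every hypothesis whose adjusted p-value is at most alpha is aimed at controlling the family-wise error rate at level alpha, provided the supplied null draws are appropriate. How such a null distribution should be constructed is the subject of the book and is not addressed by this sketch.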
Multiple hypothesis testing.- Test statistics null distribution.- Overview of multiple testing procedures.- Single-step multiple testing procedures for controlling general Type I error rates.- Step-down multiple testing procedures for controlling the family-wise error rate.- Augmentation multiple testing procedures for controlling generalized tail probability error rates.- Resampling-based empirical Bayes multiple testing procedures for controlling generalized tail probability error rates.- Simulation studies: Assessment of test statistics null distributions.- Identification of differentially expressed and co-expressed genes in high-throughput gene expression experiments.- Multiple tests of association with biological annotation metadata.- HIV-1 sequence variation and viral replication capacity.- Genetic mapping of complex human traits using single nucleotide polymorphisms: The ObeLinks Project.- Software implementation.
From the reviews: "This book summarizes the recent work of Sandrine Dudoit and Mark van der Laan on multiple testing. It proposes a general framework for multiple testing procedures (MTPs) and introduces new concepts ... . The authors also provide code for reproducing the results of some of the applications. ... if one is looking for a detailed summary of the latest developments in multiple testing regarding MTPs or in the application of MTPs to biomedical and genomic data, then this book is an excellent reference." (Holger Schwender, Statistical Papers, Vol. 50, 2009) "In the last decade a growing amount of statistical research has been devoted to multiple testing. This book summarizes the recent work on this area. ... very useful for the applied researcher who would like to understand how to apply multiple testing. ... a good reference for statisticians interested in a general treatment of multiple testing." (Avner Bar-Hen, Mathematical Reviews, Issue 2009 j)