Familywise error rate

In statistics, the familywise error rate (FWER) is the probability of making one or more false discoveries, or Type I errors, among all the hypotheses when performing multiple hypothesis tests [1][2].

Classification of m hypothesis tests

The following table defines some random variables related to the m hypothesis tests.

                              # declared non-significant   # declared significant   Total
# true null hypotheses        U                            V                        m0
# non-true null hypotheses    T                            S                        m - m0
Total                         m - R                        R                        m

The m specific hypotheses of interest are assumed to be known, but the number of true null hypotheses, m0, and the number of true alternative hypotheses, m1 = m - m0, are unknown. V is the number of Type I errors (hypotheses declared significant when the null hypothesis is actually true). T is the number of Type II errors (hypotheses declared non-significant when the alternative hypothesis is actually true). R is an observable random variable, while S, T, U, and V are unobservable random variables.
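The cell counts in the table can be illustrated with a small sketch. It assumes, purely for illustration, that the truth of each null hypothesis is known (in practice it is not, which is why U, V, T, and S are unobservable); the function name `classify_tests` and the example data are hypothetical.

```python
from collections import namedtuple

# Cells of the classification table; R = V + S is the only observable count.
Counts = namedtuple("Counts", ["U", "V", "T", "S", "R"])


def classify_tests(null_is_true, declared_significant):
    """Tally m test outcomes into the cells U, V, T, S and the total R."""
    if len(null_is_true) != len(declared_significant):
        raise ValueError("both sequences must have one entry per hypothesis")
    U = V = T = S = 0
    for is_null, rejected in zip(null_is_true, declared_significant):
        if is_null and not rejected:
            U += 1  # true null, correctly declared non-significant
        elif is_null and rejected:
            V += 1  # true null, declared significant: Type I error
        elif not is_null and not rejected:
            T += 1  # false null, declared non-significant: Type II error
        else:
            S += 1  # false null, correctly declared significant
    return Counts(U, V, T, S, R=V + S)


# Hypothetical example with m = 6 hypotheses, of which m0 = 4 nulls are true.
null_is_true = [True, True, True, True, False, False]
declared_significant = [False, True, False, False, True, False]
counts = classify_tests(null_is_true, declared_significant)
print(counts)
```

In this made-up example V = 1, so at least one Type I error occurred in the family.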

In terms of random variables,

 \mathrm{FWER} = \Pr(V \ge 1),

or equivalently,

 \mathrm{FWER} = 1 -\Pr(V = 0).
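As a rough illustration of the second form, the sketch below assumes that all m null hypotheses are true and that the tests are independent, each run at level alpha. Under those assumptions (which the definition above does not require) Pr(V = 0) = (1 - alpha)^m. The function name `fwer_independent` is hypothetical.

```python
def fwer_independent(alpha, m):
    """FWER = 1 - Pr(V = 0) for m independent tests of true nulls at level alpha.

    Assumption: independence and all nulls true, so Pr(V = 0) = (1 - alpha) ** m.
    """
    return 1.0 - (1.0 - alpha) ** m


# The familywise error rate grows quickly with the number of uncorrected tests.
for m in (1, 3, 9, 20):
    print(m, round(fwer_independent(0.05, m), 4))
```

With dependent tests the value can differ, but it never exceeds the Bonferroni bound m * alpha.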

What constitutes a family?

In confirmatory studies (i.e., where one specifies a finite number of a priori inferences), families of hypotheses are defined by which conclusions need to be jointly accurate or by which hypotheses are similar in content or purpose. As noted by Hochberg and Tamhane (1987), "If these inferences are unrelated in terms of their content or intended use (although they may be statistically dependent), then they should be treated separately and not jointly" (p. 6).

For example, one might conduct a randomized clinical trial for a new antidepressant drug using three groups: existing drug, new drug, and placebo. In such a design, one might be interested in whether depressive symptoms (measured, for example, by a Beck Depression Inventory score) decreased to a greater extent for those using the new drug compared to the old drug. Further, one might be interested in whether any side effects (e.g., hypersomnia, decreased sex drive, and dry mouth) were observed. In such a case, two families would likely be identified: 1) effect of drug on depressive symptoms, 2) occurrence of any side effects.

Thus, one would assign an acceptable Type I error rate, alpha (usually .05), to each family and control the familywise error rate using an appropriate multiple comparison procedure. In the first family, the effect of the antidepressant on depressive symptoms, the pairwise comparisons among groups (here, three possible comparisons) would be jointly controlled using a technique such as Tukey's Honestly Significant Difference (HSD) procedure or a Bonferroni correction.

For the side effect profile, one would likely want to control Type I error across all side effects considered jointly, so that the chance of a false finding about the side effect profile is not inflated by allowing each side effect and each pairwise comparison among groups its own uncorrected alpha. By the Bonferroni inequality, giving each side effect and comparison its own alpha could result in a familywise Type I error rate as high as .05 * 3 side effects * 3 pairwise comparisons per side effect = .45 (i.e., up to a 45% chance of making at least one Type I error). Thus, a more appropriate control for the side effect family might divide alpha by three (.05/3 = .0167) and allocate .0167 to each side effect's multiple comparison procedure. In the case of Tukey's HSD (a multiple comparison procedure with strong control of the familywise error rate), one would determine the critical value of Q, the studentized range statistic, based on an alpha of .0167.
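The arithmetic in this example can be checked with a short sketch. The group and side effect labels come from the example above; the variable names are hypothetical, and the fully corrected per-test level in the last line is an extra illustration of a plain Bonferroni correction rather than the allocation described in the text.

```python
from itertools import combinations

groups = ["existing drug", "new drug", "placebo"]
side_effects = ["hypersomnia", "decreased sex drive", "dry mouth"]
alpha = 0.05

# Every unordered pair of groups is one pairwise comparison.
pairs = list(combinations(groups, 2))
n_tests = len(side_effects) * len(pairs)

# Bonferroni inequality: the FWER is at most the sum of the per-test levels.
uncorrected_bound = min(1.0, alpha * n_tests)

# Allocation described in the text: split alpha across the side effects, then
# run one multiple comparison procedure per side effect at that level.
alpha_per_side_effect = alpha / len(side_effects)

# Illustration only: a plain Bonferroni level if all tests were run separately.
bonferroni_per_test = alpha / n_tests

print(len(pairs), "pairwise comparisons per side effect")
print(n_tests, "tests in the side effect family")
print("upper bound on uncorrected FWER:", round(uncorrected_bound, 2))
print("alpha per side effect:", round(alpha_per_side_effect, 4))
print("Bonferroni alpha per test:", round(bonferroni_per_test, 4))
```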

References

  • Hochberg, Y., & Tamhane, A. C. (1987). Multiple comparison procedures. New York: Wiley.
  1. Shaffer, J. P. (January 1995). "Multiple hypothesis testing". Annual Review of Psychology 46, 561–584. http://dx.doi.org/10.1146/annurev.ps.46.020195.003021
  2. Benjamini, Y., & Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300.
