Statistical Learning: 13.2 Introduction to Multiple Testing and Family Wise Error Rate | Summary and Q&A
TL;DR
Testing multiple hypotheses can lead to an increased chance of making Type I errors, which can be problematic in situations with a large number of tests.
Key Insights
- 🏆 Testing multiple hypotheses becomes more challenging as the number of tests increases, leading to a higher chance of Type I errors.
- 🏆 Rejection of null hypotheses based on p-values below a threshold can result in numerous false positives in situations with a large number of tests.
- ☠️ Controlling the family-wise error rate, defined as the probability of making at least one Type I error, becomes increasingly difficult with a higher number of hypothesis tests.
- 💉 The relationship between p-values and Type I errors highlights the need for adjusting statistical analyses to account for multiple testing and avoid false positives.
- 🪙 The thought experiment with flipping coins illustrates how even with fair coins, multiple tests can lead to false rejections of null hypotheses.
- 🌥️ Reproducibility issues in scientific studies can be influenced by the large number of hypothesis tests conducted without proper correction for multiple testing.
- 🎮 The family-wise error rate gives a single quantity to control across all tests, but keeping it small becomes harder as the number of tests increases.
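
To make the growth of the family-wise error rate concrete, here is a minimal sketch. It assumes the m tests are independent, that every null hypothesis is true, and that each test is run at the same level alpha; under those assumptions the probability of at least one Type I error is 1 - (1 - alpha)^m. The function name `fwer_independent` and the choice alpha = 0.01 are illustrative only.

```python
def fwer_independent(m: int, alpha: float) -> float:
    """Probability of at least one Type I error across m tests.

    Assumes the tests are independent, all null hypotheses are true,
    and each test rejects when its p-value is below alpha.
    """
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    alpha = 0.01  # illustrative per-test threshold
    for m in (1, 10, 100, 1000):
        # The family-wise error rate climbs towards 1 as m grows.
        print(f"m={m:>5}  FWER={fwer_independent(m, alpha):.4f}")
```

With dependent tests the exact value differs, so this is best read as an illustration of the trend rather than a general formula.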
Transcript
Now we're going to move on to the topic for this chapter, which is dealing with multiple hypothesis tests. So now the situation is that we don't have just one hypothesis, or one null hypothesis, we want to test, but m different ones. And as I mentioned at the start, we've had this sort of concept in statistics forever, but the typical value for m was very ...
Questions & Answers
Q: Why does testing multiple hypotheses become more complicated with a large number of tests?
When there are thousands or tens of thousands of hypothesis tests, the chances of making Type I errors increase significantly. The methods used for adjusting in situations with a small number of tests may not work effectively in these cases.
Q: How do p-values play a role in determining whether to reject a null hypothesis?
If the p-value falls below a chosen threshold, for example 0.01 (1 percent), the null hypothesis is rejected. However, when many tests are run, applying this rule to every test produces false positives: roughly that same fraction of the true null hypotheses will be rejected by chance alone.
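
The effect of a fixed threshold across many tests can be sketched numerically. The example below assumes every null hypothesis is true and that p-values under the null are uniform on [0, 1] (which holds for continuous test statistics). The values m = 10,000 and alpha = 0.01, and both function names, are illustrative choices rather than anything prescribed by the lecture.

```python
import random


def expected_false_positives(m: int, alpha: float) -> float:
    """Expected number of false rejections when all m nulls are true."""
    return m * alpha


def simulate_false_positives(m: int, alpha: float, seed: int = 0) -> int:
    """Draw m null p-values (assumed Uniform(0, 1)) and count rejections."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return sum(rng.random() < alpha for _ in range(m))


if __name__ == "__main__":
    m, alpha = 10_000, 0.01
    print("expected :", expected_false_positives(m, alpha))
    print("simulated:", simulate_false_positives(m, alpha))
```

The simulated count varies with the seed but should land near the expected value, which shows that a seemingly strict cutoff still lets through many false positives when m is large.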
Q: What is the significance of the thought experiment involving flipping coins?
The thought experiment illustrates p-values and Type I errors. With 1024 fair coins each flipped 10 times, we would expect about one coin to come up all tails. That coin's p-value is below 0.002, so testing it in isolation would lead to a false rejection of the null hypothesis that the coin is fair.
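
The arithmetic behind the coin experiment can be checked directly. This sketch assumes independent fair coins and a two-sided p-value (ten heads counts as just as extreme as ten tails), which is one way to arrive at the 0.002 figure; the variable names are illustrative.

```python
from fractions import Fraction

N_COINS = 1024
N_FLIPS = 10

# Probability that one fair coin lands tails on all 10 flips.
p_all_tails = Fraction(1, 2) ** N_FLIPS

# Two-sided p-value for an all-tails coin: all tails or all heads.
two_sided_p = 2 * p_all_tails

# Expected number of all-tails coins among the 1024.
expected_all_tails = N_COINS * p_all_tails

# Probability that at least one of the 1024 coins comes up all tails.
prob_at_least_one = 1 - float(1 - p_all_tails) ** N_COINS

if __name__ == "__main__":
    print("P(all tails)           =", p_all_tails)
    print("two-sided p-value      =", float(two_sided_p))
    print("expected all-tails     =", expected_all_tails)
    print("P(at least one such)   =", round(prob_at_least_one, 3))
```

So an all-tails coin is expected on average and appears in well over half of such experiments, even though every coin is fair.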
Q: How does the concept of multiple testing relate to reproducibility issues?
Performing a large number of hypothesis tests increases the likelihood of finding false positives. This can lead to exaggerated claims and difficulties in reproducing the results, contributing to reproducibility issues.
Summary & Key Takeaways
- Testing multiple hypotheses becomes more challenging when dealing with a large number of tests.
- Rejecting all null hypotheses with p-values below a certain threshold can result in a significant number of Type I errors.
- Performing a large number of hypothesis tests increases the likelihood of false positives, even with a small p-value cutoff.