LEARNING CURVE
Year : 2019  Volume : 41  Issue : 1  Page : 99-100

Multiple testing and protection against a type 1 (false positive) error using the Bonferroni and Hochberg corrections

Chittaranjan Andrade
Department of Psychopharmacology, National Institute of Mental Health and Neurosciences, Bangalore, Karnataka, India

In a given study, if many related outcomes are tested for statistical significance, one or more outcomes may emerge significant at the P < 0.05 level not because they are truly significant in the population but because of chance. The larger the number of statistical tests performed, the greater the risk that some of the significant findings are significant because of chance. There are many ways to protect against such false positive or Type 1 errors. The simplest way is to set a more stringent threshold for statistical significance than P < 0.05. This can be done using either the Bonferroni or the Hochberg correction. Using the Bonferroni correction, 0.05 is divided by the number of statistical tests being performed and the result is set as the critical P value for statistical significance. Using the Hochberg correction, the P values obtained from the different statistical tests are arranged in descending order of magnitude, and each P value is assessed for significance against progressively more stringent levels for significance. The Bonferroni and Hochberg procedures are explained with the help of examples.
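To make the growth in risk concrete, the short Python sketch below computes the probability of at least one false positive across a family of tests. It is an illustration only and is not taken from the article: the function name `familywise_error_rate` is hypothetical, and the calculation assumes that the tests are statistically independent and that every null hypothesis is true, which related outcomes in a real study will seldom satisfy exactly.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among `num_tests` tests.

    Assumes independent tests and that all null hypotheses are true
    (an idealization used here purely for illustration).
    """
    # Chance that a single test avoids a false positive is (1 - alpha);
    # chance that all tests avoid one is (1 - alpha) ** num_tests.
    return 1 - (1 - alpha) ** num_tests


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        print(f"{m:2d} tests: risk of at least one false positive = "
              f"{familywise_error_rate(m):.3f}")
```

Under these assumptions, the risk is roughly 0.10 with two tests and roughly 0.23 with five tests, compared with 0.05 for a single test.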
The Bonferroni Correction

With this method, the value of P for statistical significance (conventionally, 0.05) is divided by the number of statistical tests performed. So, for the negative symptom outcomes, because there are two tests (one for PANSS-N and one for SANS), P for statistical significance is set at 0.05/2 or 0.025. This means that the outcomes for PANSS-N and SANS will be considered significant only if the P values associated with these tests are <0.025 rather than the conventional <0.05. With regard to the cognitive outcomes, because there are five tests, for any of the five outcomes to be considered statistically significant, it should result in a P value that is <0.05/5; that is, <0.01.

The Bonferroni correction is considered conservative; that is, it makes it quite difficult to obtain statistically significant results. This is because when the number of tests performed is large, the P value required for statistical significance becomes quite small and is hard to achieve. In other words, the Bonferroni correction magnifies the risk of a false negative or Type 2 statistical error.[1] The Hochberg sequential procedure offers a better balance between the Type 1 and Type 2 error risks.

The Hochberg Sequential Procedure

With this method, after the groups are compared on each of the five cognitive outcomes, the P values obtained are arranged in descending order of magnitude. If the outcome with the largest P value is significant at the 0.05 level (i.e., P < 0.05), then all the outcomes are considered significant. If the largest P value is not <0.05, then the second P value is examined; if the second P value is <0.05/2 (that is, <0.025), then this outcome and all the outcomes with smaller P values are considered significant. If the second P value is not <0.025, then the third P value is examined; if the third P value is <0.05/3 (that is, approximately <0.017), then this outcome and all the outcomes with smaller P values are considered significant; and so on.
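The two procedures described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than code from the article: the function names `bonferroni` and `hochberg` are hypothetical, the five P values are invented purely for demonstration, and the strict "<" comparison follows the wording used in the text (many statistical packages use "less than or equal to" instead).

```python
from typing import List


def bonferroni(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag each P value as significant if it is below alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Hochberg sequential procedure as described in the text.

    P values are examined from largest to smallest. The k-th largest is
    compared with alpha / k; the first one that passes makes itself and
    all smaller P values significant.
    """
    # Indices of the P values, ordered from largest to smallest P value.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i], reverse=True)
    significant = [False] * len(p_values)
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] < alpha / rank:
            # This outcome and every outcome with a smaller P value pass.
            for j in order[rank - 1:]:
                significant[j] = True
            break
    return significant


if __name__ == "__main__":
    # Hypothetical P values for five cognitive outcomes (illustration only).
    example = [0.30, 0.04, 0.012, 0.011, 0.004]
    print("Bonferroni:", bonferroni(example))
    print("Hochberg:  ", hochberg(example))
```

With these invented values, the Bonferroni threshold is 0.01 and only the outcome with P = 0.004 is significant. The Hochberg procedure stops at the third largest P value (0.012, which is below 0.05/3) and therefore treats the three smallest P values as significant, which illustrates why it is the less conservative of the two.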
For the negative symptom outcomes, if the larger of the two P values is <0.05, then both outcomes are considered significant. If the larger value is not <0.05, the second P value will be considered significant only if it is <0.05/2; that is, <0.025.

Effectively, the Hochberg sequential procedure applies progressively more stringent criteria for statistical significance, and the last P value is examined at the Bonferroni correction level if the previous P values were not significant on Hochberg testing.

Notes

1. Corrections for a Type 1 statistical error are necessary only when many tests of the same construct (e.g., cognition) are conducted. Correction is generally considered unnecessary if different tests examine different constructs (e.g., psychosis, memory, and extrapyramidal symptoms). However, in such a context, the issue of primary outcome vs secondary outcomes must be considered.[3]
2. Avoidance of a Type 1 error is desirable in confirmatory studies but may be dispensed with in exploratory studies where authors do not wish to miss a potentially significant outcome.
3. Sometimes, authors may set an arbitrarily conservative P value (e.g., P < 0.01) for all tests to modestly protect against a Type 1 error.[4]

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

References


