Subgroup analyses: how to avoid being misled
BMJ 2007;335 doi: https://doi.org/10.1136/bmj.39265.596262.AD (Published 12 July 2007) Cite this as: BMJ 2007;335:96
- John Fletcher, clinical epidemiologist
- BMJ, London WC1H 9JR
Subgroup analyses are regarded with some suspicion because they can be misleading and less reliable than analyses based on all the people included in the research design. This is a wise precaution when the comparison was not planned at the outset. But when subgroups are described in the protocol of the trial or review along with a stated hypothesis, these secondary analyses may be used to show true differences in effect or to illustrate applicability across patient subgroups. Three recently published BMJ papers, including one in this issue, provide examples of each of these types of subgroup analysis.1 2 3
In a trial that set out to examine the effect on birth weight of reduced caffeine intake during pregnancy, the overall analysis found little effect.1 The difference in birth weight between the women who had drunk caffeinated coffee and those who had drunk decaffeinated coffee was 16 g (95% confidence interval −40 g to 73 g).
However, a clinically important difference in birth weight of 263 g (97 g to 430 g) between the two groups was seen in women who smoked more than 10 cigarettes a day. This poses a problem for readers who need to judge whether babies born to women who both smoke and drink caffeinated coffee will have lower birth weight.
During a clinical trial it is usual to collect detailed information on patient characteristics as well as the specific outcome measures for the trial. This gives rise to the possibility of researchers performing many separate analyses in the hope that “something will turn up” that has a P value lower than 0.05. This approach to …
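The arithmetic behind the "something will turn up" concern can be sketched briefly. The example below is illustrative only and is not taken from the article: it assumes that the separate analyses are statistically independent and that no true effect exists in any of them, and the function names are hypothetical. Under those assumptions, the chance that at least one of n analyses gives a P value below a threshold alpha is 1 − (1 − alpha)^n.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent analyses gives
    P < alpha when there is no true effect in any of them.

    Assumption (not from the article): the analyses are independent.
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-analysis P value threshold that keeps the overall chance of any
    false positive at or below alpha (Bonferroni correction)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    # Print the familywise false positive chance for a few numbers of analyses.
    for n in (1, 5, 10, 20):
        chance = prob_at_least_one_false_positive(n)
        print(f"{n:>2} analyses: {chance:.0%} chance of at least one P < 0.05; "
              f"Bonferroni threshold {bonferroni_threshold(n):.4f}")
```

With a threshold of 0.05, the chance of at least one spurious "significant" result is about 23% for 5 analyses, 40% for 10, and 64% for 20. Real subgroup analyses within one trial are seldom fully independent, so these figures are a rough guide rather than an exact calculation.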