# Interaction revisited: the difference between two estimates

BMJ 2003; 326 doi: https://doi.org/10.1136/bmj.326.7382.219 (Published 25 January 2003) Cite this as: BMJ 2003;326:219

## All rapid responses

*The BMJ* reserves the right to remove responses which are being wilfully misrepresented as published articles.

Altman and Bland [1] provide a method for determining whether an observed difference between the proportionate reductions in risk an intervention achieves for two different groups is statistically significant. They use as an example a situation where a meta-analysis of studies of the effect of hormone replacement therapy on non-vertebral fractures showed a larger proportionate reduction in risk among women under age 60 (RR = .67) than among women over age 59 (RR = .88).[2]

But in keeping with the standard approach to identifying interaction or subgroup effects on the basis of different proportionate changes in base outcome rates, the authors failed to consider the implications of the different base rates for the two age groups. For reasons inherent in the shapes of normal risk distributions, a factor that reduces the risk of an outcome will tend to reduce it to a larger proportionate degree in the group with the lower base rate while increasing the opposite outcome to a larger proportionate degree in the other group.[3-7] The pattern is most easily illustrated in test score data, which show how lowering a cutoff, or improving performance, will tend to cause a larger proportionate decrease in the failure rate of the group with the lower base failure rate while causing a larger proportionate increase in the pass rate of the other group. While neither the data discussed by Altman and Bland, nor the data in the study from which Altman and Bland derived their figures, include the base fracture rates for different age groups, presumably such rates are lower among the younger group. The absence of data on base rates for the different age groups also precludes determining whether, consistent with the distributionally-based tendencies, the intervention increased the rates of avoiding fractures proportionately more in the older group. In any case, when groups have different base rates, risk ratios that differ in accordance with the above-described patterns should not be regarded as reflecting meaningful interactive effects even when the difference between the risk ratios is statistically significant.
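The test score pattern described above can be sketched numerically. The following is a minimal illustration, not taken from the cited references: it assumes two hypothetical groups whose scores are normally distributed with equal spread and means half a standard deviation apart, and it assumes the cutoff is lowered by half a standard deviation. All of these values are illustrative assumptions.

```python
from statistics import NormalDist

# Hypothetical groups (assumed values, not from the source): normally
# distributed scores, equal spread, means half a standard deviation apart.
higher_scoring = NormalDist(mu=0.5, sigma=1.0)
lower_scoring = NormalDist(mu=0.0, sigma=1.0)

# Assumed cutoffs: the pass mark is lowered from 0.0 to -0.5.
old_cutoff, new_cutoff = 0.0, -0.5


def fail_and_pass(dist, cutoff):
    """Return (failure rate, pass rate) for a score distribution and cutoff."""
    fail = dist.cdf(cutoff)
    return fail, 1.0 - fail


results = {}
for name, dist in [("higher-scoring", higher_scoring),
                   ("lower-scoring", lower_scoring)]:
    fail_old, pass_old = fail_and_pass(dist, old_cutoff)
    fail_new, pass_new = fail_and_pass(dist, new_cutoff)
    results[name] = {
        "base_fail": fail_old,
        "fail_ratio": fail_new / fail_old,  # below 1: failure rate falls
        "pass_ratio": pass_new / pass_old,  # above 1: pass rate rises
    }
    print(f"{name}: failure {fail_old:.3f} -> {fail_new:.3f} "
          f"(ratio {fail_new / fail_old:.3f}), "
          f"pass {pass_old:.3f} -> {pass_new:.3f} "
          f"(ratio {pass_new / pass_old:.3f})")
```

Under these assumptions the group with the lower base failure rate shows the larger proportionate drop in failure, while the other group shows the larger proportionate rise in passing, even though the cutoff moved by the same amount for both.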

As discussed in sources cited by Sackett,[8] there exists some evidence that interventions will tend to effect larger proportionate reductions among groups with higher base rates. Reference 7 discusses some of the possible reasons for the existence of such a pattern, assuming it exists, notwithstanding the distributionally-based tendencies that militate in the opposite direction. These include the pattern of regression toward the mean addressed in a 1996 article here by Sharp et al.[8] (though I am uncertain of its role when the groups with different base rates are defined by age or a like factor where each group would seem to have its own risk distribution). Such issues, however, go to whether it is possible to identify meaningful interactions even when different base rates are taken into account. They do not provide a basis to ignore the implications of differing base rates or the fundamental problem with the risk ratio as a measure of association. That problem is most succinctly illustrated when the risk ratio yields opposite interpretations as to the size of two effects depending on whether one examines one outcome or its opposite. But the problem is ever present.

In order to identify meaningful interactions, or for that matter to quantify any effect size involving a dichotomy, one must employ a measure that is unaffected by different base rates, such as that discussed in references 5-7. There would, of course, remain issues as to whether observed differences between effect sizes are statistically significant, and it may be necessary to devise techniques for addressing such issues.

References:

1. Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ 2003;326:219.

2. Torgerson DJ, Bell-Syer SEM. Hormone replacement therapy and prevention of non-vertebral fractures. A meta-analysis of randomized trials. JAMA 2001;285:2891-97.

3. Scanlan JP. Race and mortality. Society 2000;37(2):19-35: http://www.jpscanlan.com/images/Race_and_Mortality.pdf (Accessed 20 Sept 2010).

4. Scanlan JP. Divining difference. Chance 1994;7(4):38-9,48: http://jpscanlan.com/images/Divining_Difference.pdf (Accessed 20 Sept 2010).

5. Scanlan JP. Interpreting Differential Effects in Light of Fundamental Statistical Tendencies, presented at 2009 Joint Statistical Meetings of the American Statistical Association, International Biometric Society, Institute for Mathematical Statistics, and Canadian Statistical Society, Washington, DC, 1-6 Aug. 2009: http://www.jpscanlan.com/images/Scanlan_JSM_2009.ppt (Accessed 20 Sept 2010).

6. Scanlan JP. Rethinking the premises of subgroup analyses. BMJ June 7, 2010 (responding to Sun X, Briel M, Walter SD, and Guyatt GH. Is a subgroup effect believable? Updating criteria to evaluate the credibility of subgroup analyses. BMJ 2010;340:850-854): http://www.bmj.com/content/340/bmj.c117.extract/reply#bmj_el_236744 (Accessed 20 Sept 2010).

7. Subgroup Effects sub-page of Scanlan's Rule page of jpscanlan.com: http://www.jpscanlan.com/scanlansrule/subgroupeffects.html (Accessed 20 Sept 2010).

8. Sharp SJ, Thompson SG, and Altman DG. The relation between treatment benefit and underlying risk in meta-analysis. BMJ 1996;313:735-738.

**Competing interests:** No competing interests

We want to know the best intervention to prevent disease, or, failing that, the best treatment for each disease. For this, we have to compare treatment effects derived from separate randomized controlled trials. In this regard, Altman and Bland (1) neatly explain the method for comparing two relative risks derived from two separate analyses. The authors, however, do not address the situation in which a relative risk is compared with an odds ratio derived from a separate analysis. It is a possibility worth considering, because the odds ratio has become widely used in randomized controlled trials.

When an odds ratio and a relative risk are derived from the same data, the odds ratio is always greater than the relative risk when both are greater than one, and less than the relative risk when both are less than one (2). Therefore, when a relative risk and an odds ratio derived from two separate analyses closely match, the two treatment effects may nevertheless be significantly different. In this respect, consider the results of two supposed randomized controlled trials aimed at decreasing postmenopausal breast cancer risk (table). Whereas the relative risk for postmenopausal breast cancer using treatment 1 is (346/2,392)/(1,050/2,170) = 0.30 (95% CI: 0.27-0.33), the odds ratio with treatment 2 is (400/2,168)/(875/1,346) = 0.28 (95% CI: 0.25-0.33), which is very similar to the relative risk with treatment 1. Apparently, then, the two treatments are similarly effective in decreasing postmenopausal breast cancer risk.
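The arithmetic in this paragraph can be checked with a short sketch. It assumes the usual large-sample (log-scale) standard errors for a relative risk and an odds ratio; the function names are illustrative, and the counts are those given in the table.

```python
import math

Z95 = 1.96  # normal quantile for a 95% confidence interval


def relative_risk(events_t, total_t, events_c, total_c):
    """Relative risk with a 95% CI from the log-scale standard error."""
    rr = (events_t / total_t) / (events_c / total_c)
    se = math.sqrt(1 / events_t - 1 / total_t + 1 / events_c - 1 / total_c)
    return rr, rr * math.exp(-Z95 * se), rr * math.exp(Z95 * se)


def odds_ratio(events_t, non_t, events_c, non_c):
    """Odds ratio with a 95% CI from the log-scale standard error."""
    odds = (events_t / non_t) / (events_c / non_c)
    se = math.sqrt(1 / events_t + 1 / non_t + 1 / events_c + 1 / non_c)
    return odds, odds * math.exp(-Z95 * se), odds * math.exp(Z95 * se)


# Study 1: treatment 346/2392 events, control 1050/2170 events.
rr1 = relative_risk(346, 2392, 1050, 2170)
# Study 2: treatment 400 events and 2168 non-events,
# control 875 events and 1346 non-events.
or2 = odds_ratio(400, 2168, 875, 1346)

print("Study 1 RR: %.2f (95%% CI %.2f-%.2f)" % rr1)
print("Study 2 OR: %.2f (95%% CI %.2f-%.2f)" % or2)
```

Run on the table counts, this reproduces the two estimates and intervals quoted above to two decimal places.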

However, if the relative risk rather than the odds ratio is calculated from the study 2 data, we obtain (400/2,568)/(875/2,221) = 0.40 (95% CI: 0.36-0.44). Applying the test of interaction reported by Altman and Bland (1) gives z = -3.68 (P<0.001). The two treatment effects are therefore significantly different, with treatment 1 the more effective in reducing postmenopausal breast cancer risk.
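A minimal sketch of this comparison follows. It takes the difference between the two log relative risks and divides by the standard error of that difference, in the manner of the Altman and Bland test. One assumption to note: the standard errors here are computed directly from the table counts rather than recovered from the rounded confidence limits, which appears to be how the z value above was obtained.

```python
import math
from statistics import NormalDist


def log_rr_and_se(events_t, total_t, events_c, total_c):
    """Log relative risk and its large-sample standard error."""
    log_rr = math.log((events_t / total_t) / (events_c / total_c))
    se = math.sqrt(1 / events_t - 1 / total_t + 1 / events_c - 1 / total_c)
    return log_rr, se


def interaction_z(est1, se1, est2, se2):
    """z statistic and two-sided P for the difference of two log estimates."""
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p


log_rr1, se1 = log_rr_and_se(346, 2392, 1050, 2170)  # study 1
log_rr2, se2 = log_rr_and_se(400, 2568, 875, 2221)   # study 2

z, p = interaction_z(log_rr1, se1, log_rr2, se2)
print(f"RR1 = {math.exp(log_rr1):.2f}, RR2 = {math.exp(log_rr2):.2f}")
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

With the table counts the z statistic comes out at about -3.68, matching the figure reported in the text.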

Therefore, although both relative risk and odds ratio derived from two separate analyses are closely matched, the possibility that both treatment effects are significantly different should not be excluded. In order to avoid this confusion, and due to the problems posed by odds ratios (2,3), it is preferable to report results derived from randomized controlled trials in terms of relative risks rather than odds ratios.

References

1 Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ 2003; 326: 219.

2 Davies HTO, Crombie IK, Tavakoli M. When can odds ratios mislead? BMJ 1998; 316: 989-991.

3 Deeks J, Bracken MB, Sinclair JC, Davies HTO, Tavakoli M, Crombie IK. When can odds ratios mislead? BMJ 1998; 317: 1155.

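TABLE. Two supposed randomized controlled trials aimed at decreasing postmenopausal breast cancer risk.

| Study | Group | Postmenopausal breast cancer: Yes | Postmenopausal breast cancer: No | Total N |
|---|---|---|---|---|
| Study 1 | Treatment 1 | 346 | 2046 | 2392 |
| | Control | 1050 | 1120 | 2170 |
| | Total | 1396 | 3166 | 4562 |
| Study 2 | Treatment 2 | 400 | 2168 | 2568 |
| | Control | 875 | 1346 | 2221 |
| | Total | 1275 | 3514 | 4789 |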

Competing interests:

None declared


**05 February 2003**

## Re: Interaction revisited: the difference between two estimates

This follows on a September 2010 comment [1] on the Altman and Bland Statistics Note [2] that provided a formula for calculating the likelihood that a seeming interaction – also termed effect heterogeneity or subgroup effect – occurred by chance. The purpose of this comment is to show why the very concept of interaction, as commonly understood, is illogical, showing as well why such interaction is inevitable.

The premise underlying the formula presented by Altman and Bland (which can now be applied with an online calculator [3]) is that, absent interaction, an intervention will show the same relative risk (RR) for two groups even when the groups’ baseline rates are different. That is, if an intervention reduces a baseline rate from 20% to 10% (RR=.5) for Group A, absent interaction, one would expect the intervention to reduce Group B’s baseline rate of 10% to 5%. Tests such as that presented by Altman and Bland are aimed at determining, for example, where the 10% rate is instead reduced to 3%, how likely it is that such a reduction (RR=.3) would occur by chance if the relative risks for the two groups are in fact the same.

But assume that in fact we observe the same .5 relative risk in both situations. There would be no question of interaction and, regardless of the confidence intervals that would go into the Altman/Bland formula, the z-score would be 0.

If that occurred, however, the relative risks for the opposite outcome would be different for the two groups. Group A’s relative risk would be 1.125 (80% increased to 90%) and Group B’s would be 1.056 (90% increased to 95%). This holds with any pair of baseline rates. That is, it is not possible for different baseline rates to change by equal proportionate amounts unless the opposite outcomes change by different proportionate amounts. Since there is no more reason to expect that the two groups will experience equal proportionate changes in one outcome than in the opposite outcome, it is illogical to think it somehow normal that they will experience equal proportionate changes in either outcome.
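The arithmetic behind this point is short enough to sketch. The helper name below is illustrative; the rates are the ones used in the example above.

```python
def relative_risks(baseline, treated):
    """Return (RR for the outcome, RR for the opposite outcome)."""
    rr_outcome = treated / baseline
    rr_opposite = (1 - treated) / (1 - baseline)
    return rr_outcome, rr_opposite


# Group A: 20% reduced to 10%; Group B: 10% reduced to 5%.
group_a = relative_risks(0.20, 0.10)
group_b = relative_risks(0.10, 0.05)

for name, (rr, rr_opp) in [("Group A", group_a), ("Group B", group_b)]:
    print(f"{name}: RR for outcome = {rr:.3f}, "
          f"RR for opposite outcome = {rr_opp:.3f}")
```

Both groups show the same relative risk of 0.5 for the outcome, yet their relative risks for avoiding the outcome differ (about 1.125 against about 1.056).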

Further, features of normal distributions of risk provide reason to expect that a factor that similarly affects two groups with different baseline rates for an outcome will effect a larger proportionate change for the group with the lower baseline rate while effecting a larger proportionate change in the opposite outcome for the other group.[4-6] Notably, the example used by Altman and Bland, from a meta-analysis by Torgerson and Bell-Syer [7] of the effect of hormone replacement therapy on nonvertebral fractures among women, showed a larger relative risk reduction in women under age 60 (RR=.67) than women over age 59 (RR=.88). Given that younger women would have lower baseline rates, the pattern conforms to the expectations based on the distributional forces just described.

Proceeding from a perspective where the relative risks are assumed to be equal absent a sound showing that they are not, Altman and Bland, based on the calculation of a z-score of 1.24 (p=0.2), concluded that “[t]here is no good evidence to support a different treatment effect in younger and older women.” But given the absence of any reason to expect (indeed, given that it is illogical to expect) that relative treatment effects would be the same for different baseline rates, and given the sound reason to believe that the relative effect would be larger for the group with the lower baseline rate, there is compelling reason to believe that the relative effect in fact is larger for younger women.

The practical implications of assumptions about effects across different baseline rates seem greatest when it is necessary to estimate, on the basis of a risk reduction observed as to one baseline rate, the clinically crucial absolute risk reduction for other baseline rates. The most plausible approach to that problem is to derive from the observed rates for treatment and control groups the difference between means of the hypothesized underlying distributions and use that difference to estimate absolute risk reductions for the baseline rates of concern. Results of such an approach are illustrated in Table 3 (Method 1) of reference 8, which also shows the varying absolute risk reductions that would be estimated on the basis of assumptions of equal proportionate changes in one outcome (Method 2) or equal proportionate changes in the opposite outcome (Method 2 alt).

The assumption underlying Method 1 is that an intervention will shift the risk distributions of each group an equivalent amount. While the applicability of the assumption to a particular setting may be questioned, it at least has a rational basis. The assumption of equal proportionate risk reductions does not. Indeed, it is fairer to assume that the reductions always will be different (save on the rare occasion when a meaningful interaction, by happenstance, causes them to coincide) than that they typically will be the same.
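The three approaches can be compared in a rough sketch. This is an illustration under stated assumptions, not a reproduction of Table 3 of reference 8: it assumes the underlying risk distributions are normal with equal spread, takes the observed reduction from 20% to 10% used earlier, and applies it to a hypothetical target baseline of 30%. The function names and the 30% baseline are illustrative choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed shape of the underlying risk distribution


def implied_shift(control_rate, treated_rate):
    """Difference between means of the hypothesized underlying
    distributions, in standard deviations, implied by two observed rates."""
    return STD_NORMAL.inv_cdf(control_rate) - STD_NORMAL.inv_cdf(treated_rate)


def method_1(baseline, shift):
    """Apply the same distributional shift to another baseline rate."""
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(baseline) - shift)


def method_2(baseline, rr):
    """Assume an equal proportionate change in the outcome."""
    return baseline * rr


def method_2_alt(baseline, rr_opposite):
    """Assume an equal proportionate change in the opposite outcome."""
    return 1.0 - (1.0 - baseline) * rr_opposite


observed_control, observed_treated = 0.20, 0.10  # from the example above
target_baseline = 0.30                           # hypothetical value

shift = implied_shift(observed_control, observed_treated)
rr = observed_treated / observed_control
rr_opposite = (1 - observed_treated) / (1 - observed_control)

estimates = {
    "Method 1": method_1(target_baseline, shift),
    "Method 2": method_2(target_baseline, rr),
    "Method 2 alt": method_2_alt(target_baseline, rr_opposite),
}
for name, treated in estimates.items():
    print(f"{name}: treated rate {treated:.3f}, "
          f"absolute risk reduction {target_baseline - treated:.3f}")
```

Under these assumptions the three methods give noticeably different absolute risk reductions for the same target baseline, with the distributional estimate falling between the two proportionate ones.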

See the penultimate paragraph of reference 9 regarding an assumption of proportionate changes in odds ratios. That item also explains why it is essential that studies present underlying rates, a point implicit in the discussion above.

References:

1. Scanlan JP. Problems in identifying interaction where groups have different base rates. BMJ 21 Sept 2010: http://www.bmj.com/content/326/7382/219/reply#bmj_el_241943

2. Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ 2003;326:219.

3. http://www.hutchon.net/CompareRR.htm

4. Scanlan JP. Divining difference. Chance 1994;7(4):38-9,48: http://jpscanlan.com/images/Divining_Difference.pdf

5. Scanlan JP. Race and mortality. Society 2000;37(2):19-35: http://www.jpscanlan.com/images/Race_and_Mortality.pdf

6. Interpreting Differential Effects in Light of Fundamental Statistical Tendencies, presented at 2009 Joint Statistical Meetings of the American Statistical Association, International Biometric Society, Institute for Mathematical Statistics, and Canadian Statistical Society, Washington, DC, Aug. 1-6, 2009: http://www.jpscanlan.com/images/JSM_2009_ORAL.pdf

7. Torgerson DJ, Bell-Syer SEM. Hormone replacement therapy and prevention of nonvertebral fractures. JAMA 2001;285:2891-7.

8. Subgroup Effects sub-page of Scanlan’s Rule page of jpscanlan.com: http://www.jpscanlan.com/scanlansrule/subgroupeffects.html

9. Scanlan JP. Ratio measures are not transportable. BMJ Nov. 11, 2011 (responding to Schwartz LS, Woloshin S, Dvorin EL, Welch HG. Ratio measures in leading medical journals: structured review of underlying absolute risks. BMJ 2006;333:1248-1252): http://www.bmj.com/content/333/7581/1248?tab=responses

**Competing interests:** No competing interests

**19 December 2011**