Generalisation and extrapolation of study results
BMJ 2013; 346 doi: https://doi.org/10.1136/bmj.f3022 (Published 10 May 2013) Cite this as: BMJ 2013;346:f3022
- Philip Sedgwick, reader in medical statistics and medical education
- Centre for Medical and Healthcare Education, St George’s, University of London, Tooting, London, UK
Researchers assessed the effectiveness of peritendinous autologous blood injections in patients with mid-portion Achilles tendinopathy. A randomised double-blind controlled trial was performed. The intervention consisted of two unguided peritendinous injections with 3 mL of the patient’s whole blood given one month apart. The control group had no substance injected (needling only). Participants in both groups carried out a standardised and monitored 12 week eccentric calf training programme.1
In total, 53 adults (mean age 49 years, 53% men) were recruited from a sports medicine clinic in New Zealand. Inclusion criteria included age over 18 years and presentation with first episode of mid-portion Achilles tendinopathy. Symptoms had to be present for at least three months, with the diagnosis confirmed by diagnostic ultrasonography.
The primary outcome measure was change in symptoms and function from baseline to six months as assessed by the Victorian Institute of Sport Assessment-Achilles (VISA-A) score. Significant improvements in the VISA-A score were seen at six months in the intervention group (change in score 18.7, 95% confidence interval 12.3 to 25.1) and control group (19.9, 13.6 to 26.2). However, the overall effect of treatment (intervention minus control) at six months was not significant (−1.2, −10.0 to 7.9; P=0.689).
On the basis of the above trial, which of the following conclusions, if any, would be justified?
a) The results would also be applicable to adults at other sports medicine clinics
b) Significant improvements in the primary outcome would continue to be seen at 12 months in both treatment groups
c) The overall effect of treatment would not be significant at 12 months
Statement a would be justified, whereas b and c would not.
The purpose of the above trial was to assess the effectiveness of peritendinous autologous blood injections in patients with mid-portion Achilles tendinopathy. The usefulness of the trial’s results depends on whether they can be generalised. Generalisation refers to the extent to which the study results can be applied to patients beyond those in the sample. It is concerned with the characteristics of future patients, including their demography and disease severity, and the extent to which the results of a trial conducted in a different group of people can be applied to them. The trial would have limited value if the results could not be used to predict the benefits for future patients receiving the intervention.
For the results of a study to be useful, it is imperative that the sample members are representative of the study population. However, confusion often exists as to what is meant by the “population” in statistics, probably because it has a different meaning to its general everyday one, where it is used in a geographical sense. In the above trial there were well defined inclusion criteria. Patients were recruited only if aged over 18 years and presenting with their first episode of mid-portion Achilles tendinopathy. Furthermore, symptoms had to be present for at least three months, with the diagnosis confirmed by diagnostic ultrasonography. The inclusion criteria uniquely characterised the population. Statistically, the population would be regarded as an infinite group of people. However, the extent to which the sample in the above trial was representative of the population was not clear. Furthermore, the extent to which any differences that may have existed, such as differences in age distribution, might have influenced the effects of treatment is not obvious.
It is generally expected that the results of most trials will predict the effects of treatment outside the original trial centres. There is no reason to believe that the results of the above trial would not predict the effects of treatment in adults at other sports clinics, in hospitals, or in other countries besides New Zealand, where the trial was conducted at a sports medicine clinic (statement a is justified).
Generalisability is also concerned with the extent to which the study results could be applied to a population different to the one studied, for example, patients aged under 18 years or those who have had a previous episode of mid-portion Achilles tendinopathy. The extent to which this can be done must be judged separately in each circumstance. In particular, children (those under 18 years) are typically considered to have a different disease course to adults, and it is generally recommended that they are studied separately.
The generalisation of numerical results beyond the study period is termed extrapolation. The study period in the above trial was six months, and generalising the results of the trial beyond six months would not be justified. Although significant improvements in the primary outcome were seen in both treatment groups at six months, the progression of injury severity between six and 12 months cannot be predicted because the trial participants were not studied for this duration (statement b is not justified). It would also not be possible to predict the overall effect of treatment (intervention versus control) at one year (statement c is not justified).
Extrapolation is often described in the context of linear regression, which is used to predict an outcome measure from one or more explanatory variables. Simple linear regression was described in a previous question.2 The regression equation is valid only for the range of the observed data of the explanatory variable(s). It would be possible to predict the outcome measure outside the range of measurements originally observed for the explanatory variables. However, it would not be sensible to do so because there would have been no evidence to support the nature of the association between the outcome measure and explanatory variable(s) outside the original range of measurements.
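The point about the range of the observed data can be illustrated with a minimal sketch. The data below are hypothetical and invented purely for illustration (they are not taken from the trial), and the function names are assumptions of this example. A simple linear regression is fitted by least squares, and each prediction is flagged according to whether the value of the explanatory variable lies inside or outside the observed range.

```python
def fit_simple_linear_regression(x, y):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x_new, slope, intercept, x_observed):
    """Return (predicted value, True if x_new lies outside the observed range)."""
    is_extrapolation = x_new < min(x_observed) or x_new > max(x_observed)
    return intercept + slope * x_new, is_extrapolation


# Hypothetical data, invented for illustration only (not from the trial):
# an outcome score measured monthly over a six month study period.
months = [0, 1, 2, 3, 4, 5, 6]
score = [50, 54, 57, 61, 64, 66, 70]

slope, intercept = fit_simple_linear_regression(months, score)

for month in (3, 12):
    value, extrapolated = predict(month, slope, intercept, months)
    label = "extrapolation, not supported by the data" if extrapolated else "within the observed range"
    print(f"month {month}: predicted score {value:.1f} ({label})")
```

The equation returns a number for month 12 just as readily as for month 3, but only the latter is backed by observations. Nothing in the data indicates whether the fitted straight line continues, flattens, or reverses beyond six months.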
Competing interests: None declared.