Papers

Meta-analysis of how well measures of bone mineral density predict occurrence of osteoporotic fractures

BMJ 1996; 312 doi: http://dx.doi.org/10.1136/bmj.312.7041.1254 (Published 18 May 1996) Cite this as: BMJ 1996;312:1254
  1. Deborah Marshall, research associate (a),
  2. Olof Johnell, professor (b),
  3. Hans Wedel, professor (c)
  1. (a) Swedish Council on Technology Assessment in Health Care, Box 16158, S-10324 Stockholm, Sweden
  2. (b) Department of Orthopaedics, Malmo General Hospital, S-21401 Malmo, Sweden
  3. (c) Department of Epidemiology and Biostatistics, Nordic School of Public Health, Box 12133, S-40242 Gothenburg, Sweden
  1. Correspondence to: Professor Wedel.
  • Accepted 29 February 1996

Abstract

Objective: To determine the ability of measurements of bone density in women to predict later fractures.

Design: Meta-analysis of prospective cohort studies published between 1985 and end of 1994 with a baseline measurement of bone density in women and subsequent follow up for fractures. For comparative purposes, we also reviewed case control studies of hip fractures published between 1990 and 1994.

Subjects: Eleven separate study populations with about 90 000 person years of observation time and over 2000 fractures.

Main outcome measures: Relative risk of fracture for a decrease in bone mineral density of one standard deviation below age adjusted mean.

Results: All measuring sites had similar predictive abilities (relative risk 1.5 (95% confidence interval 1.4 to 1.6)) for decrease in bone mineral density except for measurement at spine for predicting vertebral fractures (relative risk 2.3 (1.9 to 2.8)) and measurement at hip for hip fractures (2.6 (2.0 to 3.5)). These results are in accordance with results of case-control studies. Predictive ability of decrease in bone mass was roughly similar to (or, for hip or spine measurements, better than) that of a 1 SD increase in blood pressure for stroke and better than a 1 SD increase in serum cholesterol concentration for cardiovascular disease.

Conclusions: Measurements of bone mineral density can predict fracture risk but cannot identify individuals who will have a fracture. We do not recommend a programme of screening menopausal women for osteoporosis by measuring bone density.

Key messages

  • Measuring bone mineral density has been suggested as a method of identifying individuals at high risk of fracture in a preventive context

  • Our meta-analysis of prospective studies showed that all studies measuring bone density at any site had similar predictive ability for a decrease of 1 SD in bone density except for measurements at hip and spine, which have better predictive ability for fractures in hip and spine respectively

  • Predictive ability of decrease in bone mass was roughly similar to (or, for hip or spine measurements, better than) that of a 1 SD increase in blood pressure for stroke and better than a 1 SD increase in serum cholesterol concentration for cardiovascular disease

  • Although bone mineral density measurements can predict fracture risk, they cannot identify individuals who will have a fracture, and a screening programme for osteoporosis cannot be recommended

Introduction

Osteoporosis—low bone density leading to fractures after minimal trauma—is a considerable problem in health care because of its potentially severe consequences for both the patient and the health care system if a fracture occurs. Estimates of the prevalence of osteoporosis vary with the specific definition chosen, but the World Health Organisation has estimated that 30% of all women aged over 50 (postmenopausal) have osteoporosis according to a definition of bone mineral density being more than 2.5 standard deviations below the mean for young healthy adult women at any site.1 Hip fractures are the most serious and costly potential result of osteoporosis: the 300 000 cases in 1991 in the United States were associated with a total cost of about $5bn in 1990, including fees for nursing homes and private and public care services.2 This is also a major problem in other countries. In Europe the Scandinavian countries currently have the highest prevalence of osteoporotic fractures, and the incidence is increasing more than would be expected from demographic changes in age and sex ratio.3

It is therefore important to find ways of preventing osteoporotic fractures. Measuring bone mineral density has been discussed as one method for early identification of individuals at high risk of a fracture, although it is only one of a number of risk factors for fracture.4 5 6 7 Bone density, however, is a continuous variable, and it has been criticised as a screening measure because case-control studies of hip fracture showed a large overlap in the bone densities of patients with a fracture and in the densities of those without a fracture.8

The aim of our study was to determine, by a systematic review of the literature for all prospective studies, if measurements of bone density in women could predict fractures of any type. This was undertaken as one component for a report on measurements of bone density from the Swedish Council on Technology Assessment in Health Care.9 For comparison, we also reviewed case-control studies of hip fractures.

Methods

We identified literature for this report by several methods. Computer databases of the medical literature (Medline, EMBASE, and SweMed) were the primary source but were supplemented with reviews of reference lists of selected papers, references provided by colleagues, and known grey literature on this topic. Bone density must have been measured by absorptiometry (single or dual energy, photon or x ray), quantitative computed tomography, quantitative magnetic resonance imaging, or ultrasound scanning. We did not include studies using roentgenograms or metacarpal measurements. We restricted all searches to articles published from 1985 to 1994. In a preliminary search we found only papers published in English, and we restricted all later searches to articles in English. We excluded articles published before 1985 in an attempt to increase the comparability of the technologies used, methods of measurement, and analytic approaches; bone densitometry with current methods has been performed only since the early 1980s. We based our searches on the keywords “bone and bones,” “bone density,” “bone mineral content,” and “densitometry” combined with the different techniques and equipment. We identified a total of 1084 articles in this way, and 229 studies fulfilled our requirements.

PROSPECTIVE COHORT STUDIES

These were prospective studies with a baseline measurement of bone density and subsequent follow up for fractures. Fractures must have occurred after the bone density measurement had been taken. Study subjects must not have received treatments for bone or hormonal related disorders. The subjects may, however, have had previous fractures at the start of the study. We included only studies of adult women. After we had identified potential articles, they were reviewed independently for inclusion in the overview by three people (an expert in osteoporosis, an epidemiologist or biostatistician, and a worker in public health). Any disagreements about including a study were resolved through discussion.

Before evaluating the results, the reviewers evaluated each paper individually for quality with an instrument developed for our analysis. This considered the potential bias in the study resulting from how the patients were selected, the numbers and types of patients who were lost to follow up, and the method used to identify fractures (the primary outcome in this study). The reviewers then gave each study a quality score, with a maximum possible score of 25.

Calculations

In most cases, there was more than one publication for each study population. We sometimes used different papers to calculate the predictive value of bone density measured at different sites, but in general we used the paper with the longest follow up time for each population as the basis for calculations.

As the main outcome measure of our analysis, we chose relative risk of fracture associated with a decrease in bone density of one standard deviation adjusted for age. This compares the risks for two women of the same age with a difference of 1 SD in their bone density. This measure is robust in that it is independent of population characteristics and measuring devices, and it could also be calculated from all recent studies of good quality.

For pooling study results to a common measure, we used slight modifications of conventional methods.10 Most of the reviewed papers used the Cox proportional hazards model or logistic analysis for estimating relative risks or odds ratios adjusted at least for age. The risks in the reviewed papers were all small, and we therefore treated relative risks and odds ratios as equivalent. One paper calculated risks based on fractures and not on individual people as the statistical entity. Several fractures in one individual do not contribute independent observations, so an analysis based on fractures underestimates the variance.11 We therefore estimated the number of individuals and recalculated the variance in order to control for this bias.

All risk measures were calculated for a 1 SD decrease in bone density. We then pooled these risks to a common risk ratio including all relevant studies and calculated the 95% confidence interval for the pooled risk ratio. More formally, let $RR_i$ be the observed age adjusted relative risk for a 1 SD decrease in bone density in study number $i$, let $b_i = \ln(RR_i)$ be the β coefficient in the logistic or Cox equation from study $i$, and let the weights $v_i$ be defined by $v_i = [\mathrm{variance}(b_i)]^{-1}$. From the observed confidence interval for $RR_i$ we estimated the variance of $b_i$ and hence $v_i$. The weighted sum $\sum v_i b_i$ is normally distributed with variance $\sum v_i$. The pooled (weighted) estimate from several studies $i = 1, \ldots, n$ is $\exp(\sum v_i b_i)$, and the 95% confidence interval for this entity is calculated as $\exp[\sum v_i b_i \pm 1.96 \times \sqrt{\sum v_i}]$. This method, which is a fixed effect model, is asymptotically close to the Mantel-Haenszel method.10
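The pooling step can be illustrated with a minimal Python sketch. It assumes the usual inverse-variance form, in which the weighted sum of the log relative risks is divided by the sum of the weights (a normalisation that the formula as printed leaves implicit) and the standard error of the pooled log relative risk is one over the square root of the summed weights. The function names and the input values are hypothetical and are not data from the reviewed studies.

```python
import math

def log_rr_and_weight(rr, ci_low, ci_high):
    """Return (b, v): b = ln(RR) and v = 1 / variance(b), from a 95% CI."""
    b = math.log(rr)
    # A 95% interval on the log scale spans 2 * 1.96 standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return b, 1.0 / se ** 2

def pool_fixed_effect(studies):
    """Inverse-variance fixed effect pooling of (rr, ci_low, ci_high) tuples.

    Assumption: the weighted sum is normalised by the sum of the weights.
    Returns (pooled_rr, ci_low, ci_high).
    """
    pairs = [log_rr_and_weight(*study) for study in studies]
    sum_v = sum(v for _, v in pairs)
    pooled_b = sum(v * b for b, v in pairs) / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return (math.exp(pooled_b),
            math.exp(pooled_b - 1.96 * se),
            math.exp(pooled_b + 1.96 * se))

# Hypothetical relative risks per 1 SD decrease, with 95% confidence intervals.
example_studies = [(1.5, 1.2, 1.9), (1.8, 1.3, 2.5), (1.4, 1.1, 1.8)]
pooled_rr, low, high = pool_fixed_effect(example_studies)
print(f"pooled RR {pooled_rr:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With a single study the function returns that study's own estimate and interval, which is a convenient check on the variance recovered from the confidence limits.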

As an alternative method for weighting the individual relative risks, we chose weights proportional to the quality scores described above and calculated the pooled estimates based on scores. We also performed tests of homogeneity at a level of P=0.05.
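A sketch of these two steps follows. The text does not say which homogeneity statistic was used, so the example assumes Cochran's Q, compared with the upper 5% point of the chi-squared distribution on n - 1 degrees of freedom. The quality-weighted estimate is taken here to be the score-weighted mean of the log relative risks, which is also an assumption. All names and inputs are illustrative.

```python
import math

# Upper 5% critical values of the chi-squared distribution for 1 to 10
# degrees of freedom (standard tabulated values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
                6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}

def quality_weighted_rr(log_rrs, quality_scores):
    """Pooled RR with weights proportional to quality scores (assumed form)."""
    total = sum(quality_scores)
    return math.exp(sum(q * b for b, q in zip(log_rrs, quality_scores)) / total)

def cochran_q(log_rrs, weights):
    """Cochran's Q and its degrees of freedom for inverse-variance weights."""
    sum_v = sum(weights)
    pooled = sum(v * b for b, v in zip(log_rrs, weights)) / sum_v
    q = sum(v * (b - pooled) ** 2 for b, v in zip(log_rrs, weights))
    return q, len(log_rrs) - 1

def homogeneity_rejected(log_rrs, weights):
    """True if Q exceeds the 5% critical value (2 to 11 studies supported)."""
    q, df = cochran_q(log_rrs, weights)
    return q > CHI2_CRIT_05[df]

# Hypothetical log relative risks, inverse-variance weights, quality scores.
b = [math.log(1.5), math.log(1.8), math.log(1.4)]
v = [70.0, 35.0, 60.0]
scores = [12.0, 19.0, 15.0]
print(quality_weighted_rr(b, scores), cochran_q(b, v), homogeneity_rejected(b, v))
```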

We calculated the sensitivity, specificity, positive predictive value, and the population attributable risk fraction for a cut point in bone density of 1 SD below the age adjusted mean. We chose three different values for the lifetime incidence of fracture: 3%, 15%, and 30%. A value of 3% represents a low incidence—a group of Swedish women with low risk for hip fractures (not more than two risk factors) had a cumulative incidence of fractures over 10 years of 3% (O Johnell, unpublished data). A lifetime incidence of 30% is possible for a group of women at high risk based on predictors other than bone density (such as tendency to fall). The population attributable risk fraction is the proportion of disease in the population that theoretically could be eliminated if those people with bone density below a certain cut point could be treated to restore normal bone density. The population attributable risk fraction is defined as (I-I0)/I where I is the incidence in the population and I0 is the incidence in the group above the cut point (that is, with “normal” bone density values).
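The text does not spell out the model behind these calculations, so the following Python sketch is only one way of obtaining such quantities. It assumes that age adjusted bone density is standard normal in the population, that fracture risk follows a logistic curve whose log odds rise by ln(2.6) per 1 SD decrease (treating the relative risk as an odds ratio, as elsewhere in the analysis), and that the intercept is chosen to reproduce the stated lifetime incidence. The names and the numerical integration scheme are illustrative.

```python
import math
from statistics import NormalDist

BETA = math.log(2.6)   # log relative risk per 1 SD decrease (hip value)
CUT = -1.0             # cut point in SD units from the age adjusted mean
STEP = 0.005
# Midpoint grid over -8..8 SD; CUT lies on a cell boundary, not a grid point.
GRID = [-8.0 + STEP * (k + 0.5) for k in range(3200)]
MASS = [NormalDist().pdf(z) * STEP for z in GRID]

def risk(z, intercept):
    """Assumed logistic model: log odds rise by BETA per SD decrease."""
    return 1.0 / (1.0 + math.exp(-(intercept - BETA * z)))

def fracture_mass(intercept, below_cut):
    """P(fracture and bone density below, or above, the cut point)."""
    return sum(m * risk(z, intercept)
               for z, m in zip(GRID, MASS) if (z < CUT) == below_cut)

def calibrate(incidence):
    """Bisect on the intercept so that the mean risk equals the incidence."""
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        total = fracture_mass(mid, True) + fracture_mass(mid, False)
        lo, hi = (mid, hi) if total < incidence else (lo, mid)
    return (lo + hi) / 2

def screening_measures(incidence):
    a = calibrate(incidence)
    p_below = sum(m for z, m in zip(GRID, MASS) if z < CUT)
    frac_below = fracture_mass(a, True)
    frac_above = fracture_mass(a, False)
    total = frac_below + frac_above              # I, incidence in the population
    i0 = frac_above / (1.0 - p_below)            # I0, incidence above the cut
    return {
        "incidence": total,
        "sensitivity": frac_below / total,
        "specificity": ((1.0 - p_below) - frac_above) / (1.0 - total),
        "ppv": frac_below / p_below,
        "par": (total - i0) / total,             # (I - I0) / I
    }

for lifetime_incidence in (0.03, 0.15, 0.30):
    print(lifetime_incidence, screening_measures(lifetime_incidence))
```

Because the cut point is fixed in the whole population, the logistic curve flattens as incidence rises, so sensitivity and the population attributable risk fraction fall while the positive predictive value rises under these assumptions.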

In addition, we compared the ability of bone density measurements to predict fractures with the predictive abilities of blood pressure for stroke (data from Selmer12) and of smoking and serum cholesterol concentration for heart disease (from H Wedel, unpublished data). In order to compare these risk factors for different endpoints, we calculated the relative risk for a change of 1 SD of each risk factor.

CASE-CONTROL STUDIES

These were case-control studies published since 1990 that compared the bone mineral density in women with hip fracture with that in age matched controls who represented the normal population. The measurement of bone density in cases must have occurred within 14 days of fracture. We excluded studies published before 1990 because these have been reviewed elsewhere.5 8 13

In order to compare the results from the cohort and case-control studies, we translated difference in bone mineral density between cases and controls into odds ratios. Since the bone mineral density can be approximated by normal distributions in both cases and controls with the same variance, we used exp(δB) as the odds ratio for a difference of 1 SD in bone mineral density, where δB is the standardised difference in bone mineral density between cases and controls with the variance in controls being used for standardisation. We pooled the results of the studies after using the size of each study for weighting.
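The conversion can be sketched in a few lines of Python. Under the equal variance normal approximation, the log odds of being a case change by the standardised difference for each 1 SD change in bone density, which gives the exponential form quoted above. The text does not state whether the size weighting was applied to the odds ratios or to the standardised differences; the sketch assumes a size-weighted mean of the odds ratios, and the function names and inputs are hypothetical.

```python
import math

def odds_ratio_per_sd(mean_cases, mean_controls, sd_controls):
    """Odds ratio for a 1 SD difference in bone density.

    delta is the difference between controls and cases, standardised
    by the standard deviation in controls.
    """
    delta = (mean_controls - mean_cases) / sd_controls
    return math.exp(delta)

def size_weighted_mean(values, sizes):
    """Mean of per-study values weighted by study size (assumed pooling rule)."""
    return sum(v * n for v, n in zip(values, sizes)) / sum(sizes)

# Hypothetical studies: (mean in cases, mean in controls, SD in controls, size).
studies = [(0.62, 0.72, 0.11, 120), (0.58, 0.70, 0.12, 80)]
ors = [odds_ratio_per_sd(mc, mk, sd) for mc, mk, sd, _ in studies]
pooled = size_weighted_mean(ors, [n for *_, n in studies])
print([round(o, 2) for o in ors], round(pooled, 2))
```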

Results

PROSPECTIVE COHORT STUDIES

Table 1 gives details of the 11 study populations, consisting of about 90 000 person years of observation and more than 2000 fractures, that were suitable for inclusion in our analysis.14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 Follow up ranged from 1.8 to 24 years. We found no association between the relative risk for a decrease in bone density of 1 SD and the length of follow up. The average total scores for quality of the studies ranged from 11.7 to 19.3 out of a possible score of 25. The quality score was not related to the size of the study measured by person years of follow up.

Table 1

Summary details of prospective studies of predictive value of bone density for fractures that were included in meta-analysis

We divided the pooled estimates from the meta-analysis by type of fracture (forearm (distal end of the radius), hip, vertebral, and all types) and by measurement site (proximal radius, distal radius, hip, lumbar spine, calcaneus, and all sites (overall estimate)). The results of this analysis show that most measuring sites had virtually the same predictive ability for a decrease of 1 SD in bone density (table 2). There were two exceptions to this general observation. Measurement at the spine seemed to have a better predictive ability for spine fractures (relative risk 2.3 (95% confidence interval 1.9 to 2.8)), while measurement at the hip was better for predicting hip fractures (relative risk 2.6 (2.0 to 3.5)). Homogeneity was rejected for measurements of bone density at the proximal radius, distal radius, and hip for all types of fracture. Weighting for quality scores gave similar results, so no further analysis was done in this respect.

Table 2

Summary of meta-analysis: relative risk (95% confidence interval) of fracture for 1 SD decrease in bone density below age adjusted mean

Table 3 shows the sensitivity, specificity, positive predictive value, and the population attributable risk fraction for a cut point in bone density of 1 SD below the age adjusted mean. In this case the odds ratio for those below the cut point compared with those above it was 4.4. The sensitivity and population attributable risk decreased with increasing lifetime incidence of fractures. The positive predictive value was large for the higher lifetime incidences.

Table 3

Sensitivity, specificity, positive predictive value, and population attributable risk for a cut point in bone density of 1 SD below age adjusted mean associated with three different lifetime incidences of hip fracture. Relative risk of hip fracture is assumed to be 2.6 per 1 SD decrease in bone density

Table 4 shows the predictive ability of bone density measurements in comparison with that of risk factors for stroke and heart disease. A 1 SD decrease in bone mass was roughly similar to (or, for hip measurement, better than) the predictive ability of a 1 SD increase in blood pressure for stroke and better than a 1 SD increase in serum cholesterol concentration for cardiovascular disease.

Table 4

Relative risk of fracture for 1 SD decrease in bone density compared with relative risks in women for stroke and coronary heart disease

CASE-CONTROL STUDIES

Table 5 gives details of the eight case-control studies of hip fracture that met the inclusion criteria.39 40 41 42 43 44 45 46 The average odds ratio for a difference of 1 SD in bone density was 2.7, 2.8, 2.1, and 1.8 for bone density measurements at the femoral neck, trochanter, Ward's triangle, and lumbar spine respectively. Thus these results also suggest that measurements of bone density at the hip are superior to measurement at the spine for predicting hip fractures.

Table 5

Summary of case-control studies of hip fracture in women (published since 1990)

Discussion

PREDICTIVE ABILITY OF BONE DENSITY MEASUREMENTS

For how long can a single bone mineral measurement predict fractures? The longest follow up was 24 years, and the relative risk associated with reduced bone density in that study was 1.7 (95% confidence interval 1.1 to 2.6). However, the study cohort comprised only 191 women. Gardsell et al observed a similar risk in an early study with 11 years of follow up, which showed a decreasing ability to detect fracture risk after the age of 70-80.16 Black et al found a slightly smaller risk ratio after two years of follow up.25

Another question is whether bone density has the same ability to predict fractures at all ages. In a recent paper Nevitt et al showed that bone mineral density could predict fractures equally well over a three year period up to the age of 80: “Thus it seems that an old age bone mineral density can predict fracture, at least for a 3-year-period. However, the life expectancy in Sweden at age of 80 is 8.4 years. Another problem is that very few studies, except Gardsell and others, have studied the predictive ability between age 50 to 60.”28

There are few studies in men, but the predictive ability of bone density measurements seems to be similar for men and women.35

Homogeneity was rejected for measurements of bone density at the proximal radius, distal radius, and hip for all types of fracture (table 2). It seems natural that the studies involving a specific measurement site and a distinct fracture type were more homogeneous, and the results in these studies should be more reliable. A heterogeneity test10 has low power to detect differences between studies and is therefore not a sensitive test of homogeneity. When homogeneity was rejected we did not fit a random effects model because of the small numbers of studies but noted the heterogeneity as a warning.

The predictive ability of ultrasound measurement of bone density is uncertain since only one published paper has used this method.34 Two recent abstracts have also been published. One described a retrospective study which showed that bone density measured by ultrasound at the calcaneus was as good in predicting hip fracture as bone density measured by dual energy x ray or single energy photon absorptiometry at the hip.29 The other abstract reported similar results for vertebral fractures, but, because the ultrasound measurements were not highly correlated with those from other bone density methods, it was suggested that ultrasound might measure different aspects of bone strength and so provide complementary information about fracture risk.22

POTENTIAL FOR SCREENING

Bone density measurements can predict risk of fracture but cannot identify individual people who will have a fracture. As with most continuous predictors in medicine, there are no obvious cut off values for screening, and consideration must be given to the potential risks and harms associated with screening. This conclusion is supported by both the prospective cohort data and the case-control data for hip fractures. However, the problem of false positives in predicting fractures may not be as stressful for patients as it would be in screening for cancer.

Law et al made an analysis of case-control studies of hip fracture similar to ours and found a much lower predictive ability for measurements of bone density (odds ratios of 1.35 for all measurement sites and 1.7 for measurement at the femoral neck). They calculated a detection rate of 30% and a false positive rate of 15% for measuring bone density at the femoral neck as a screening tool, based on a cut off of 1 SD below the mean density of the controls.6 We repeated their calculations with our more recent data for hip fractures and bone density measured at the hip (odds ratio 2.6) and found a detection rate of 50% and a false positive rate of 15%.
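These figures can be approximated with a short Python sketch under a simple Gaussian model, which is an assumption and not necessarily the calculation used by Law et al. Bone density in controls is taken as standard normal, and in cases as normal with the same variance and a mean shifted down by the natural logarithm of the odds ratio per 1 SD. The function name is illustrative.

```python
import math
from statistics import NormalDist

def detection_and_false_positive_rate(odds_ratio_per_sd, cutoff_sd=-1.0):
    """Detection rate and false positive rate for a bone density cut off.

    Assumed model: controls ~ N(0, 1) and cases ~ N(-ln(OR), 1), both in
    units of the control standard deviation.
    """
    shift = math.log(odds_ratio_per_sd)
    controls = NormalDist(0.0, 1.0)
    cases = NormalDist(-shift, 1.0)
    detection_rate = cases.cdf(cutoff_sd)          # cases below the cut off
    false_positive_rate = controls.cdf(cutoff_sd)  # controls below the cut off
    return detection_rate, false_positive_rate

for odds_ratio in (1.7, 2.6):
    dr, fpr = detection_and_false_positive_rate(odds_ratio)
    print(f"OR {odds_ratio}: detection {dr:.0%}, false positive {fpr:.0%}")
```

Under these assumptions an odds ratio of 1.7 gives a detection rate of about 32% and an odds ratio of 2.6 about 48%, with a false positive rate of about 16% in both cases, close to the rounded figures quoted above.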

The sensitivity and population attributable risk for bone density decreased with increasing lifetime incidence of fractures (table 3). This is because the cut point in bone density was defined in the whole population and not defined among the subjects who would remain free of a fracture. In calculating the population attributable risk fraction, which indicates the potential for preventing fractures, we assumed that women with bone densities below the cut point could be treated successfully to increase their bone density to the mean value of the population.

The value of the age adjusted relative risk for a 1 SD decrease in bone density should be interpreted cautiously since it assumes a linear relation between bone density and the logarithm of the risk, whereas the relation is probably steeper at low densities. In a cohort of elderly women Nevitt et al found that those with age adjusted bone mineral densities in the lowest quartile had an incidence of hip fractures of 32/1000 person years compared with incidences of 14/1000, 10.3/1000, and 1.4/1000 for the women with bone densities in the other three quartiles (in ascending order) respectively.28 These data suggest that the overall risk ratio for a 1 SD decrease in bone density may underestimate the risk among women with low bone densities since the risk gradient is steeper in the lower ranges of bone density values.

Using bone density measurements alone to predict fractures is analogous to using blood pressure to predict stroke and serum cholesterol concentration to predict coronary heart disease. We found that the predictive ability of bone mass was similar to (or, for hip measurement, better than) that of blood pressure for stroke12 and better than that of serum cholesterol for cardiovascular disease (H Wedel, unpublished data).

Even though bone density can predict fractures about as well as blood pressure can predict stroke, we cannot recommend a screening programme for osteoporosis by measuring bone density. There is a wide overlap in the bone densities of patients who develop a fracture and those who do not. Thus, bone mineral density can identify people who are at increased risk of developing a fracture, but it cannot with any certainty identify individuals who will develop a future fracture. As such, it could possibly be used for selective screening, but there are other factors that speak against this. Too little is known about the practical aspects of such a programme, including rates of attendance for screening and compliance with treatment, and there is little information on the effectiveness of treatment, particularly evidence showing a reduction in fractures.1 47 48

Footnotes

  • Funding Swedish Council on Technology Assessment in Health Care.

  • Conflict of interest None.
