Non-steroidal anti-inflammatory drugs, including cyclo-oxygenase-2 inhibitors, in osteoarthritic knee pain: meta-analysis of randomised placebo controlled trials
BMJ 2004; 329 doi: http://dx.doi.org/10.1136/bmj.38273.626655.63 (Published 02 December 2004) Cite this as: BMJ 2004;329:1317
- Jan Magnus Bjordal, research fellow1,
- Anne Elisabeth Ljunggren, professor1,
- Atle Klovning, associate professor1,
- Lars Slørdal, professor2
- 1 Department of Public Health and Primary Health Care, University of Bergen, 5018 Bergen, Norway
- 2 Department of Laboratory Medicine, Children's and Women's Health, Norwegian University of Science and Technology, 7489 Trondheim, Norway
- Correspondence to: J M Bjordal
- Accepted 14 September 2004
Objective To estimate the analgesic efficacy of non-steroidal anti-inflammatory drugs (NSAIDs), including selective cyclo-oxygenase-2 inhibitors (coxibs), in patients with osteoarthritis of the knee.
Design Systematic review and meta-analysis of randomised placebo controlled trials.
Studies reviewed 23 trials including 10 845 patients, median age of 62.5 years. 7807 patients received adequate doses of NSAIDs and 3038 received placebo. The mean weighted baseline pain score was 64.2 mm on 100 mm visual analogue scale (VAS), and average duration of symptoms was 8.2 years.
Main outcome measure Change in overall intensity of pain.
Results Methodological quality of trials was acceptable, but 13 trials excluded patients before randomisation if they did not respond to NSAIDs. One trial provided long term data for pain that showed no significant effect of NSAIDs compared with placebo at one to four years. The pooled difference for pain on visual analogue scale in all included trials was 10.1 mm (95% confidence interval 7.4 to 12.8) or 15.6% better than placebo after 2-13 weeks. The results were heterogeneous, and the effect size for pain reduction was 0.32 (0.24 to 0.39) in a random effects model. In 10 trials that did not exclude non-responders to NSAID treatment the results were homogeneous, with an effect size for pain reduction of 0.23 (0.15 to 0.31).
Conclusion NSAIDs can reduce short term pain in osteoarthritis of the knee slightly better than placebo, but the current analysis does not support long term use of NSAIDs for this condition. As serious adverse effects are associated with oral NSAIDs, only limited use can be recommended.
Osteoarthritis of the knee is the most common type of osteoarthritis,1 the prevalence of which is rising in parallel with the increasing age of the population.2 The condition is associated with pain and inflammation of the joint capsule,3–5 impaired muscular stability,6 7 reduced range of motion,8 and functional disability.9 Treatment guidelines for knee osteoarthritis recommend pharmacological intervention, initially with paracetamol and subsequently with a non-steroidal anti-inflammatory drug (NSAID).10 In a recent UK survey, 15% of patients with osteoarthritis used paracetamol, whereas 50% reported regular use of NSAIDs. Of the latter, 32% were using traditional NSAIDs and 18% were using cyclo-oxygenase-2 inhibitors (coxibs).11 This widespread use is one explanation for the interest in tolerability and efficacy issues regarding these drugs.10 12 13 The recent introduction of coxibs seemed to promise a reduction in serious adverse events related to NSAIDs,13 14 but this remains controversial.15–18
Guidelines from the European League Against Rheumatism (EULAR) state that both pharmacological and non-pharmacological interventions are needed for optimal treatment of knee osteoarthritis.19 The various potentially effective pharmacological interventions at the clinicians' disposal19 highlight the need for information regarding treatment efficacy.
Meta-analyses can be used for reliable comparison of the efficacy of different interventions.20 Effect size measures the magnitude of a treatment effect independent of sample size.21 There is no current operational definition for what constitutes a sufficiently large effect size for a therapeutic intervention to be considered as useful, but a value of 0.2 is usually considered small, 0.5 moderate, and 0.8 large.22
A recent systematic review of therapeutic alternatives in knee osteoarthritis gives no effect sizes for paracetamol and an imprecise range (0.47-0.96), derived from a minority of available trials, for NSAIDs.19 Neither other reviewers nor the Cochrane library provide comprehensive and robust effect size data for the efficacy of either of these interventions in osteoarthritis of the knee.10 13 23–25 Calculations of effect size require data for mean change and standard deviation (SD). If not provided, these data can be obtained by indirect means from standard errors, P values, t values, and 95% confidence intervals when sample sizes are known. The lack of data on effect size is surprising because treatment with NSAIDs for knee osteoarthritis is established to the point of being a reference against which other interventions are often compared.
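The indirect routes to a standard deviation mentioned above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure used in this review: all function names are hypothetical, 95% confidence intervals are assumed to be based on the normal approximation (z = 1.96), and P values are converted with a normal rather than a t distribution.

```python
import math
from statistics import NormalDist

Z_95 = 1.96  # assumed normal-approximation multiplier for a 95% CI


def sd_from_se(se: float, n: int) -> float:
    """SD of a single group's change from its standard error: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int) -> float:
    """SD of a single group's change from a 95% CI around its mean."""
    return sd_from_se((upper - lower) / (2 * Z_95), n)


def se_diff_from_t(mean_diff: float, t: float) -> float:
    """SE of a between-group difference from a two-sample t value."""
    return abs(mean_diff / t)


def se_diff_from_p(mean_diff: float, p_two_sided: float) -> float:
    """SE of a between-group difference from a two-sided P value (normal approximation)."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return abs(mean_diff) / z


def pooled_sd_from_se_diff(se_diff: float, n1: int, n2: int) -> float:
    """Pooled SD implied by the SE of a difference between two groups."""
    return se_diff / math.sqrt(1 / n1 + 1 / n2)


def effect_size(mean_diff: float, pooled_sd: float) -> float:
    """Standardised mean difference (unitless effect size)."""
    return mean_diff / pooled_sd


# Made-up numbers for illustration only (not taken from any included trial).
sd = pooled_sd_from_se_diff(se_diff_from_t(10.0, 2.5), n1=50, n2=50)
print(effect_size(10.0, sd))  # roughly 0.5
```

With small samples the normal approximation understates the SD slightly, so a t distribution would be the more careful choice there.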
We carried out a meta-analysis of published randomised placebo controlled trials to estimate the analgesic efficacy of NSAIDs, including coxibs, in patients with knee osteoarthritis.
We specified a detailed review protocol before analysis. This included a sequential three step reviewing procedure of identifying relevant randomised placebo controlled trials from Medline, Embase, and the Cochrane central register of controlled trials; evaluating their methodological quality according to predefined criteria (Jadad scale)26; and calculating their pooled effect as the mean difference in change between NSAID groups and placebo groups in mm on a visual analogue scale and as a unitless effect size.
We carried out the literature search from 1966 to April 2004. In addition, we crosschecked reference lists in systematic reviews, searched conference abstracts, and talked to clinical experts. We included papers in English, German, and the Scandinavian languages. Our key search terms were knee, osteoarthritis, randomised, controlled, placebo, NSAID, coxib, cox-2 inhibitor.
Trials had to study patients whose knee osteoarthritis had been verified by clinical examination according to the American College of Rheumatology criteria and by x ray. The symptoms had to have been present for more than three months. All trials had to be randomised, blinded, placebo controlled, and of parallel design. Pain intensity had to be scored on the subscale of pain on Western Ontario and McMaster Universities osteoarthritis index (WOMAC)27 or on a 100 mm visual analogue scale for one or the mean score of two or more pain dimensions. Functional disability had to be measured on the WOMAC subscale for function.
The intervention groups had to have received matched placebo drug or adequate NSAID dose (except indomethacin)—that is, daily drug dose equal to or exceeding celecoxib 200 mg, diclofenac 100 mg, etodolac 400 mg, etoricoxib 30 mg, ibuprofen 2400 mg, meloxicam 7.5 mg, nabumetone 1500 mg, naproxen 1000 mg, oxaprozin 1200 mg, rofecoxib 12.5 mg, tiaprofenic acid 600 mg, or valdecoxib 10 mg.
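The dose criterion above amounts to a simple threshold lookup, which might be sketched as follows. The thresholds are copied from the list in the text; the dictionary and function names are hypothetical, and the handling of unlisted drugs (raising an error) is an assumption of this sketch rather than a rule stated in the review.

```python
# Minimum daily doses (mg) counted as adequate, as listed in the text.
# Indomethacin is deliberately absent: the text names it as an exception
# and gives no threshold for it.
ADEQUATE_DAILY_DOSE_MG = {
    "celecoxib": 200,
    "diclofenac": 100,
    "etodolac": 400,
    "etoricoxib": 30,
    "ibuprofen": 2400,
    "meloxicam": 7.5,
    "nabumetone": 1500,
    "naproxen": 1000,
    "oxaprozin": 1200,
    "rofecoxib": 12.5,
    "tiaprofenic acid": 600,
    "valdecoxib": 10,
}


def is_adequate_dose(drug: str, daily_dose_mg: float) -> bool:
    """Return True if the daily dose equals or exceeds the listed threshold."""
    key = drug.strip().lower()
    if key not in ADEQUATE_DAILY_DOSE_MG:
        # Assumption of this sketch: unlisted drugs are rejected, not guessed.
        raise KeyError(f"no adequate-dose threshold listed for {drug!r}")
    return daily_dose_mg >= ADEQUATE_DAILY_DOSE_MG[key]


print(is_adequate_dose("Naproxen", 1000))   # True: equal to the threshold
print(is_adequate_dose("ibuprofen", 1200))  # False: below 2400 mg
```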
Extraction of outcome measure
We used the change in overall pain intensity between the NSAID group and placebo to assess differences. Data were primarily obtained as a mean of the five items on the pain subscale of WOMAC. If WOMAC data were registered on non-continuous scales (categorical, Likert) we converted them to 100 mm visual analogue scales and checked them against other subscales and overall WOMAC score, as this has been found to have good internal consistency.28 If WOMAC data were not available, we used the mean score of knee pain on 100 mm visual analogue scales. If none of the above data were available and more than one type of pain was measured (for instance, pain at rest, pain during walking, etc) we used the mean of these scores.
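The text does not say how categorical WOMAC scores were converted to a 100 mm scale. One common approach is a linear rescaling, sketched below purely as an assumption: the function names are hypothetical, and the default range of 0 to 4 assumes the five point Likert version of WOMAC.

```python
from typing import Sequence


def likert_to_vas_mm(score: float, scale_min: float = 0, scale_max: float = 4) -> float:
    """Linearly rescale a categorical score to 0-100 mm (assumed conversion)."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must exceed scale_min")
    if not scale_min <= score <= scale_max:
        raise ValueError("score outside the scale range")
    return 100.0 * (score - scale_min) / (scale_max - scale_min)


def womac_pain_vas_mm(item_scores: Sequence[float]) -> float:
    """Mean of the five WOMAC pain items after rescaling each to 0-100 mm."""
    if len(item_scores) != 5:
        raise ValueError("the WOMAC pain subscale has five items")
    return sum(likert_to_vas_mm(s) for s in item_scores) / 5


# Made-up item scores for illustration only.
print(womac_pain_vas_mm([2, 3, 2, 1, 2]))  # 50.0
```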
Statistical analysis of pain relief
We included mean differences of change for intervention groups and placebo groups and their respective standard deviations (SD) in a statistical pooling. If variance data were not reported as SDs, we calculated them from the trial data of sample size and other variance data such as P values, t values, SE of mean, or 95% confidence intervals. Results were presented as weighted mean differences between NSAID and placebo with 95% confidence intervals in mm on visual analogue scales—that is, as a pooled estimate of the mean difference in change between the treatment and the placebo groups, weighted by the inverse of the variance for each study.29 We also combined unitless effect sizes—that is, the standardised mean difference in change between NSAIDs and placebo groups for all included trials weighted by the inverse of the variance for each study.19 A statistical software package (Comprehensive Meta-Analysis, ver.1.0.23, Biostat, Englewood, USA) was used for calculations. We computed homogeneity statistics to test the agreement of the individual trial results with the overall meta-analytical summary. If we detected significant heterogeneity (P < 0.1) we calculated random effects estimates.
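The pooling described above can be illustrated with a minimal inverse variance sketch. It is not a reproduction of the software used in this review: the names are hypothetical, the between-trial variance is estimated with the DerSimonian-Laird method (an assumption, as the text does not name the estimator), and the P value for the heterogeneity statistic Q is left to a chi-square table or library.

```python
import math
from typing import NamedTuple, Sequence


class Pooled(NamedTuple):
    estimate: float  # pooled mean difference or effect size
    ci_low: float    # lower 95% confidence limit
    ci_high: float   # upper 95% confidence limit
    q: float         # Cochran's Q heterogeneity statistic
    df: int          # degrees of freedom for Q (number of trials - 1)
    tau2: float      # between-trial variance (DerSimonian-Laird, assumed)


def pool(effects: Sequence[float], variances: Sequence[float],
         random_effects: bool = False) -> Pooled:
    """Pool per-trial estimates weighted by the inverse of their variances."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effects and variances for >= 2 trials")

    # Fixed effect weights and estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-trial variance, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    if random_effects:
        # Random effects weights add tau2 to each trial's variance.
        w = [1.0 / (v + tau2) for v in variances]
        sw = sum(w)
        estimate = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    else:
        estimate = fixed

    se = math.sqrt(1.0 / sw)
    return Pooled(estimate, estimate - 1.96 * se, estimate + 1.96 * se, q, df, tau2)


# Made-up effect sizes and variances for illustration only.
print(pool([0.2, 0.4], [0.01, 0.01], random_effects=True))
```

In this sketch the random effects interval is wider than the fixed effect one whenever tau2 is above zero, which is why the choice of model matters for heterogeneous results.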
Appraisal of trial quality
We assessed the quality of the trials according to a predefined list of criteria.26 To assess the potential for bias we evaluated the method of randomisation, concealment of allocation, blinding of trial investigators and patients, handling of dropouts and withdrawals, and analysis according to intention to treat. In addition, we counted selection criteria for patients and evaluated them for possible bias or dissimilarity to an average population with knee osteoarthritis. We did not predefine cut off limits for method scores.
We evaluated 268 randomised controlled trials of NSAIDs for knee osteoarthritis (fig 1). Of these, 4 did not provide pain scale data, 126 did not have a placebo control group, and 115 presented combined data for osteoarthritis in several sites with no separate data for the knee. This provided a final sample of 23 trials that satisfied the inclusion criteria.30–52 Of these, 16 were sponsored by the pharmaceutical industry,31–34 36 39 40 43 44 46–52 while three others did not state sponsorship but gave an address of a pharmaceutical company as the workplace of most of the authors.37 38 41 The final sample included 10 845 patients, of whom 7807 received NSAIDs and 3038 received placebo (table).
Patients and possible selection bias
The included patients had a median age of 62.5 years, and three trials had an upper age limit of 75 years.35 45 49 There were more women (67.9%) than men, and the median duration of symptoms was 8.2 years. The weighted mean baseline pain of 64.2 mm on a visual analogue scale was calculated from all but three trials.37 51 52 Six reports provided data of body mass index with a median mean value of 31.2.32 35 38 43 44 48 Most trials excluded individuals with concomitant health disorders by a median of 14 exclusion criteria. All trials had a minimum limit for pain intensity or disease activity for inclusion, and they all used a pretreatment washout period of 3-14 days for previous pharmacotherapy. Thirteen trials used an additional criterion by requiring a predefined minimum flare of symptoms when NSAID treatment was discontinued in the pretreatment wash out period.31 33–35 39 40 43 44 46 47 50–52 Five of these trials reported the proportion of regular NSAIDs users, ranging from 66% to 100% (median 90.5%).31 34 35 40 47
The methodological quality was adequate or good (table). All trials were randomised and double blinded, but adequate randomisation procedure, concealed allocation to groups, and blinding procedure were described satisfactorily in only eight studies.33 34 36 39 43 44 46 47 All trials described dropouts and withdrawals well, but one trial did not perform intention to treat analyses.45
Presentation of data on outcome measures
Only four trials presented outcome data as the mean difference of change with SD between NSAIDs and placebo groups.34 42 46 48 Fourteen trials presented the mean difference of change for each group with P values, SE of mean, or 95% confidence intervals but not mean differences between groups.31 33 35–39 41 43 47 49–52 Four trials did not present mean change data but only before and after means and P values.30 32 44 45 Eleven trials presented data on overall pain on 100 mm visual analogue scales or on a categorical five point scale,30–32 37 38 41 42 47 50–52 while the 12 remaining trials presented either categorical or continuous data from the WOMAC subscale for pain. All 23 trials reported data on pain intensity, and 11 trials reported data on functional disability.
Effect size for reduction in functional disability and pain
We excluded from analysis six intervention groups (n = 867) in five trials because patients did not receive an adequate NSAID dose.34–36 39 43 As most trials with multiple time points showed rather stable results from 2-13 weeks, we pooled data. Tests for heterogeneity showed positive results for reduction in functional disability (Q = 39.9, P < 0.01) and reduction in pain (Q = 56.6, P < 0.01). For this reason, we decided to use a random effects model. Eleven trials with 7433 patients provided separate scores for reduction in functional disability,33 35 36 39–41 43 44 46–48 and their combined effect size was 0.29 (95% confidence interval 0.18 to 0.40). One trial reported long term effects on pain but found no significant difference between tiaprofenic acid and placebo at one, two, three, and four years after start of treatment.42 For short term effects (2-13 weeks) the pooled difference in pain for all included trials was 10.1 mm on visual analogue scale (7.4 to 12.8) or 15.6% better than placebo, and the effect size was 0.32 (0.24 to 0.39) (fig 2).
We carried out post hoc subgroup analyses as heterogeneity was evident and trial procedures differed in selection of patients, duration of treatment, and pain scales. For subgroups of trials that used short durations of treatment (< 6 weeks) or WOMAC subscale for pain, neither effect size (0.35 or 0.39) nor heterogeneity (Q = 48.3, P < 0.01, or Q = 43.0, P < 0.01) changed significantly from the results of the total material. For the subgroup of 10 trials (n = 4565) that did not require patients to have a minimum flare of symptoms after treatment with NSAIDs was stopped before the trial, trial results were homogeneous both for function (Q = 2.6, P = 0.275) and pain (Q = 10.0, P = 0.263).30 32 36–38 41 42 45 48 49 Three of these trials (n = 2928) provided data for reduction in functional disability36 41 48 and we calculated effect size by a fixed effects model to be 0.20 (0.09 to 0.30). For pain reduction, we used a fixed effects model to calculate a pooled effect size of 0.23 (0.16 to 0.31) or 5.9 mm on visual analogue scale (3.8 to 7.9) (fig 3).
NSAIDs can reduce short term pain in osteoarthritis of the knee slightly better than placebo, but the current analysis does not support the long term use of these drugs. Several NSAIDs, of which rofecoxib is the most recent example, have been withdrawn because of adverse effects.53 We initially included rofecoxib in our investigation, but did not include it in the final subgroup analysis. As use of oral NSAIDs may incur serious adverse effects, they can only be recommended for limited use in osteoarthritis of the knee. Although NSAIDs have been used for nearly three decades, most trials included in this review were from the past decade. This is mainly due to the inclusion in older studies of patients with multiple joint osteoarthritis and the lack of separate data for osteoarthritis of the knee.
Strengths and weaknesses of analysis
The strengths of the present study include the acceptable methodological quality of the separate trials on which the analysis is based, as well as the considerable number of trials and patients included. We also translated categorical WOMAC data and P values, t test results, standard error of mean values, and before and after values to mean differences in change between groups. In addition, we took steps to avoid bias; for instance, we excluded all groups with less than adequate NSAID doses from the efficacy calculations.
One possible limitation of our study is that we included nine trials in which outcomes were recorded with fewer than the five pain dimensions covered by the WOMAC pain subscale.31 32 37 38 41 42 45 47 49 We thought that this, as well as the different time points in the individual studies (2, 3, 4, 6, 12, and 13 weeks) for registering outcome measures, could explain the heterogeneity that was found in the trial results, but heterogeneity persisted after we performed relevant subgroup analyses. Heterogeneity was not apparent, however, when we performed a subgroup analysis of trials that did not exclude non-responders to NSAIDs. The validity of requiring a certain increase in symptoms after discontinuation of NSAIDs before inclusion in an NSAID trial seems dubious. Indeed, our results show that this procedure significantly inflates effect sizes in favour of the trial drug. In a clinical setting, it may nevertheless be useful to have information about the magnitude of effect to be expected in patients who are known responders to NSAIDs. In comparisons of various treatments, however, the selective exclusion of people who do not respond to NSAIDs among patients given this type of therapy will introduce bias in favour of NSAID efficacy and may hence be inappropriate.
Another possible source of selection bias in patients is that the median age of the participants was low (62.5 years) for people with osteoarthritis of the knee, reflecting the exclusion of patients above 75 years of age in some trials.35 45 49 Data on efficacy and tolerability as a function of age were reported in only one comparatively small trial.45
Benefits of NSAIDs
NSAIDs significantly reduce pain in acute conditions.54 55 In chronic conditions, however, patients have reported that pain has to be reduced by about 30% to be considered meaningful.56 For knee osteoarthritis in particular, an effect size of 0.4 or 17-22% change from baseline has been calculated from empirical data and suggested as the minimal clinically important change.57 Other authors have found that the least perceptible change in pain from osteoarthritis of the knee is 9.7 mm measured by the WOMAC subscale for pain.58 In accordance with this, the effect size of 0.32 or 10.1 mm on visual analogue scale for pain reduction and the effect size of 0.29 for disability reduction may be considered too small to be clinically significant. This may in turn explain non-compliance with prescribed drug therapy in 29% of patients and the use of non-conventional drug therapy by one in four patients with osteoarthritis.11
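The comparison made in this paragraph can be laid out as a small calculation. The sketch below uses only figures quoted in the text (pooled difference 10.1 mm, weighted mean baseline 64.2 mm from the abstract, effect size 0.32, and the three published thresholds); the names are hypothetical. The ratio of the rounded figures comes out close to, though not exactly at, the 15.6% quoted in the results, presumably because of rounding.

```python
# Thresholds quoted in the text from other authors.
MEANINGFUL_RELATIVE_REDUCTION = 0.30  # about 30% reduction in chronic pain
MIN_IMPORTANT_EFFECT_SIZE = 0.4       # minimal clinically important change
LEAST_PERCEPTIBLE_CHANGE_MM = 9.7     # least perceptible change, WOMAC pain


def summarise(diff_mm: float, baseline_mm: float, effect_size: float) -> dict:
    """Compare a pooled result with the published relevance thresholds."""
    relative = diff_mm / baseline_mm
    return {
        "relative_reduction": relative,
        "exceeds_least_perceptible_change": diff_mm > LEAST_PERCEPTIBLE_CHANGE_MM,
        "reaches_30_percent_reduction": relative >= MEANINGFUL_RELATIVE_REDUCTION,
        "reaches_effect_size_0_4": effect_size >= MIN_IMPORTANT_EFFECT_SIZE,
    }


# Pooled short term result for all included trials, as reported in the text.
result = summarise(diff_mm=10.1, baseline_mm=64.2, effect_size=0.32)
for name, value in result.items():
    print(name, value)
```

On these figures the pooled difference only just exceeds the least perceptible change and falls short of both the 30% reduction and the effect size of 0.4.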
The widespread and long term use of NSAIDs among elderly people with osteoarthritis is associated with considerable side effects. NSAIDs cause serious gastrointestinal complications such as bleeding or perforation in one in 50-100 patient years, and this risk increases with age, concurrent use of other medications, and probably also duration of treatment.12 Substantial epidemiological and experimental data show that NSAIDs may increase blood pressure,59 and NSAID use has been linked to the development and acceleration of congestive heart failure.60 Elderly patients also have an increased risk for development of associated renal failure.61 In addition, NSAID users are at risk of interactions, including pharmacodynamic interactions with anti-hypertensive drugs59 and pharmacokinetic interactions with compounds eliminated by renal excretion, such as lithium.62 These important caveats were not considered in the short term studies of NSAIDs that we included. Thus, it may be reasonable to assume that the benefits of NSAIDs may be less and the harmful effects more common in an unselected population of patients with knee osteoarthritis compared with the patients in these studies.
What is already known on this topic
Current guidelines recommend the use of oral non-steroidal anti-inflammatory drugs (NSAIDs) in the treatment of osteoarthritis of the knee
Oral NSAIDs are used regularly by half of all patients with painful osteoarthritis
What this study adds
The advantage of oral NSAIDs over placebo for short term pain relief is small and probably clinically insignificant
Evidence of long term effects from oral NSAIDs is still lacking
Contributors JMB had the original idea and designed the review together with LS. JMB and AK performed the literature search. AEL and JMB assessed methodological trial quality. JMB, LS, and AK analysed the statistical data. AEL, JMB, and LS wrote the report, while AK contributed to layout and proofreading. JMB is the guarantor.
Competing interests None declared.
Ethical approval Not required.