Open access (CC BY-NC)
Research

Effectiveness and safety of non-steroidal anti-inflammatory drugs and opioid treatment for knee and hip osteoarthritis: network meta-analysis

BMJ 2021; 375 doi: https://doi.org/10.1136/bmj.n2321 (Published 12 October 2021) Cite this as: BMJ 2021;375:n2321
  1. Bruno R da Costa, associate director, associate professor, adjunct professor [1, 2, 3],
  2. Tiago V Pereira, research scientist [1, 4],
  3. Pakeezah Saadat, doctoral student [1, 2],
  4. Martina Rudnicki, research fellow [1, 5],
  5. Samir M Iskander, medical student [1, 6],
  6. Nicolas S Bodmer, doctoral student, research associate [1, 7],
  7. Pavlos Bobos, postdoctoral fellow [1, 2, 8],
  8. Li Gao, postdoctoral researcher [1, 9],
  9. Henry Dan Kiyomoto, professor [10],
  10. Thais Montezuma, researcher [11],
  11. Matheus O Almeida, researcher, professor [11, 12],
  12. Pai-Shan Cheng, doctoral student [1, 13],
  13. Cesar A Hincapié, clinician scientist [14, 15],
  14. Roman Hari, research fellow, senior lecturer [1, 3],
  15. Alex J Sutton, professor of medical statistics [4],
  16. Peter Tugwell, professor [16, 17],
  17. Gillian A Hawker, professor [18],
  18. Peter Jüni, director, professor [1, 2, 18]
  1. Applied Health Research Centre, Li Ka Shing Knowledge Institute of St Michael's Hospital, Toronto, ON, Canada
  2. Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
  3. Institute of Primary Health Care (BIHAM), University of Bern, Switzerland
  4. Department of Health Sciences, University of Leicester, Leicester, UK
  5. Institute of Ophthalmology, University College London, London, UK
  6. Schulich School of Medicine, University of Western Ontario, London, ON, Canada
  7. Department of Medicine, University of Zurich, Zurich, Switzerland
  8. Western’s Bone and Joint Institute, Western University, London, ON, Canada
  9. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, China
  10. Department of Physiotherapy, Faculty of the Americas, São Paulo, Brazil
  11. Health Technology Assessment Unit, Oswaldo Cruz German Hospital, São Paulo, Brazil
  12. Master Program in Physical Therapy, Universidade Ibirapuera, São Paulo, Brazil
  13. Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  14. Department of Chiropractic Medicine, Faculty of Medicine, University of Zurich and Balgrist University Hospital, Zurich, Switzerland
  15. Epidemiology, Biostatistics and Prevention Institute (EBPI), University of Zurich, Zurich, Switzerland
  16. Department of Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
  17. Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
  18. Department of Medicine, University of Toronto, Toronto, ON, Canada
  • Correspondence to: B R da Costa, bruno.dacosta{at}utoronto.ca
  • Accepted 13 September 2021

Abstract

Objective To assess the effectiveness and safety of different preparations and doses of non-steroidal anti-inflammatory drugs (NSAIDs), opioids, and paracetamol for knee and hip osteoarthritis pain and physical function to enable effective and safe use of these drugs at their lowest possible dose.

Design Systematic review and network meta-analysis of randomised trials.

Data sources Cochrane Central Register of Controlled Trials (CENTRAL), Medline, Embase, regulatory agency websites, and ClinicalTrials.gov from inception to 28 June 2021.

Eligibility criteria for selecting studies Randomised trials published in English with ≥100 patients per group that evaluated NSAIDs, opioids, or paracetamol (acetaminophen) to treat osteoarthritis.

Outcomes and measures The prespecified primary outcome was pain. Physical function and safety outcomes were also assessed.

Review methods Two reviewers independently extracted outcomes data and evaluated the risk of bias of included trials. Bayesian random effects models were used for all network meta-analyses. Effect estimates are comparisons between active treatments and oral placebo.

Results 192 trials comprising 102 829 participants examined 90 different active preparations or doses (68 for NSAIDs, 19 for opioids, and three for paracetamol). Five oral preparations (diclofenac 150 mg/day, etoricoxib 60 and 90 mg/day, and rofecoxib 25 and 50 mg/day) had ≥99% probability of more pronounced treatment effects than the minimal clinically relevant reduction in pain. Topical diclofenac (70-81 and 140-160 mg/day) had ≥92.3% probability, and all opioids had ≤53% probability of more pronounced treatment effects than the minimal clinically relevant reduction in pain. An increased risk of dropouts due to adverse events was found for 18.5% of oral NSAIDs, 0% of topical NSAIDs, and 83.3% of opioids. An increased risk of any adverse event was found for 29.8% of oral NSAIDs, 0% of topical NSAIDs, and 89.5% of opioids. Oxymorphone 80 mg/day had the highest risk of dropouts due to adverse events (51%) and any adverse event (88%).

Conclusions Etoricoxib 60 mg/day and diclofenac 150 mg/day seem to be the most effective oral NSAIDs for pain and function in patients with osteoarthritis. However, these treatments are probably not appropriate for patients with comorbidities or for long term use because of the slight increase in the risk of adverse events. Additionally, an increased risk of dropping out due to adverse events was found for diclofenac 150 mg/day. Topical diclofenac 70-81 mg/day seems to be effective and generally safer because of reduced systemic exposure and lower dose, and should be considered as first line pharmacological treatment for knee osteoarthritis. The clinical benefit of opioid treatment, regardless of preparation or dose, does not outweigh the harm it might cause in patients with osteoarthritis.

Systematic review registration PROSPERO number CRD42020213656

Introduction

Osteoarthritis is a clinical syndrome that most commonly affects knee and hip joints in older people.1 Osteoarthritis is a painful condition and results in reduced physical function and quality of life, and increased risk of all cause mortality.234 Topical or oral non-steroidal anti-inflammatory drugs (NSAIDs) followed by paracetamol (acetaminophen) or opioids comprise first line pharmacotherapy.567 In the United States, 65% of patients with osteoarthritis are prescribed NSAIDs and 71% opioids for pain management, and opioid prescriptions for musculoskeletal pain increased by 70% between 2001 and 2010.8 In the UK, 84% of all patients diagnosed with osteoarthritis between 2000 and 2015 were prescribed opioids, with increasing numbers of prescriptions and prescribed doses in more recent years.9

Clinicians are faced with a myriad of options when prescribing pharmacotherapy, posing a challenge to clinical decision making. Evidence suggests that improvements in pain and physical function could be similar for opioids and NSAIDs, but opioids cause considerably more adverse events.1011 Seven of the 10 recommendations made in a recent guideline for opioid treatment in chronic non-cancer pain are focused on harm reduction12 because of a substantial risk associated with opioid use.13 In addition to the immediate reactions after opioid use, such as nausea, vomiting, and drowsiness,14 chronic use is associated with increased risk of fractures, cardiovascular events, opioid dependence, and mortality.15 Globally, opioid use disorder increased 23% between 2005 and 2015.16 Over half of all global overdose deaths in 2019 occurred in the US, with liberal prescribing of high dose opioids one of the main contributors.17 Between 2000 and 2017, opioid related mortality increased by 593% in Canada.18 Despite this evidence, and international concerns about the devastating potential for chemical dependency,719 opioids remain among the most prescribed drugs for osteoarthritis pain in the UK, the US, Canada, and Australia, even though safer treatments with stronger analgesic effects are available.920212223

Previous systematic reviews have reported the effectiveness of NSAIDs and opioids to treat osteoarthritis pain.2425262728 However, these reviews clustered drug doses or drug classes in their analyses. This clustering does not provide evidence that is granular enough to implement current recommendations that physicians should prescribe the lowest effective dose of these interventions.67 To present granular evidence and enable safer prescription of these interventions, we assessed the effectiveness and safety of different preparations and doses of NSAIDs, opioids, and paracetamol for knee and hip osteoarthritis pain and physical function. We integrated all available high quality evidence from randomised trials in a network meta-analysis.

Methods

We followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines to prepare this article (see web-appendix 1 for protocol; PROSPERO registration CRD42020213656).

Eligibility criteria

We considered large randomised trials of patients with knee or hip osteoarthritis that compared any of the following interventions for pharmacological treatment of osteoarthritis pain: NSAIDs, opioids, paracetamol (acetaminophen), or placebo. Trials that included patients with other types of arthritis or joints other than knee or hip were only included if ≥75% of the patients had confirmed knee or hip osteoarthritis. Additionally, trials were required to have at least one follow-up measurement of pain or another algofunctional outcome. To reduce small study bias, we only included trials with an average of ≥100 participants randomised per arm.29 No publication status or year restriction was applied, but we limited the publication language to English.

Identification of trials

To identify eligible trials, we conducted searches on the Cochrane Central Register of Controlled Trials (CENTRAL), from inception to 30 June 2021, and Medline and Embase from inception to 28 June 2021 (web-appendix 2). We also manually searched the reference lists of retrieved articles and systematic reviews (web-appendix 3), and searched ClinicalTrials.gov. When data were incomplete, we searched for additional data on ClinicalTrials.gov, WHO approved trial registries, company specific trial registries, and documents available on the website of the US Food and Drug Administration (FDA).

Selection of studies and data extraction

Each trial was independently evaluated by two reviewers (BRdC, CAH, HDK, LG, MOA, MR, NSB, PB, PSC, PS, RH, SMI, TM, and TVP) for screening and data extraction. When disagreements occurred, a consensus was reached through discussion among reviewers or by consultation with a senior scholar. We screened trials for eligibility, extracted data, and developed consensus by using a standardised and piloted web based data management tool, accompanied by a codebook. We extracted trial characteristics, such as design, size and duration; intervention characteristics, such as dose and treatment duration; participant characteristics, such as mean age, sex, mean duration of symptoms, index joint; type of outcome (pain or function); and outcome data for each time point of interest. If necessary, we approximated summary statistics and measures of variability from graphs. When possible, results based on the intention-to-treat principle were extracted.

Outcomes

Our prespecified primary outcome was pain. If a trial presented pain outcomes on more than one scale, the following hierarchical list was used to extract data from the scale highest on the list3031: (1) global osteoarthritis pain assessed using visual analogue or numerical rating scales; (2) pain on walking; (3) Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain subscore; (4) composite pain scores other than WOMAC; (5) pain on activities other than walking (such as stair climbing); (6) WOMAC global score; (7) Lequesne osteoarthritis index score; (8) other algofunctional composite scores; (9) patient’s global assessment; (10) physician’s global assessment. Data were extracted for several time points when available: 1 week (±2 days), 2 weeks (±2 days), 4 weeks (3-4.5 weeks), 6 weeks (±1 week), 12 weeks (±4 weeks), 24 weeks (±4 weeks), 48 weeks (±4 weeks), and at the end of treatment if not covered by the specific time points. The extraction of several time points allowed us to conduct advanced modelling to achieve more statistical precision. However, we only present results at six weeks (primary analysis), and one and 12 weeks (sensitivity analyses).
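The extraction rules above can be illustrated with a minimal Python sketch. The hierarchy labels and time windows are taken from the text; the function names, the string labels used to identify scales, and the choice to express windows in weeks are assumptions made for illustration and do not describe the authors' data management tool.

```python
# Pain scale hierarchy as listed in the text (highest priority first).
PAIN_HIERARCHY = [
    "global osteoarthritis pain (VAS or NRS)",
    "pain on walking",
    "WOMAC pain subscore",
    "composite pain score other than WOMAC",
    "pain on activities other than walking",
    "WOMAC global score",
    "Lequesne osteoarthritis index score",
    "other algofunctional composite score",
    "patient's global assessment",
    "physician's global assessment",
]

# Nominal time point (weeks) -> accepted window (weeks), per the text.
TIME_WINDOWS = {
    1: (1 - 2 / 7, 1 + 2 / 7),   # 1 week +/- 2 days
    2: (2 - 2 / 7, 2 + 2 / 7),   # 2 weeks +/- 2 days
    4: (3, 4.5),                 # 3 to 4.5 weeks
    6: (5, 7),                   # 6 weeks +/- 1 week
    12: (8, 16),                 # 12 weeks +/- 4 weeks
    24: (20, 28),                # 24 weeks +/- 4 weeks
    48: (44, 52),                # 48 weeks +/- 4 weeks
}


def select_scale(reported, hierarchy=PAIN_HIERARCHY):
    """Return the reported scale that ranks highest in the hierarchy."""
    for scale in hierarchy:
        if scale in reported:
            return scale
    return None  # no eligible scale reported


def assign_time_point(week):
    """Map a follow-up time in weeks to a nominal time point, if any."""
    for nominal, (low, high) in TIME_WINDOWS.items():
        if low <= week <= high:
            return nominal
    return None  # would fall under "end of treatment" handling
```

For example, a trial that reports both a WOMAC pain subscore and pain on walking at 13 weeks would contribute pain on walking at the 12 week time point.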

Our secondary outcome was physical function. If a trial presented function outcomes on more than one scale, the following hierarchical list was used to extract data from the scale highest on the list30: (1) global osteoarthritis function score; (2) walking disability; (3) WOMAC physical function subscore; (4) composite physical function scores other than WOMAC; (5) physical function on activities other than walking (such as stair climbing); (6) WOMAC global score; (7) Lequesne osteoarthritis index score; (8) other algofunctional composite scores; (9) patient’s global assessment; (10) physician’s global assessment. We extracted and analysed data on this outcome for the same time points as those described for pain.

Our main safety outcome was dropouts or withdrawals due to adverse events. Other safety outcomes were any adverse event and serious adverse events. We extracted serious adverse events as defined by investigators. This outcome was typically defined as those resulting in hospital admission, prolongation of hospital stay, persistent or major disability, congenital abnormality of offspring, life threatening events, or death.32

Risk of bias assessment

We judged trial risk of bias for seven domains: random sequence generation, allocation concealment, blinding of patients, blinding of the therapist, blinding of outcome assessor for pain, blinding of outcome assessor for function, and completeness of outcome data (web-appendix 4).333435

Statistical methods

For the analysis of pain and physical function outcomes, we used an extension of multivariable Bayesian random effects models for network meta-analysis (web-appendix 5).333637 These models fully preserve the direct randomised comparison within each trial, but allow the comparison of all available interventions across trials, and account for multiple comparisons in multiarm trials.3839 The model includes random effects at the level of trials, and uses a random walk to account for the correlation of longitudinal outcome data within trials reporting results for more than one time point, borrowing strength across time points for an estimate. The model assumes that, within a trial with longitudinal outcome data, the data recorded at a specified time point are more similar to the outcome data recorded at adjacent time points immediately before and after than at non-adjacent, more remote time points.40 Pain and physical function treatment effects are presented as standardised mean differences. We present the results for six week follow-up for all analyses, and a sensitivity analysis was conducted for one and 12 week follow-up.

We also adjusted the results of the primary outcome for trial characteristics (concealment of allocation, blinding of patient, therapist, or outcome assessor, completeness of outcome data) by incorporating a regression coefficient in the model. We estimated two sided P values for interaction between treatment effects and trial characteristics from the posterior distribution. We assessed potential dose-response relations by introducing preparation specific covariates and assuming linearity on log relative dose in a separate model.33 We used different assumptions about the prior distribution of between trial variance to assess the robustness of our analysis (web-appendix 1). To analyse safety outcomes, we calculated odds ratios by using a random effects network meta-analysis Bayesian model with binomial likelihood and logit link.41
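For orientation, the binomial likelihood and logit link model commonly used for safety outcomes in Bayesian network meta-analysis can be written as below. This is the generic textbook formulation, given here as an assumption; the exact specification fitted in this review is described in the web appendices. The symbols are notation chosen for this sketch: r and n are the number of events and participants in arm k of trial i, b is the baseline arm of trial i, and treatment 1 is the network reference (oral placebo).

```latex
\begin{align}
r_{ik} &\sim \mathrm{Binomial}(p_{ik},\, n_{ik}) \\
\operatorname{logit}(p_{ik}) &= \mu_i + \delta_{i,bk}\,\mathbb{1}[k \neq b] \\
\delta_{i,bk} &\sim \mathcal{N}\!\left(d_{bk},\, \tau^2\right), \qquad d_{bk} = d_{1k} - d_{1b} \\
\mathrm{OR}_{k \text{ vs } 1} &= \exp(d_{1k})
\end{align}
```

Here μ_i is the trial specific baseline log odds, τ² is the between trial variance, and the consistency relation d_bk = d_1k − d_1b is what allows indirect comparisons across trials. The adjustment for correlation in multiarm trials is omitted for brevity.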

For all variables, minimally informative prior distributions were used (web-appendix 1).33 All estimates reported are medians with corresponding 95% credible intervals from the 2.5th and 97.5th percentiles of the posterior distribution, unless stated otherwise. We calculated effect sizes within trials by dividing the difference in mean values between treatment groups at a specific time point by the median pooled standard deviation recorded across all time points.42 If standard deviations were not provided, we calculated them from standard errors or confidence intervals,4344 or imputed them using a regression based model if required (web-appendix 6). We assessed the goodness of fit of the model to the data by calculating the number of means of standardised residuals that were within 1.96 of the standard normal distribution at the level of interventions (node level); visually inspecting the distribution of residuals on Q-Q plots; comparing the posterior mean residual deviance with the number of data points; calculating the heterogeneity of treatment effects estimated from the posterior median between trial variance45 τ²; and assessing the consistency of the network (determined by the difference in effect sizes derived from direct and indirect comparisons).46 Consistency was assessed by using a stepwise approach.47 We first compared the model fit of consistency and inconsistency models using the deviance information criterion for an omnibus assessment of consistency.47 If the inconsistency model had a better deviance information criterion than the consistency model, we would then use node splitting to identify inconsistent loops within the network.47 Model convergence was assessed with the Brooks-Gelman-Rubin diagnostic, trace plots, and autocorrelation plots.
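The within trial effect size calculation described above can be sketched in Python as follows. The function names are hypothetical, the conversion from a confidence interval assumes a 95% interval under a normal approximation (z = 1.96), and the pooled standard deviation uses the usual two group formula; none of these details is confirmed by the text beyond the general description.

```python
import math
from statistics import median


def sd_from_se(se, n):
    """Standard deviation of a group mean's underlying data from its standard error."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """Standard deviation from a confidence interval around a group mean.

    Assumes a symmetric interval based on the normal distribution.
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def effect_size(mean_treatment, mean_control, pooled_sds_all_time_points):
    """Difference in means at one time point divided by the median pooled
    standard deviation across all time points of the trial."""
    return (mean_treatment - mean_control) / median(pooled_sds_all_time_points)
```

With pain scored so that lower values are better, a negative effect size favours the active treatment, which matches the sign convention used for the MID of −0.37.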

We estimated the probability for the effect of the experimental intervention to reach the between group minimum clinically important difference (MID) of −0.37 standard deviation units to facilitate the interpretation of estimated treatment effects.35 This threshold of 0.37 standard deviation units is based on the median between group anchor based MID reported in studies in patients with osteoarthritis.35 An effect size of 0.37 corresponds to a between group difference of 9 mm on a 100 mm visual analogue scale. We conducted a random effects meta-analysis to calculate the pooled probability of dropping out due to an adverse event in the oral placebo group, which was 5%. The logit of this probability and the log odds ratios of comparisons to oral placebo were used to derive the probability of dropping out due to an adverse event for all other interventions. Analyses were done with Stata 15.1 (StataCorp, TX, USA), OpenBUGS (OpenBUGS Project Management Group, version 3.2.3), and R version 3.6.1.48
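Two of the conversions in this paragraph are simple enough to sketch in Python. The first derives an absolute probability of dropping out from an odds ratio and the pooled 5% risk in the oral placebo group, on the logit scale as described. The second converts an effect size to millimetres on a 100 mm visual analogue scale; the standard deviation of 25 mm is an assumption inferred from the statement that 0.37 standard deviation units correspond to 9 mm, and is not stated in the text. Function names are hypothetical.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1 - p))


def expit(x):
    """Inverse of the logit function."""
    return 1 / (1 + math.exp(-x))


def absolute_risk(odds_ratio, baseline_risk=0.05):
    """Probability of the event under treatment, given the odds ratio
    versus oral placebo and the baseline risk in the placebo group."""
    return expit(logit(baseline_risk) + math.log(odds_ratio))


def smd_to_vas_mm(smd, sd_mm=25.0):
    """Effect size in SD units expressed on a 100 mm visual analogue scale.

    sd_mm=25.0 is an assumed typical standard deviation (see lead-in).
    """
    return smd * sd_mm
```

Under these assumptions an odds ratio of 1 returns the baseline risk of 5%, and an odds ratio of 19 corresponds to a 50% probability of dropping out.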

Patient and public involvement

Patients and the public were not involved in the design, conduct, reporting, or dissemination plans of this systematic review and network meta-analysis due to lack of resources to allow their participation.

Results

Search results and characteristics of included trials

We screened 21 713 references, of which 153 were found to be eligible (fig 1). Twenty eight additional trials were identified from the reference lists of relevant papers and in ClinicalTrials.gov. In total, we included data from 181 publications describing 192 trials. These trials involved 102 829 participants with a median sample size of 466 participants (interquartile range 309-718). Considering different drug preparations and doses, a total of 90 active intervention nodes (hereafter referred to as interventions) were examined for at least one of the outcomes of interest: 68 NSAIDs, 19 opioids, and three paracetamol (table 1, table 2, table 3, and web-appendix 7 and 8). Additionally, three control interventions were included in the network: oral, topical, and oral plus topical placebo. Celecoxib 200 mg/day was the most frequently investigated intervention (44 trials). Only safety data were available for nine interventions (web-appendix 7). Web-appendix 9 shows the network of interventions included in the pain outcome analysis.

Table 1

Characteristics of randomised trials included in systematic review and network meta-analysis

Table 2

Non-steroidal anti-inflammatory drug effect estimates for pain and dropouts due to adverse events compared with oral placebo

Table 3

Opioids, paracetamol, and topical placebo effect estimates for pain and dropouts due to adverse events compared with oral placebo

The mean age of participants in included trials ranged from 48 to 72 years, the percentage of female participants ranged from 13% to 91%, the median average time since osteoarthritis diagnosis was 6.6 years (interquartile range 5.3-8.6), the median average baseline pain on a 10 cm scale was 6.5 (interquartile range 5.7-7.0), and the median total follow-up was 8.6 weeks (interquartile range 6-12 weeks; table 1 and web-appendix 10). Web-appendix 11 shows the characteristics of each individual trial: 91% of trials had a low risk of bias for blinding of patients, 83% for blinding of therapists, 91% for blinding of pain outcome assessor, 93% for blinding of function outcome assessor, 25% for incomplete pain outcome data, 26% for incomplete function outcome data, and 15% for allocation concealment (web-appendix 12). Eighty per cent of trials received financial funding from a commercial body and the source of funding was unclear in 19%.

Pain

Table 2, table 3, and web-appendix 7 present pooled effect estimates with 95% credible intervals for comparisons to oral placebo for all outcomes. Figure 2 and figure 3 display pooled effect estimates for pain, ordered by the magnitude of treatment effects: 38 of 51 (74.5%) oral NSAIDs, 3 of 9 (33.3%) topical NSAIDs, 4 of 18 (22.2%) opioids, and 1 of 3 (33.3%) paracetamol interventions showed statistical superiority over oral placebo. For seven comparisons, all of which were NSAIDs (aceclofenac 200 mg/day, diclofenac 150 mg/day, etoricoxib 60 and 90 mg/day, rofecoxib 25 and 50 mg/day, and diclofenac topical 140-160 mg/day), we found robust statistical evidence for treatment effects that were more pronounced than the MID (the probability that the effect size compared with oral placebo is −0.37 or lower was ≥95%). Of these comparisons, five (diclofenac 150 mg/day, etoricoxib 60 and 90 mg/day, and rofecoxib 25 and 50 mg/day) had ≥99% probability.
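The probability statements in this paragraph are summaries of posterior draws. A minimal Python sketch of how a median, a 95% credible interval, and the probability of an effect at least as large as the MID can be computed from a set of draws is given below. The function name and the returned keys are hypothetical, and the draws would come from the fitted model; no actual posterior from this review is reproduced here.

```python
from statistics import median, quantiles

MID = -0.37  # between group MID in standard deviation units, from the text


def summarise_posterior(draws, threshold=MID):
    """Summarise posterior draws of an effect size versus oral placebo."""
    # n=40 gives cut points every 2.5%; the first and last are the
    # 2.5th and 97.5th percentiles.
    cuts = quantiles(draws, n=40, method="inclusive")
    return {
        "median": median(draws),
        "cri_lower": cuts[0],
        "cri_upper": cuts[-1],
        # Share of draws at or beyond the MID (more negative is better).
        "prob_beyond_mid": sum(d <= threshold for d in draws) / len(draws),
    }
```

An intervention meets the "robust statistical evidence" criterion used above when `prob_beyond_mid` is at least 0.95.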

Fig 2

Treatment effect on osteoarthritis pain and dropouts due to adverse events compared with oral placebo, ordered according to treatment effect size on osteoarthritis pain. Blue: oral non-steroidal anti-inflammatory drugs; green: topical non-steroidal anti-inflammatory drugs; orange: opioids. Area between dashed lines shows treatment effect estimates below the minimum clinically important difference. See web-appendix 17 for caterpillar plot ordered according to odds ratio of dropouts due to adverse events. See table 2, table 3, and web-appendix 7 for specific estimates with 95% CrI and additional outcome. *Maximum daily recommended dose. CrI=credible interval

Fig 3

Continuation of figure 2. Treatment effect on osteoarthritis pain and dropouts due to adverse events compared with oral placebo, ordered according to treatment effect size on osteoarthritis pain. Blue: oral non-steroidal anti-inflammatory drugs; green: topical non-steroidal anti-inflammatory drugs; orange: opioids; pink: paracetamol; black: placebo. Area between dashed lines shows treatment effect estimates below the minimum clinically important difference. See web-appendix 17 for caterpillar plot ordered according to odds ratio of dropouts due to adverse events. See table 2, table 3, and web-appendix 7 for specific estimates with 95% CrI and additional outcome. *Maximum daily recommended dose. CrI=credible interval

Topical diclofenac was the most promising topical treatment with ≥92.3% probability of more pronounced treatment effects than the MID, regardless of dose. Although some evidence was found of small treatment effects for opioids, all had ≤53.0% probability of treatment effects being more pronounced than the MID. Estimates in table 2 and table 3 indicate that, for some interventions, treatment effect increased with increasing treatment dose. However, 95% credible intervals generally overlapped widely across doses of the same preparation, and only celecoxib, etoricoxib, naproxcinod, and tramadol showed a significant dose dependency (P≤0.04; web-appendix 13). Of these, neither celecoxib nor tramadol had a high probability of a treatment effect more pronounced than the MID (≤24.7% and ≤18.1%, respectively). Results shown in web-appendix 14 indicate that treatment effects are generally similar from one week to 12 weeks of follow-up. No strong evidence was found that risk of bias indicators influenced trial results in a systematic way (all P for interaction values ≥0.13; web-appendix 15). Results were much the same regardless of the prior distribution assumed for the between trial variance (web-appendix 16).

Physical function

For physical function, pooled estimates indicate that all interventions improved function compared with oral placebo except for two: nabumetone 1000 mg/day and paracetamol <2000 mg/day (web-appendix 7). Thirty of 39 (76.9%) oral NSAIDs, 3 of 9 (33.3%) topical NSAIDs, 4 of 13 (30.8%) opioids, and 1 of 3 (33.3%) paracetamol interventions showed statistical superiority over oral placebo. For two comparisons with oral placebo (rofecoxib 25 mg/day and naproxcinod 1500 mg/day), the upper bound of the 95% credible interval excluded treatment effects that were above the MID of −0.37.

Safety

Web-appendix 7 presents the results from the analysis of dropouts due to adverse events, any adverse event, and serious adverse events, and table 2 and table 3 show the results for dropouts due to adverse events. Web-appendix 17 displays pooled effect estimates for dropouts due to adverse events, ordered by the magnitude of harmful effects. Evidence was found, with the 95% credible intervals of odds ratios excluding the null effect, of an increased number of participants who dropped out of the trial due to an adverse event compared with placebo for 10 of 54 (18.5%) oral NSAIDs, 0 of 8 (0%) topical NSAIDs, 15 of 18 (83.3%) opioids, and 1 of 3 (33.3%) paracetamol interventions. There was strong evidence, with the lower bound of the 95% credible interval of odds ratios above 2, of an increased number of participants who dropped out due to an adverse event compared with placebo only for opioid interventions (9 of 18; 50%). Evidence was found of an increased risk of any adverse event for 14 of 47 (29.8%) oral NSAIDs, 0 of 9 (0%) topical NSAIDs, 17 of 19 (89.5%) opioids, and paracetamol 3900-4000 mg/day. Only oxycodone ≥48 mg had evidence of an increased risk of serious adverse events (odds ratio 2.40, 95% credible interval 1.09 to 5.60). Oxymorphone 80 mg/day had the highest risk of dropouts due to adverse events (51%) and any adverse event (88%).

Figure 4 plots the probability of interventions having a treatment effect on osteoarthritis pain beyond the MID against the probability of participants dropping out due to an adverse event. The plot shows that some oral and topical NSAID interventions had >90% probability of having a treatment effect beyond the MID while still having a probability of participants dropping out similar to that observed with oral placebo (5%). The plot also shows that participants who received opioids tended to have a much lower probability of experiencing a clinically relevant reduction in their pain compared with oral placebo, but a much higher risk of interrupting treatment due to an adverse event.

Fig 4

Two-dimensional graph showing probability of drugs having minimum clinically important difference compared with oral placebo and probability of participants interrupting treatment due to adverse event. Probability of participants on oral placebo dropping out due to adverse event is 5%. MID=between group minimum clinically important difference; NSAID=non-steroidal anti-inflammatory drug

Model fit, heterogeneity, and inconsistency assessment

The model fit was good for all outcomes (web-appendix 18 and 19). τ² estimates suggest low statistical heterogeneity for all outcomes, except for serious adverse events, which had small to moderate statistical heterogeneity49: pain (0.010, 95% credible interval 0.007 to 0.015), physical function (0.010, 0.006 to 0.015), dropouts due to adverse events (0.046, 0.006 to 0.097), any adverse event (0.026, 0.010 to 0.049), and serious adverse events (0.118, 0.001 to 0.498). For all outcomes, the deviance information criterion was better in the consistency model than in the inconsistency model (web-appendix 20).

Discussion

Main findings

This network meta-analysis, including 192 trials and 102 829 participants, compared the effectiveness and safety of 90 active treatment regimens of NSAIDs, opioids, and paracetamol with oral placebo. Diclofenac 150 mg/day and etoricoxib 60 mg/day appear to be the most effective interventions to improve pain in patients with knee or hip osteoarthritis. Diclofenac 150 mg/day had an effect size of −0.56 and etoricoxib 60 mg/day had an effect size of −0.65, corresponding to 14 mm and 16 mm differences on a 100 mm visual analogue scale, respectively; these figures are 1.5 and 1.8 times the between group MID for chronic pain of an effect size of −0.37. There was 99.9% probability that these interventions have treatment effects that are more pronounced than the between group MID. While these two interventions showed similar increased risks of any adverse event compared with placebo (odds ratio 1.27 and 1.56, respectively), patients receiving etoricoxib 60 mg/day seemed to have a lower risk of stopping the treatment due to an adverse event. Etoricoxib 60 mg/day also seemed to result in a lower risk of patients experiencing a serious adverse event than diclofenac 150 mg/day, but effect estimates were imprecise.

Among topical treatments, diclofenac, regardless of dose, had the largest effect on pain and physical function. The lowest dose of topical diclofenac (70-81 mg/day) had a 92% probability of having a minimum clinically relevant improvement on pain, with a better safety profile than oral diclofenac. Among the NSAID preparations most commonly prescribed in the US, meloxicam and diclofenac were more effective and had similar safety outcomes compared with ibuprofen and naproxen at their respective maximum recommended daily doses.50 While none of the opioid interventions, regardless of dose, seemed to have a clinically relevant effect on pain or physical function, their safety profiles were in general significantly worse than those of the other interventions in our analysis, with higher doses of opioids leading to a higher risk of harm. Tramadol, regardless of dose, had a small treatment effect on pain and physical function (effect sizes ≥−0.31; probabilities to reach the MID ≤18.1%). Among non-tramadol opioids, tapentadol seemed to be the most effective in treating pain (effect size −0.34; probability to reach the MID 33.9%). Paracetamol 3900-4000 mg/day had the smallest effect on osteoarthritis pain, with an effect size of −0.15, corresponding to a 4 mm difference on a 100 mm visual analogue scale, similar to previously reported results.33

Strengths and limitations

This was a large network meta-analysis on pharmacological treatments for knee and hip osteoarthritis. Even after restricting our inclusion criteria to large trials only, our sensitive search strategy and careful search of the grey literature resulted in a 2.5-fold to fivefold increase in the number of trials and patients included compared with previous reviews specific to NSAIDs, opioids, or paracetamol (web-appendix 3). The large number of trials and patients included allowed us to generate granular evidence of effectiveness and safety with enough statistical precision for several different types of interventions and doses. By excluding trials with small sample sizes, we minimised the risk of bias caused by small study effects.29 Web-appendix 12 shows that trials included in our analyses generally had a lower risk of bias, which is reassuring. Additionally, conclusions based on our analyses adjusted for risk of bias were in line with those of the unadjusted analysis. The robustness and accuracy of our results are further supported by the low between trial heterogeneity, no indication of inconsistency in the network, good model fit, and similar treatment effects for pain and function. The length of follow-up in about half of the included trials was three months or less, which we believe is an accurate representation of current clinical practice, and is in line with current clinical practice guidelines.67 Treatment effect estimates were consistent for one, six, and 12 weeks of follow-up, which is further evidence of the robustness of our findings and of the effectiveness of these drugs in short to mid-term use.

The risk of harm of the oral treatments we analysed is well established; therefore, they are typically prescribed on an as-needed basis, with intermittent short to mid-term use and varying doses as required, rather than as a daily fixed dose treatment regimen as seen in the trials included in our analysis. Because the average treatment duration in included trials was less than three months, our findings for safety outcomes should not be generalised to long term use of the interventions considered in our analysis, as harmful effects probably become more frequent and severe as treatment duration increases. Current trials do not allow a proper exploration of the effectiveness and safety of NSAIDs in the presence of comorbidities, which requires careful consideration when using our findings to guide clinical practice. Patients with comorbidities that lead to a higher risk of adverse events are underrepresented in the trials included in our analysis. Therefore, the safety estimates from our analyses are mainly generalisable to patients with a lower risk of experiencing adverse events, and are probably conservative estimates for frail patients with multimorbidities. Future pragmatic trials that investigate the longer term use of NSAIDs on an as-needed basis and with varying doses as required in patients with osteoarthritis and comorbidities, and that report cause specific adverse events, will enable safer use of these drugs in this patient group.

Data on cause specific adverse events, such as gastrointestinal and cardiovascular events, are helpful to physicians and patients in the presence of comorbidities. A proper comparison of cause specific adverse events for different types of drugs in a network meta-analysis would require drug doses to be taken into account. Only a small proportion of the trials reported cause specific adverse events; therefore, analysis at the level of preparation and dose was not possible in the present network meta-analysis. It is conceivable that different types of topical NSAID formulations (eg, patch, cream, gel) might have different effectiveness and safety profiles. The current analysis did not include enough trials to explore this potential effect modification with adequate statistical precision. As new evidence from large trials becomes available, a future network meta-analysis on topical NSAIDs could explore potential effect modification of their effectiveness and safety according to different types of formulations.

The risk of performance bias introduced by therapists was unclear for some trials, and most trials had a high risk of incomplete outcome bias because they used last observation carried forward to account for missing data. However, conclusions based on our analyses adjusted by risk of biases were in line with the unadjusted analysis.29 We must recognise the limitations of our analysis model, as previously discussed.33 Because study specific covariance estimates are rarely reported, the dependency of outcome data over time within a trial is only approximately represented through the random walk. Additionally, if a strong temporal pattern such as a linear trend exists, our random walk model cannot properly account for it. Although most of the estimates presented have enough statistical precision to allow sound conclusions about the effectiveness and safety of interventions, some estimates have wide 95% credible intervals, especially for serious adverse events.

Previous evidence

This network meta-analysis compares the effectiveness and safety of different doses and preparations of oral and topical NSAIDs, opioids, and paracetamol in a single analysis. We previously published a similar network meta-analysis that assessed the effectiveness of oral NSAIDs and paracetamol.33 With the ever increasing use of opioids in osteoarthritis treatment and the recommendation from recent guidelines to consider topical NSAIDs as a first line treatment, we expanded the previous review to also include these interventions to assess their comparative effectiveness and safety.67891618 We also included safety outcomes in the current review, which were not reported in our previous review. Recent guidelines suggest that when these interventions are used, the lowest possible dose should be prescribed to minimise the risk of adverse events.67 Safety outcomes presented alongside effectiveness outcomes allow physicians, patients, or their caregivers to have a better understanding of which preparations at their lowest doses would be safest while still being effective. Finally, the current review, with a literature search conducted in June 2021, provides an update of the evidence on the effectiveness of oral NSAIDs reported in our last review, which had a literature search conducted in February 2015.

The findings of our current and previous analyses indicate that diclofenac 150 mg/day and etoricoxib 60 mg/day seem to be the most effective NSAIDs for osteoarthritis pain. Previous studies indicate that etoricoxib has a similar risk to placebo and a lower risk than diclofenac of causing gastrointestinal adverse events, but a higher risk than placebo and a similar risk to diclofenac of cardiovascular adverse events.515253 Our dose specific analysis indicates that etoricoxib 60 mg/day, compared with diclofenac 150 mg/day, would lead to fewer people dropping out due to adverse events, a similar risk of any adverse events, and possibly a lower risk of serious adverse events. However, etoricoxib is not approved in the US where the FDA requires additional safety and efficacy data before deciding on its approval. In 2007, the FDA arthritis advisory committee voted 20 to 1 against the approval of etoricoxib given concerns about the drug’s increased risk of cardiovascular events and its association with worsening of hypertension.54 Data taken into account for this decision considered a possibly sixfold increase in the risk of cardiovascular events with etoricoxib compared with naproxen,54 which was not confirmed by a later network meta-analysis examining the cardiovascular safety of NSAIDs.51

Our findings suggest that rofecoxib 25 mg/day, a drug removed from the market because of cardiovascular safety concerns, is as effective as diclofenac and etoricoxib and with a similar safety profile, as corroborated by previous studies.3351 The cardiovascular safety of rofecoxib was first questioned by the VIGOR trial in 2000, which reported a significant increase in the risk of myocardial infarction in patients who received this drug compared with those who received naproxen.55 However, rofecoxib was only removed from the market in 2004. This decision was mainly based on the three year results of the APPROVe trial, which was a placebo controlled trial of rofecoxib for the prevention of recurrence of colorectal polyps in patients with a history of colorectal adenomas.5657 In this trial, patients were randomly allocated to receive one 25 mg tablet of rofecoxib each day (the maximum recommended long term daily dose) or a placebo tablet each day for three years. The results indicated an increased cardiovascular risk with long term (>18 months), fixed dose, daily use of rofecoxib.5657 No evidence was found of an increased cardiovascular risk in the first 18 months of use, but patients with a history of cardiovascular disease were not included in this trial.

Recent network meta-analyses have suggested that topical NSAIDs could be beneficial for osteoarthritis treatment.2858 However, these analyses included observational studies and small trials of low methodological quality, or did not provide treatment effect estimates separately for each drug preparation and dose. Our findings are based on high quality randomised trials and indicate that topical diclofenac 70-81 mg/day might be the best treatment in terms of effectiveness and safety. However, all diclofenac topical trials included only patients with knee osteoarthritis, so the evidence is unclear for hip osteoarthritis. Patients with osteoarthritis who require analgesic treatment but have not responded to first line treatments, such as topical NSAIDs, and have a contraindication to oral NSAIDs, might be prescribed opioids or paracetamol.6 Recent reviews indicate that both treatments do not have a clinically relevant effect on osteoarthritis symptoms and also raised safety concerns.11333459 Our analyses indicate that the small benefits of opioids or paracetamol might be outweighed by potential harms, regardless of dose. However, our findings are based on average estimates and it is possible that some of the patients who did not respond to other treatments could still benefit from opioids or paracetamol.6061 We have previously shown that no association exists between opioid dose and improvement in osteoarthritis pain.34 The findings of the current analysis corroborate this finding and also indicate that higher doses of opioids lead to more harm.

Implications for clinical practice

Physicians could use the results of our analysis to identify the lowest doses of different drug preparations that are effective and safe when first prescribing treatment, as generally recommended by current clinical practice guidelines.67 Treatment should preferably be on an as-needed basis, with intermittent short to mid-term use and varying doses as required, instead of a long term daily fixed dose. Our results indicate that lower doses of oral NSAIDs, such as oral diclofenac 100-105 mg/day and etoricoxib 30 mg/day, or topical diclofenac 70-81 mg/day, have more favourable safety profiles than maximum recommended daily doses, while still having >88% probability of treatment effects more pronounced than the MID. The potential benefits of these interventions must be carefully weighed against potential harms for each individual patient because the presence of comorbidities or long term use might increase the risk of serious adverse events.675162 For knee osteoarthritis, topical treatments are recommended before oral treatments because of their lower systemic exposure or toxicity.67 The 2019 guidelines of the Osteoarthritis Research Society International recommend topical NSAIDs for patients with knee osteoarthritis and gastrointestinal or cardiovascular comorbidities, or those who are frail because adverse events are minimal and mild; most are minor and transient local skin reactions.7 No recommendation was made for the use of topical NSAIDs in hip osteoarthritis, considering that the depth of the hip joint would make it unlikely for a benefit to occur. None of the trials included in our analysis investigated topical NSAIDs in patients with hip osteoarthritis. Oral NSAIDs received a conditional recommendation for knee and hip osteoarthritis without comorbidities, with cyclooxygenase 2 inhibitors recommended in the presence of gastrointestinal comorbidities.
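Probability statements of the kind used above (for example, a >88% probability of a treatment effect more pronounced than the MID) can be read off posterior draws from a Bayesian model as the share of draws beyond the threshold. The following is a minimal sketch of that calculation under stated assumptions: the draws are simulated from a normal distribution rather than taken from the authors' model, and the threshold, posterior mean, and standard deviation are hypothetical placeholders. The function `prob_beyond_mid` is likewise a hypothetical name, and the sketch illustrates the arithmetic only.

```python
import random


def prob_beyond_mid(draws, mid):
    """Return the share of posterior draws at least as negative as the MID.

    Negative effect sizes indicate benefit, so an effect "more pronounced
    than the MID" means draw <= mid.
    """
    draws = list(draws)
    if not draws:
        raise ValueError("need at least one posterior draw")
    return sum(d <= mid for d in draws) / len(draws)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable

    # HYPOTHETICAL placeholders, not values from the analysis:
    hypothetical_mid = -0.4       # threshold in SD units
    hypothetical_mean = -0.5      # posterior mean of the effect size
    hypothetical_sd = 0.08        # posterior standard deviation

    # Simulated stand-in for posterior draws of one treatment effect.
    draws = [rng.gauss(hypothetical_mean, hypothetical_sd) for _ in range(20_000)]

    p = prob_beyond_mid(draws, hypothetical_mid)
    print(f"P(effect <= MID) = {p:.3f}")
```

Passing the threshold as a parameter keeps the sketch independent of any particular MID definition; the value defined in the paper's methods would be substituted for the placeholder.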

Our findings indicate that, compared with oral and topical NSAIDs, opioids and paracetamol have a significantly smaller effect on knee or hip osteoarthritis pain and physical function and are associated with an increased risk of harm. The evidence also indicates that the low effect of opioids on pain and physical function does not outweigh the harm they might cause in patients with osteoarthritis. The 2019 American College of Rheumatology/Arthritis Foundation guidelines conditionally recommended paracetamol and tramadol for patients with contraindications to oral NSAIDs or those who did not respond to previous treatment.6 Non-tramadol opioids were conditionally not recommended; however, the guidelines recognise that they could be helpful at the lowest possible dose and for the shortest possible treatment duration in some patients for whom conservative alternatives have been exhausted. Our findings show that even when patients receive lower doses of non-tramadol opioids, they are up to 13 times more likely to interrupt treatment due to adverse events, and up to 10 times more likely to experience any type of adverse events compared with oral placebo.

Conclusions

Etoricoxib 60 mg/day and diclofenac 150 mg/day seem to be the most effective oral NSAIDs for osteoarthritis pain, but are probably not appropriate in the presence of comorbidities or for long term daily use given the mild increase in the risk of adverse events for both drugs. Additionally, an increased risk of dropping out due to adverse events was found for diclofenac 150 mg/day. Topical diclofenac 70-81 mg/day could be effective and safer due to reduced systemic exposure and lower dose, and should be considered as a first line pharmacological treatment for knee osteoarthritis. The clinical benefit of opioid treatment, regardless of preparation or dose, does not outweigh the harm it could cause in patients with osteoarthritis.

What is already known on this topic

  • Previous systematic reviews have reported on the effectiveness of non-steroidal anti-inflammatory drugs (NSAIDs) and opioids to treat osteoarthritis pain

  • These reviews clustered drug doses or drug classes in their analyses

What this study adds

  • Etoricoxib 60 mg/day and diclofenac 150 mg/day seem to be the most effective oral NSAIDs for knee and hip osteoarthritis pain and physical function, but might not be appropriate in the presence of comorbidities or for long term use

  • Topical diclofenac 70-81 mg/day could be effective and generally safer because of reduced systemic exposure and lower dose, and should be considered as first line pharmacological treatment for knee osteoarthritis

  • The clinical benefit of opioid treatment, regardless of preparation or dose, does not outweigh the harm it might cause in patients with osteoarthritis

Ethics statements

Ethical approval

This is a systematic review and network meta-analysis and does not require ethical approval.

Data availability statement

The guarantor (BRdC) is willing to examine all requests for the full dataset after a period of two years from the date of this publication.

Footnotes

  • Contributors: BRdC and TVP contributed equally to this work. BRdC and PJ conceived the idea for the review. BRdC and TVP designed, undertook the literature search, and coordinated the study. PS, MR, SMI, NSB, PB, LG, HDK, TM, MOA, CAH, RH gave crucial intellectual input and provided critical revision for the initial protocol and database building. NSB, SMI, and RH contributed to the implementation of the study. BRdC, PS, MR, SMI, NSB, PB, PSC, LG, HDK, TM, MOA, CAH, RH, and TVP acquired data, screened records, extracted data, and assessed risk of bias. BRdC coded the statistical analysis, figures, and appendix in collaboration with TVP. BRdC, TVP, PJ, AJS, GAH, PT analysed and interpreted the data. BRdC, PS, and TVP wrote the first draft of the manuscript. All authors gave crucial feedback on the revised report and approved the final version of the manuscript. BRdC and PJ obtained funding. BRdC, TVP, and PJ are the guarantors of this manuscript. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  • Funding: This work was supported by the Arthritis Society (grant No YIO-17-0164) and by St Michael’s Hospital Foundation. PJ is a tier 1 Canada research chair in clinical epidemiology of chronic diseases. This research was completed, in part, with funding from the Canada Research Chairs Programme. AJS is partly funded by the Complex Reviews Support Unit which is funded by the National Institute for Health Research (project No 14/178/29). The views and opinions expressed herein are those of the authors and do not necessarily reflect those of the NIHR. PB is supported by the Arthritis Society Postdoctoral Fellowship Award with funding reference No 20-0000000016. TVP is funded by the Chevening Scholarship Programme (Foreign and Commonwealth Office, UK). The funders had no role in considering the study design or in the collection, analysis, interpretation of data, writing of the report, or decision to submit the article for publication.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: support from the Arthritis Society, Canada Research Chairs Programme, National Institute for Health Research, Chevening Scholarship Program for the submitted work. PJ serves as unpaid member of the steering group of trials funded by Appili Therapeutics, Abbot Vascular, and Terumo; he has received research grants to the institution from Appili Therapeutics, and honorariums to the institution for participation in advisory boards or consulting from Amgen, Ava and Fresenius, but has not received personal payments by any pharmaceutical company or device manufacturer. AJS has been a paid consultant by Janssen-Cilag and GlaxoSmithKline. PT has received royalty payments as contributing author and editor for journals, textbooks and articles published by Elsevier, Little Brown, Wolters Kluwer, and John Wiley and Sons; PT has been an independent paid consultant to CHEOR Solutions (Canada), Innovative Science Solutions LLC and Reformulary Group; he serves as unpaid chair of the management subcommittee, of the executive committee of OMERACT; OMERACT receives unrestricted educational grants from the American College of Rheumatology, European League of Rheumatology, Amgen, Astra Zeneca, Bristol Myers Squibb, Celgene, Eli Lilly, Genentech/Roche, Genzyme/Sanofi, Horizon Pharma, Merck, Novartis, Pfizer, PPD, Quintiles, Regeneron, Savient, Takeda Pharmaceutical, UCB Group, Vertex, Forest and Bioiberica; PT serves as an independent committee member in Data Safety Monitoring Boards of FDA approved trials being conducted by UCB Biopharma GmbH and SPRL, Parexel International and Prahealth Sciences. All other authors report no financial relationships with any organisations that might have an interest in the submitted work in the previous three years. All authors report no other relationships or activities that could appear to have influenced the submitted work.

  • The lead author (BRdC) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as originally planned (and, if relevant, registered) have been explained.

  • Dissemination to participants and related patient and public communities: We will disseminate the results to clinicians, patient advocacy groups, and patient organisations, and also through press release and social media.

  • Provenance and peer review: Not commissioned; externally peer reviewed.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

References
