Key Points

This comprehensive literature review and meta-analysis provides convincing evidence that topical non-steroidal anti-inflammatory drugs (NSAIDs) may be considered safe to use in the early treatment of OA.

The safety profile of topical NSAIDs is shown to be similar to that of placebo in randomized controlled trials; of particular importance is the low gastrointestinal toxicity, which makes the topical route preferable to oral administration.

Topical NSAIDs offer a favourable risk:benefit profile and may be safely used in combination with other treatment strategies for optimal management of OA.

1 Introduction

Osteoarthritis (OA) is a progressive, degenerative disorder, commonly affecting hand, knee and hip joints and causing considerable pain and disability, as well as reduced quality of life [1]. The incidence of OA is rising due to the aging population and the increase in obesity [1]. Topical non-steroidal anti-inflammatory drugs (NSAIDs) are widely recommended in national and international guidelines as an early option for the symptomatic management of OA [2,3,4,5,6]. For example, the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO) recommends topical NSAIDs as a step 1 pharmacological therapy for the management of knee OA [2], and the American College of Rheumatology (ACR) recommends topical NSAIDs for the initial management of hand or knee OA [4]. In addition, the ACR recommends that people aged ≥ 75 years should use topical rather than oral NSAIDs; older patients often have comorbidities and/or an increased risk of cardiovascular, gastrointestinal (GI) or renal adverse events (AEs) [4].

Topical NSAIDs are generally recommended ahead of oral NSAIDs or opioids for pain relief due to their superior safety profile. Topical NSAIDs have a small to moderate effect on pain in hip and knee OA, with effect size measured as 0.44 (95% confidence interval [CI] 0.27–0.62) [7]. In fact, the efficacy of topical NSAIDs is similar to that of oral NSAIDs but with a better safety profile due to lower systemic absorption [8]. Topical NSAIDs are associated with a lower risk of GI AEs and a higher risk of dermatological AEs compared with oral NSAIDs [8]. A systematic literature review of 16 randomized controlled trials (RCTs; with placebo and/or an active control) and 3 observational studies in older adults with OA found that while topical NSAIDs were associated with some safety issues, they were safer than oral NSAIDs. Based on data from the included RCTs, up to 39% of patients using a topical NSAID reported an application site AE, compared with 25% of patients receiving a vehicle or placebo. Likewise, up to 21% of patients using topical NSAIDs withdrew from the trials due to AEs, compared with 16% of those receiving placebo. This review also found that a substantial proportion of patients reported systemic AEs with topical NSAIDs, compared with placebo [9]. More recent Cochrane meta-analyses found that topical NSAIDs were significantly more effective than placebo for reducing pain due to chronic musculoskeletal conditions (largely from trials in patients with knee OA), with an increase in local AEs (mostly mild skin reactions) for diclofenac compared with placebo, but no increase for topical ketoprofen and no increase in serious AEs [10, 11].

To date, few meta-analyses have assessed the efficacy and safety of topical NSAIDs [10, 12, 13]. Those that have assessed safety used only published data, and it is well known that safety data are often underreported in manuscripts. The objective of this study was to assess the safety of topical NSAIDs in the management of OA in a systematic review and meta-analysis of randomized, placebo-controlled trials. In order to better estimate the safety profile of these OA medications, the authors of the manuscripts and/or the sponsors of the studies were contacted to obtain the full report of AEs.

2 Methods

The protocol of this systematic review and meta-analysis was previously registered in the PROSPERO database (registration number CRD42017058509). The systematic review was performed in accordance with the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions [14], and the findings were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [15]. The entire review process (study selection and risk of bias assessment) was undertaken using Covidence, the Cochrane platform for systematic reviews.

2.1 Eligibility Criteria

Randomized, double-blind, placebo-controlled, parallel-group trials that assessed the AEs associated with topical NSAIDs in patients with OA were eligible for inclusion in this meta-analysis. The following were excluded: crossover studies, reviews or meta-analyses, letters, comments and editorials. Studies that allowed concomitant anti-OA medications during the trial (other than rescue medication such as paracetamol or aspirin) were also excluded, as were trials involving animals.

2.2 Data Sources and Search Strategies

A comprehensive literature search was undertaken in the MEDLINE (via Ovid), Cochrane Central Register of Controlled Trials (Ovid CENTRAL) and Scopus electronic databases. Each database was searched from inception until 1 August 2017. We searched for randomized, placebo-controlled trials of topical NSAIDs in OA, using a combination of study design-, treatment-, and disease-specific keywords and/or Medical Subject Heading (MeSH) terms. While AEs were the outcomes of interest for this study, we avoided outcome-specific keywords in the search strategies because a study on the efficacy of a drug may not have mentioned terms related to AEs in its title, abstract or keywords. The search was limited to English and French publications and to human subjects. Detailed search strategies for the MEDLINE/CENTRAL and Scopus databases are reported in the Electronic Supplementary Material (ESM) 1.

Two clinical trials registries, ClinicalTrials.gov (clinicaltrials.gov/) and the World Health Organization’s International Clinical Trials Registry Platform Search portal (apps.who.int/trialsearch/), were also checked for trial results that had not been published. Finally, very recent meta-analyses were also screened for any additional relevant studies. For all studies that met the selection criteria, the authors of the manuscripts and/or the sponsors of the studies were systematically contacted to obtain the full report of AEs, as long as there was some way to contact them (email, fax, telephone number, or co-author’s email in other articles).

We set up search alerts in the bibliographic databases for any new relevant RCTs that were published from 1 August 2017 to 30 September 2018.

2.3 Study Selection and Data Extraction

Two members of the review team (GH and VL) independently evaluated each title and abstract to exclude only obviously irrelevant studies, according to the predefined eligibility criteria. At this stage, the criteria related to adverse effects were not considered for selection, as studies focusing on the efficacy of a treatment may not report data about adverse effects in the abstract; all trials mentioning only efficacy information were therefore retained at this stage. After this first step, the two investigators independently reviewed the full text of each article not excluded during the initial screening stage, to determine whether the studies met all the selection criteria. Those that did not meet these criteria were definitively excluded. All differences of opinion regarding the selection of articles were resolved through discussion and consensus between the two investigators; any persistent disagreement was resolved with the intervention of another member of the review team (VR). A flowchart of the number of included studies at each step was established, including the reasons for excluding studies during the full-text reading process.

The full-texts of the selected studies were screened for extraction of relevant data, using a standard data extraction form. Outcome results data were independently extracted by two members of the review team (GH and VL). For each study, the following data were extracted: characteristics of the manuscript, characteristics of the trial, objective and design of the study, characteristics of the patients, characteristics of the disease, characteristics of the treatments, AEs (outcomes) reported during the trial, and the main conclusion of the study. The raw data (number of events in each group) were extracted for each outcome. The number of patients who experienced any body system-related AE at least once (e.g. nervous system, GI system), as well as specific AEs within each body system (e.g. headache, abdominal pain), were extracted. As much as possible, data from the intention-to-treat (ITT) analysis were considered.

2.4 Assessment of Risk of Bias in the Included Studies

Two authors of the review team (GH and VL) independently assessed the risk of bias in each study using the Cochrane Collaboration’s tool for risk of bias assessment [14]. The following characteristics were evaluated:

  • Random sequence generation: We assessed whether the allocation sequence was adequately generated.

  • Allocation concealment: We assessed the method used to conceal the allocation sequence, evaluating whether the intervention allocation could have been foreseen in advance.

  • Blinding of participants and personnel: We assessed the method used to blind study participants and personnel from knowledge of which intervention a participant received and whether the intended blinding was effective.

  • Blinding of outcome assessment: We assessed the method used to blind outcome assessors from knowledge of which intervention a participant received and whether the intended blinding was effective.

  • Incomplete outcome data: We assessed whether participants’ exclusions, attrition and incomplete outcome data were adequately addressed in the paper.

  • Selective outcomes reporting: We checked whether there was evidence of selective reporting of AEs.

Each of these items was categorized as ‘low risk of bias’, ‘high risk of bias’ or ‘unclear risk of bias’. ‘Low risk of bias’ or ‘high risk of bias’ was attributed to an item when there was sufficient information in the manuscript to judge the risk of bias as ‘low’ or ‘high’; otherwise, ‘unclear risk of bias’ was attributed to the item. Disagreements were resolved by discussion between the two reviewers during a consensus meeting, involving, when necessary, another member of the review team (VR or AG) for the final decision.

2.5 Outcomes of Interest

The main System Organ Classes (SOCs) that are likely to be affected by the use of topical NSAIDs in the treatment of OA were explored in this meta-analysis.

The following Medical Dictionary for Regulatory Activities (MedDRA) SOC-related AEs were defined as primary outcomes: GI, vascular, cardiac, nervous system, skin and subcutaneous tissue, and musculoskeletal and connective tissue, along with overall severe and serious AEs. Secondary outcomes were withdrawals due to AEs (i.e. the number of participants who stopped the treatment due to an AE), and total number of AEs (i.e. the number of patients who experienced any AE at least once).

2.6 Data Analysis

Analyses were performed using STATA 14.2 software. We described harms associated with the treatment as odds ratios (ORs) with 95% CIs, and computed an overall effect size for each primary or secondary outcome (AE). Anticipating substantial variability among trial results (i.e. interstudy variability), we assumed heterogeneity in the occurrence of the AEs; thus, we planned to use random-effects models for the meta-analyses. We estimated the overall effects and heterogeneity using the DerSimonian and Laird random-effects model [16]. As this method provides a biased estimate of the between-study variance with sparse events [17, 18], we also performed the meta-analyses using the restricted maximum likelihood (REML) method [19]. We reported only the results from the DerSimonian and Laird random-effects model as we found no difference in the effects computed by the two methods. We preferred reporting the results from the DerSimonian and Laird method (which uses a correction factor) because it allows studies with null events to be displayed on the forest plot, even though those with null events in both the intervention and control groups are excluded from the overall effect size computation. In contrast, with the REML method, these studies are not displayed on the forest plot.
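To make the pooling step concrete, the following is a minimal Python sketch of a DerSimonian and Laird random-effects analysis of log ORs. It is not the STATA procedure used in this study: the function names, the 0.5 continuity correction for trials with a zero cell, and the trial counts are all illustrative assumptions.

```python
import math

def log_or_with_correction(a, n1, c, n2, cc=0.5):
    """Return (log OR, variance) for one trial, or None if no events in either arm.

    a/n1: events/total in the treated arm; c/n2: events/total in the placebo arm.
    Assumption: a continuity correction `cc` is added to every cell when any
    cell is zero.
    """
    b, d = n1 - a, n2 - c
    if a == 0 and c == 0:
        return None  # double-zero trial: excluded from the pooled estimate
    if 0 in (a, b, c, d):
        a, b, c, d = a + cc, b + cc, c + cc, d + cc
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """Pool effects with the DerSimonian-Laird moment estimator of tau^2.

    Returns (pooled effect, standard error, tau^2).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re)), tau2

# Hypothetical trials: (events treated, n treated, events placebo, n placebo)
hypothetical_trials = [(12, 100, 10, 100), (0, 50, 2, 50), (0, 40, 0, 40), (8, 120, 7, 118)]
usable = [r for r in (log_or_with_correction(*t) for t in hypothetical_trials) if r]
pooled, se, tau2 = dersimonian_laird([y for y, _ in usable], [v for _, v in usable])
print(f"OR {math.exp(pooled):.2f}, "
      f"95% CI {math.exp(pooled - 1.96 * se):.2f}-{math.exp(pooled + 1.96 * se):.2f}, "
      f"tau2 {tau2:.3f}")
```

In this sketch the single-zero trial still contributes through the correction, whereas the double-zero trial is dropped, mirroring the handling described above.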

We tested heterogeneity using Cochran’s Q test. As we were performing a random-effects meta-analysis, we used the Tau-squared (Tau2) estimate as the measure of the between-study variance. The I-squared (I2) statistic was used to quantify heterogeneity, measuring the percentage of total variation across studies due to heterogeneity [20]. In the case of substantial heterogeneity, we prespecified subgroup analyses stratified by participants’ age in the intervention group, duration of the OA complaint, location of OA (knee, hand, hip), number of joints treated, formulation regimen of the treatment (cream, solution), drug dose, duration of the trial, nature of the comparator (placebo vs. carrier), and risk of bias in the studies (e.g. studies with a low risk of bias vs. all other studies).
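For reference, the three heterogeneity quantities named above can be written as follows. This is a sketch using the standard textbook definitions; the notation is an assumption introduced here, with y_i the log OR of study i, v_i its within-study variance, and k the number of studies, and the Tau2 expression is the DerSimonian and Laird moment estimator.

```latex
\begin{align}
w_i &= \frac{1}{v_i}, \qquad
\hat{\theta}_F = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i} \\
Q &= \sum_{i=1}^{k} w_i \,\bigl(y_i - \hat{\theta}_F\bigr)^2 \\
\hat{\tau}^2 &= \max\!\left(0,\; \frac{Q - (k-1)}{\sum_{i} w_i - \sum_{i} w_i^2 \big/ \sum_{i} w_i}\right) \\
I^2 &= \max\!\left(0,\; \frac{Q - (k-1)}{Q}\right) \times 100\%
\end{align}
```

Under the null hypothesis of homogeneity, Q approximately follows a chi-squared distribution with k − 1 degrees of freedom. Both Tau2 and I2 are truncated at zero, so I2 = 0% is obtained whenever Q does not exceed k − 1.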

We assessed funnel plot asymmetry for publication bias by visual inspection and using the Harbord test [21], which is more suitable than the classical Egger’s test [23] for dichotomous outcomes with effect sizes measured as ORs [22]. Finally, the certainty of the evidence for each outcome was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach [24], and a table summarizing the findings was prepared using the GRADEpro online software [25].
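The idea behind the Harbord test can be sketched in a few lines of Python. The sketch assumes the usual formulation: for each trial, the efficient score Z and its variance V are computed from the 2 × 2 table, Z/√V is regressed on √V, and the intercept is tested against zero. The function name and trial counts are hypothetical, and this is not the STATA routine used in the study.

```python
import math

def harbord_test(trials):
    """Harbord-type regression test for small-study effects.

    trials: list of (a, b, c, d) = (events treated, non-events treated,
                                    events control, non-events control).
    Returns (intercept, standard error, t statistic, degrees of freedom).
    """
    xs, ys = [], []
    for a, b, c, d in trials:
        n = a + b + c + d
        # Efficient score Z and its variance V under the null hypothesis
        z = a - (a + b) * (a + c) / n
        v = (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
        if v == 0:
            continue  # e.g. no events in either arm: no information
        xs.append(math.sqrt(v))
        ys.append(z / math.sqrt(v))
    k = len(xs)
    if k < 3:
        raise ValueError("need at least three informative trials")
    x_bar, y_bar = sum(xs) / k, sum(ys) / k
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all trials have the same precision")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Ordinary least squares residual variance and intercept standard error
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se_intercept = math.sqrt(sse / (k - 2) * (1 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept, k - 2

# Hypothetical 2x2 tables, for illustration only
hypothetical_tables = [(12, 88, 10, 90), (5, 45, 3, 47), (30, 270, 25, 275),
                       (8, 112, 7, 111), (2, 38, 4, 36)]
intercept, se, t_stat, df = harbord_test(hypothetical_tables)
print(f"intercept {intercept:.3f} (SE {se:.3f}), t = {t_stat:.2f} on {df} df")
```

The t statistic would be compared with a t distribution on k − 2 degrees of freedom; an intercept close to zero is compatible with a symmetric funnel plot. With only a handful of trials the test has little power, which is consistent with the decision not to run it for topical ibuprofen.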

3 Results

Database searches initially identified 1206 records; one additional article was identified by a manual search in MEDLINE, and data for two trials were provided by a pharmaceutical company (GlaxoSmithKline [GSK]). These two trials were referenced on ClinicalTrials.gov, but their results were only published in combination with the results of other trials as a post hoc analysis and a pooled analysis [26, 27]. Consequently, these publications could not be included since they were considered equivalent to meta-analyses. However, GSK supplied us with the raw data for each of these two trials and we subsequently included them after evaluation against our selection criteria.

After exclusions based on titles and abstracts, 58 articles were screened in full against the selection criteria, with a further 33 studies being excluded for various reasons (Fig. 1). Twenty-five papers were included in the qualitative synthesis, and 19 studies with adequate data were ultimately included in the meta-analysis: 8 RCTs of diclofenac, 4 with ketoprofen, 3 with ibuprofen, and 1 study each on eltenac, piroxicam, nimesulide and S-flurbiprofen [28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50].

Fig. 1 Study selection process

Table 1 presents the characteristics of the studies included through the systematic review process (those included in the quantitative synthesis [meta-analysis] are highlighted). Most of the studies included patients with knee OA; only two studies were conducted in patients with hand OA, and one study included patients with lumbar OA. Trial durations varied between 1 and 12 weeks. Of the three studies on topical ibuprofen, two were of 1-week duration and one lasted 2 weeks. Trial durations were 12 weeks for three of the four studies on topical ketoprofen. Few trials were specifically designed to compare oral and topical NSAIDs with placebo [35, 44, 46]. Most of the manuscripts retrieved did not adequately report AE data such that these could be used for a meta-analysis, or they did not provide all treatment-emergent AE data for all the randomized patients. The risk of selective outcome reporting bias was therefore judged as ‘high’ for more than 50% of the included studies. Figures 2 and 3 include a summary of the risk of bias assessed for each study included in the qualitative synthesis, as well as the risk of bias items presented as percentages across all these studies, except the two studies whose results were only published as post hoc and pooled analyses. In fact, we could not assess the risk of bias in the two studies for which we received only the raw data from their sponsor because we did not receive information on most of the risk of bias domains (neither were these published). In total, full safety data were provided by the authors of the manuscripts or the sponsors of the studies for 12 of the 19 trials included in the meta-analysis, substantially limiting the impact of the selective reporting bias on the results of this meta-analysis.

Table 1 Characteristics of the studies included through the systematic review process (studies grouped by drug; those included in the quantitative synthesis are highlighted in bold type)

Fig. 2 Risk of bias summary: review authors’ judgements about each risk of bias item for each study included in the qualitative synthesis. Note: This figure does not include the two diclofenac NCT studies, as explained in Sect. 3

Fig. 3 Risk of bias graph: review authors’ judgements about each risk of bias item presented as percentages across all studies included in the qualitative synthesis. Note: This figure does not include information on risk of bias for the two diclofenac NCT studies, as explained in Sect. 3; thus, the summaries made here are based on data from 23 studies

3.1 Results for all Topical Non-steroidal Anti-inflammatory Drugs (NSAIDs)

3.1.1 Primary Outcomes

The main SOCs that are more likely to be affected by the use of topical NSAIDs, as well as those that have been shown to be harmed by the use of oral NSAIDs, were considered as primary outcomes in this meta-analysis. Overall, we found no statistically significant increase in odds of AEs for topical NSAIDs versus placebo, for any of the SOCs considered. In particular, there was no statistically significant increase in odds for GI disorders (OR 0.96, 95% CI 0.73–1.27; I2 = 0.0%) (Fig. 4), vascular disorders (OR 1.21, 95% CI 0.72–2.03) or cardiac disorders (OR 2.26, 95% CI 0.86–5.94). Although the odds for cardiac disorders appear to be higher in patients receiving topical NSAIDs than in those receiving placebo, the number of events in each group does not justify any concern.

Fig. 4 Forest plot displaying the results of the meta-analysis comparing gastrointestinal disorders for all topical NSAIDs versus placebo in patients with osteoarthritis. NSAIDs non-steroidal anti-inflammatory drugs, CI confidence interval

Serious and severe AEs were also assessed as co-primary outcomes in this meta-analysis. Overall, there were no more serious (OR 0.79, 95% CI 0.37–1.71; I2 = 0%) or severe (OR 1.19, 95% CI 0.72–1.97; I2 = 10.9%) AEs in patients receiving topical NSAIDs than in those receiving placebo. Detailed results for all primary outcomes are provided in ESM 2.

3.1.2 Secondary Outcomes

Dropouts due to AEs, and the number of patients who experienced any AE at least once during the trials, were defined as secondary outcomes in this meta-analysis. Overall, there was a 16% increase in odds for total AEs with topical NSAIDs versus placebo, which was statistically significant (OR 1.16, 95% CI 1.04–1.29; I2 = 0%). Dropouts due to AEs were significantly more frequent in patients receiving topical NSAIDs than in those receiving placebo; the use of topical NSAIDs was associated with a nearly 50% increase in odds of withdrawals due to AEs compared with placebo (OR 1.49, 95% CI 1.15–1.92; I2 = 0%) (ESM 2).

3.2 Results for Individual Topical NSAIDs

3.2.1 Diclofenac

Eight studies involving topical diclofenac versus placebo were included in the analysis. Overall, there was a significant increase in AEs (total AEs) with topical diclofenac compared with placebo (OR 1.30, 95% CI 1.10–1.53; I2 = 0%). The odds of withdrawal due to AEs were twice as high with topical diclofenac as with placebo, a statistically significant difference (OR 2.00, 95% CI 1.27–3.14; I2 = 0%). The higher rate of total AEs in patients receiving topical diclofenac seems to be driven by the higher odds for skin and subcutaneous tissue disorders with topical diclofenac compared with placebo, although the difference was not statistically significant (OR 1.73, 95% CI 0.96–3.10) (ESM 2).

There was no statistically significant difference in odds for severe (OR 1.19, 95% CI 0.68–2.07; I2 = 23.9%) or serious AEs (OR 0.94, 95% CI 0.26–3.42; I2 = 0%), or for specific SOC-related AEs, in patients who were treated with diclofenac compared with those who were receiving placebo. In particular, topical diclofenac was not associated with increased GI toxicity (OR 1.11, 95% CI 0.75–1.64; I2 = 0%) (Fig. 5).

Fig. 5 Forest plot displaying the results of the meta-analysis comparing gastrointestinal disorders with topical diclofenac versus placebo in patients with osteoarthritis. CI confidence interval

3.2.2 Ketoprofen

Four studies involving topical ketoprofen versus placebo were included in the analysis. Overall, there was no difference in the rate of total AEs observed between topical ketoprofen and placebo (OR 1.04, 95% CI 0.90–1.20; I2 = 0%). A tendency for slightly more withdrawals due to AEs was observed with topical ketoprofen compared with placebo, but the OR did not reach statistical significance (OR 1.37, 95% CI 0.99–1.89; I2 = 0%) (ESM 2).

We found significantly fewer nervous system disorders reported with topical ketoprofen compared with placebo, with headache being the most frequently reported specific event in the placebo group (OR 0.60, 95% CI 0.41–0.88; I2 = 0%). There was no statistically significant effect for any other events, including skin and subcutaneous tissue disorders, GI disorders (OR 0.78, 95% CI 0.51–1.21; I2 = 0%), cardiac or vascular disorders, musculoskeletal and connective tissue disorders, serious AEs, or severe AEs.

3.2.3 Ibuprofen

Three studies involving topical ibuprofen versus placebo were included in the analysis. No statistically significant difference in the rate of AEs was observed between ibuprofen and placebo for any SOC, or for serious and severe AEs (ESM 2).

3.3 Assessment of Publication Bias

We assessed funnel plot asymmetry for publication bias for each of the outcomes; only the analyses of all topical NSAIDs combined included sufficient studies for the Harbord test for funnel plot asymmetry. Only three studies on topical ibuprofen were available for the meta-analysis, with several null events; thus, there were insufficient data to perform the analyses for publication bias for this drug. Visual inspection of funnel plots and a formal test for funnel plot asymmetry (Harbord test) showed no evidence of publication bias, whatever the outcome or treatment (Fig. 6 and ESM 3).

Fig. 6 Assessment of publication bias: funnel plots for total adverse events with (a) all topical NSAIDs, (b) topical diclofenac, and (c) topical ketoprofen. (These funnel plots are based on the data used for the meta-analyses of ‘any AEs’ for each single NSAID or for all topical NSAIDs; these analyses were those including as much data as possible). NSAIDs non-steroidal anti-inflammatory drugs, AEs adverse events, OR odds ratio

3.4 GRADE Assessment of Findings

Using the GRADE approach [24], we assessed the certainty of evidence for each of the outcomes, for all topical NSAIDs and for individual topical NSAIDs, and, overall, found a ‘high’ certainty of evidence with most of the outcomes assessed. The high risk of outcome reporting bias found with topical ketoprofen downgraded the evidence to ‘moderate’ for most of the outcomes. The results for the main outcomes for all topical NSAIDs, topical diclofenac, and topical ketoprofen are depicted in the summary of findings tables (Tables 2, 3 and 4).

Table 2 Summary of findings for topical NSAIDs compared with placebo in patients with osteoarthritis
Table 3 Summary of findings for topical diclofenac compared with placebo in patients with osteoarthritis
Table 4 Summary of findings for topical ketoprofen compared with placebo in patients with osteoarthritis

4 Discussion

Overall, this meta-analysis found a small but statistically significant increase in odds of total AEs (+ 16%) for all topical NSAIDs compared with placebo, and a 49% increase in the odds of dropouts due to AEs, but no statistically significant effect in the individual SOCs investigated. In fact, where there were differences in odds between topical NSAIDs (overall or individual NSAIDs) and placebo for individual SOC analysis, these differences did not reach statistical significance. The frequency of serious AEs was lower with topical NSAIDs (overall) compared with placebo (− 21%), although severe AEs were reported more often (+ 19%); however, neither of these results was statistically significant. The certainty of this evidence was ‘high’, apart from the ‘cardiac disorders’ outcome, which was associated with a ‘moderate’ certainty of evidence due to a large imprecision around the combined effect size.

As reported above, our study found nearly 50% increased odds of withdrawal due to AEs with topical NSAIDs versus placebo (OR 1.49, 95% CI 1.15–1.92). This is in agreement with the results of a recent fixed-effect meta-analysis on the safety of topical NSAIDs versus placebo (in RCTs) (OR 1.56, 95% CI 1.21–2.00) [13]. The differences in the dropout rates between topical NSAIDs and placebo might be largely due to the high, but not statistically significant, odds for skin and subcutaneous tissue disorders in the treated group compared with placebo, mainly driven by topical diclofenac. However, the AEs that led participants to withdraw might be minor events as we found no statistically significant effect in terms of overall severe or serious AEs.

For SOC comparisons, moderate differences in the OR for AEs were found between all topical NSAIDs and placebo (skin + 12%, GI disorders − 4%, nervous system − 9%, vascular disorders + 21%), with the exception of cardiac disorders, which were increased more than twofold with topical NSAIDs versus placebo (OR 2.26, 95% CI 0.86–5.94), although the difference was not statistically significant. However, it is important to note that there should not be any major concerns in this regard because of the very high number of studies with null events (for cardiac disorders), both in the intervention and control groups, which explains the large imprecision around the overall effect estimate. In our study, the GI toxicity reported with topical NSAIDs was similar to that of placebo, which is consistent with earlier reports of a reduced risk of upper GI AEs with topical NSAIDs compared with oral NSAIDs [8, 10, 51].
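The link between a reported OR, its 95% CI and statistical significance can be illustrated with a short back-calculation. The sketch below assumes Wald-type intervals that are symmetric on the log scale; because the published values are rounded, the recovered standard errors and p values are approximations only. The function name is hypothetical, and the ORs and CIs are those reported in this article.

```python
import math

def z_and_p_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate SE of the log OR, z statistic and two-sided p value.

    Assumption: the 95% CI was built as exp(log OR +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return se, z, p

reported = {
    "total AEs": (1.16, 1.04, 1.29),
    "withdrawals due to AEs": (1.49, 1.15, 1.92),
    "cardiac disorders": (2.26, 0.86, 5.94),
}
for outcome, (or_, lo, hi) in reported.items():
    se, z, p = z_and_p_from_ci(or_, lo, hi)
    print(f"{outcome}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

The standard error recovered for cardiac disorders is several times larger than those for total AEs or withdrawals, which is the imprecision referred to above: the point estimate is high, but the interval includes 1.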

A higher rate of skin reactions with topical NSAIDs has been reported in the literature, ranging from 10 to 39% [9, 52], which was not borne out by our analysis for all topical NSAIDs (OR 1.12, 95% CI 0.93–1.34). The increase in skin reactions may be product-specific. In fact, our results for topical diclofenac showed an increase in odds for skin and subcutaneous tissue disorders, although this was not statistically significant (OR 1.73, 95% CI 0.96–3.10); for topical ketoprofen and topical ibuprofen, such an increase was not observed. This is consistent with the Cochrane review of topical NSAIDs that found an increase in local skin AEs with topical diclofenac, but no increase with topical ketoprofen [10]. Although the difference in our study is not statistically significant (particularly regarding topical diclofenac), our estimation is more precise than previous analyses as we collected full safety data for most of the studies included in the analysis. A meta-analysis of nine RCTs on topical diclofenac found a higher incidence of AEs, including dry skin, rash, dermatitis and neck pain, and a higher incidence of withdrawals versus placebo [53].

As previously stated, we found no increased odds of skin and subcutaneous tissue disorders with topical ketoprofen in placebo-controlled trials, in line with the findings of Derry et al. [10]. In contrast, a recent systematic review of five RCTs found that the most commonly reported AE associated with the use of topical ketoprofen in transfersome gel was non-severe skin and subcutaneous tissue disorders (erythema) [54]; however, one of those five studies was an open-label study. Additionally, as this was not a meta-analysis, its results should be interpreted with caution, compared with those from meta-analyses.

For individual topical NSAIDs, from the three studies included with ibuprofen, no statistically significant difference in the rate of AEs was observed compared with placebo. However, there is a need for further RCTs regarding topical ibuprofen in order to better estimate its safety profile.

With ketoprofen, data from four placebo-controlled RCTs showed no statistically significant difference in the rate of AEs, for total AEs (OR 1.04, 95% CI 0.90–1.20) and for all SOC-related AEs, with the exception of nervous system disorders. In fact, the data showed reduced odds of nervous system disorders (− 40%), mainly headache, with ketoprofen compared with placebo (OR 0.60, 95% CI 0.41–0.88).

From the eight studies of diclofenac versus placebo included in the analysis, diclofenac was associated with a significant increase in odds of total AEs (+ 30%), and twice the odds of withdrawal due to AEs compared with placebo (OR 2.00, 95% CI 1.27–3.14). The number of AEs associated with diclofenac was largely driven by the increase in skin and subcutaneous tissue disorders (+ 73%), although the difference in odds versus placebo was not statistically significant (OR 1.73, 95% CI 0.96–3.10).

Long-term safety profiles of oral NSAIDs are different from safety profiles of short-term use [55,56,57]. In this meta-analysis, the longest trial duration for the included studies was 12 weeks (Table 1). While there was no heterogeneity associated with the overall OR for withdrawal due to AEs (I2 = 0%) and total AEs (I2 = 0%) with topical diclofenac versus placebo, we undertook subgroup analyses in order to investigate any treatment- or study-related characteristic effects. Our investigation regarding the effect of treatment duration on the rate of total AEs and dropouts due to AEs with topical diclofenac versus placebo suggested an increase in AE rates over time, notably for total AEs (ESM 4). However, evidence from an open-label, long-term safety trial (daily application of topical diclofenac sodium 1% gel for 9–12 months) [58], and from a post hoc analysis of studies assessing the long-term tolerability (12 months) of the same treatment in patients with OA, concluded that long-term use of topical diclofenac was safe in these patients [59]. Therefore, the long-term safety of topical NSAIDs deserves further investigation.

Finally, we also investigated whether there were differences in AE rates (total AEs and dropouts due to AEs) with topical diclofenac versus placebo, according to the localization of OA, the type of topical formulation of the treatment and the daily dose, since these factors could influence the absorption and safety of topical treatments. Due to the very limited number of studies available for some subgroups (one to two studies) (ESM 4), we were unable to draw any definitive conclusion regarding these parameters. We would also have liked to compare the safety profile of topical diclofenac products containing dimethyl sulfoxide (DMSO) with that of the others; however, none of the studies using topical diclofenac with DMSO were included in our analyses since they did not have adequate data for analysis (Table 1).

4.1 Strengths

Our meta-analysis included only RCTs of active treatment versus placebo, thus the real effect is not underestimated. Full safety data were obtained from the authors/sponsors of most of the studies, which allows for minimization of the risk of selective reporting bias. We reported on many SOCs, not only ‘total AEs’, ‘serious AEs’ or ‘skin AEs’, as in many of the previous meta-analyses. We avoided double counting of AEs. For each SOC, we considered the number of patients who experienced any related AE at least once, and, for total AEs (any AEs), we considered the number of patients who experienced any AE at least once during the study.

4.2 Limitations

Many of the identified studies that met the inclusion criteria did not provide AE data suitable for inclusion in the meta-analysis, and their authors/sponsors did not provide us with the full safety data. For some other studies included in our analyses, mainly regarding topical ketoprofen and ibuprofen, only published data were available, which limited our conclusions regarding the safety profiles of these compounds.

There is a unit-of-analysis issue in this meta-analysis, except for the individual meta-analyses on topical diclofenac and topical ibuprofen. A unit-of-analysis problem arises when, in studies with multiple arms, the same group of participants is included twice in the same meta-analysis (for example, when ‘dose 1 vs. placebo’ and ‘dose 2 vs. placebo’ are both included in the same meta-analysis, with the same original number of placebo patients in both comparisons) [14]. The Cochrane handbook proposes various approaches to include multiple groups from a single study in the same meta-analysis. For the current meta-analysis, one of these proposed methods was suitable, consisting of splitting the ‘shared’ group into two or more groups with a smaller sample size, and including two or more comparisons. However, we decided not to apply this method as we found that it only marginally and not significantly altered our results and did not modify our conclusions. Additionally, we wanted to keep each comparison (active vs. placebo) with its actual effect estimate and 95% CI, as if we had selected only one pair of interventions.
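The splitting approach described above can be sketched as follows. This is an illustrative Python helper only (the function name and the counts are hypothetical): it divides the events and sample size of a shared placebo arm as evenly as possible across the active arms, so that each placebo patient is counted once.

```python
def split_shared_group(events, total, n_comparisons):
    """Split a shared control arm into roughly equal parts.

    Returns a list of (events, total) pairs whose sums equal the
    original arm, one pair per active-vs-placebo comparison.
    """
    if n_comparisons < 1 or events > total:
        raise ValueError("invalid arm or number of comparisons")
    base_events, extra_events = divmod(events, n_comparisons)
    base_total, extra_total = divmod(total, n_comparisons)
    parts = []
    for i in range(n_comparisons):
        # Distribute any remainder one unit at a time to the first parts
        e = base_events + (1 if i < extra_events else 0)
        n = base_total + (1 if i < extra_total else 0)
        parts.append((e, n))
    return parts

# Hypothetical three-arm trial: dose 1 vs. placebo and dose 2 vs. placebo
placebo_events, placebo_total = 11, 101
for arm, (e, n) in zip(("dose 1", "dose 2"),
                       split_shared_group(placebo_events, placebo_total, 2)):
    print(f"{arm} vs. placebo: placebo arm counted as {e}/{n}")
```

Each comparison then enters the meta-analysis with a smaller placebo group, which widens its CI slightly but avoids giving the shared patients double weight.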

5 Conclusions

This meta-analysis demonstrates that topical NSAIDs may be considered safe for the management of pain in OA patients, with no specific AEs found to be significantly more frequent with topical treatment compared with placebo. In particular, topical NSAIDs are associated with low GI toxicity, and, in this respect, may be preferred over the use of oral NSAIDs. Increases in skin and subcutaneous tissue disorders observed with topical treatment may be product-specific as notably higher rates were observed with diclofenac, although with no statistically significant difference to placebo. Although non-significant, an increase in cardiac disorders was observed across all topical NSAIDs (except with topical ibuprofen), which may require further investigation. Given their previously demonstrated small to moderate efficacy in providing pain relief in OA, our findings confirm that topical NSAIDs are an important component of the treatment armamentarium for OA, and may be considered as safe to use early on in the management algorithm. Nonetheless, the long-term safety profile of topical NSAIDs deserves further investigation. Therefore, the use of topical NSAIDs in OA should be considered, taking into account their risk:benefit profile in comparison with other anti-OA treatments.