CC BY-NC Open access
Research

# Effectiveness of management strategies for uninvestigated dyspepsia: systematic review and network meta-analysis

BMJ 2019; 367 (Published 11 December 2019) Cite this as: BMJ 2019;367:l6483
Authors:

1. Leonardo H Eusebi, senior assistant professor (affiliation 1)
2. Christopher J Black, clinical research fellow (affiliations 2, 3)
3. Colin W Howden, professor of medicine (affiliation 4)
4. Alexander C Ford, professor and honorary consultant gastroenterologist (affiliations 2, 3)

Affiliations:

1. Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
2. Leeds Gastroenterology Institute, St James’s University Hospital, Leeds, UK
3. Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, UK
4. Division of Gastroenterology and Hepatology, Department of Medicine, University of Tennessee College of Medicine, Memphis, TN, USA

Correspondence to: A C Ford, Leeds Gastroenterology Institute, St James’s University Hospital, Leeds LS9 7TF, UK alexf12399{at}yahoo.com (or @alex_ford12399 on Twitter)

• Accepted 7 November 2019

## Abstract

Objective To determine the effectiveness of management strategies for uninvestigated dyspepsia.

Design Systematic review and network meta-analysis.

Data sources Medline, Embase, Embase Classic, the Cochrane Central Register of Controlled Trials, and clinicaltrials.gov from inception to September 2019, with no language restrictions. Conference proceedings between 2001 and 2019.

Eligibility criteria for selecting studies Randomised controlled trials that assessed the effectiveness of management strategies for uninvestigated dyspepsia in adult participants (age ≥18 years). Strategies of interest were prompt endoscopy; test for Helicobacter pylori and perform endoscopy in participants who test positive; test for H pylori and eradication treatment in those who test positive (“test and treat”); empirical acid suppression; or symptom based management. Trials reported dichotomous assessment of symptom status at final follow-up (≥12 months).

Results The review identified 15 eligible randomised controlled trials that comprised 6162 adult participants. Data were pooled using a random effects model. Strategies were ranked according to P score, which is the mean extent of certainty that one management strategy is better than another, averaged over all competing strategies. “Test and treat” ranked first (relative risk of remaining symptomatic 0.89, 95% confidence interval 0.78 to 1.02, P score 0.79) and prompt endoscopy ranked second, but performed similarly (0.90, 0.80 to 1.02, P score 0.71). However, no strategy was significantly less effective than “test and treat.” Participants assigned to “test and treat” were significantly less likely to receive endoscopy (relative risk v prompt endoscopy 0.23, 95% confidence interval 0.17 to 0.31, P score 0.98) than all other strategies, except symptom based management (relative risk v symptom based management 0.60, 0.30 to 1.18). Dissatisfaction with management was significantly lower with prompt endoscopy (P score 0.95) than with “test and treat” (relative risk v “test and treat” 0.67, 0.46 to 0.98), and empirical acid suppression (relative risk v empirical acid suppression 0.58, 0.37 to 0.91). Upper gastrointestinal cancer rates were low in all trials. Results remained stable in sensitivity analyses, with minimal inconsistencies between direct and indirect results. Risk of bias of individual trials was high; blinding was not possible because of the pragmatic trial design.

Conclusions “Test and treat” was ranked first, although it performed similarly to prompt endoscopy and was not superior to any of the other strategies. “Test and treat” led to fewer endoscopies than all other approaches, except symptom based management. However, participants showed a preference for prompt endoscopy as a management strategy for their symptoms.

Systematic review registration PROSPERO registration number CRD42019132528.

## Introduction

Dyspepsia is a common condition that could involve a variety of upper gastrointestinal symptoms, but the main symptom is upper abdominal pain or discomfort.1 At some point in their lives, one in five adults report epigastric pain, early satiety, postprandial distress, and other associated upper gastrointestinal symptoms, such as heartburn, regurgitation, or nausea. Although dyspepsia is not associated with higher mortality risk,23 the condition is chronic in many people4 and follows a fluctuating course.567 Dyspepsia has a substantial impact on patients’ quality of life,8 and is associated with more time off work and lower productivity at work, and greater medical and prescription drug costs each year.910 The financial implications for society as a whole are huge.11

Approximately 40% of people with dyspepsia symptoms will consult a primary care physician.12 The physician has to make a decision about how best to manage the individual patient. Patients with uninvestigated dyspepsia and alarm features, such as dysphagia, weight loss, or anaemia, or those older than a certain age threshold, require urgent endoscopy. However, the management of uninvestigated dyspepsia in the absence of alarm features represents a classic medical decision making problem because several strategies exist. These strategies include prompt endoscopy for all patients; test for Helicobacter pylori and perform endoscopy in those who test positive (“test and scope”); test for H pylori and eradication treatment in those who test positive (“test and treat”); empirical acid suppression for all patients; or symptom based management according to guideline recommendations or the physician’s usual practice.

The effectiveness of these different strategies has been studied in numerous pragmatic randomised controlled trials.1314151617 However, there is equipoise among various strategies and uncertainty as to which strategy is best to use first line. Trial based meta-analyses, and even individual patient data meta-analyses, have been unable to resolve this uncertainty completely. Although prompt endoscopy is expensive, it appears to be superior to empirical acid suppression or symptom based management when comparing the effect on symptoms in some patients,1518 and was superior to “test and treat” in an individual patient data meta-analysis.19 However, it is unlikely to be cost effective,19 and therefore is not recommended as first line treatment in management guidelines for uninvestigated dyspepsia.2021 Another individual patient data meta-analysis of “test and treat” versus empirical acid suppression showed no difference in either costs or effects between the two strategies.22 As a result, guidelines disagree about which approach should be used for the initial management of uninvestigated dyspepsia (table 1).202123

Table 1

Recommendations from previous guidelines on various initial management strategies for uninvestigated dyspepsia

Network meta-analysis might be able to resolve some of this uncertainty because the methods used allow indirect and direct comparisons across different randomised controlled trials, which increases the number of participants’ data available for analysis. Additionally, network meta-analysis allows a credible ranking system to be developed that shows the effectiveness of different management strategies, even in the absence of trials making direct comparisons, which can help to inform clinical decision making. Therefore, we conducted a network meta-analysis of all available randomised controlled trials that have compared five management strategies for uninvestigated dyspepsia.

## Methods

### Search strategy and study selection

We searched Medline (from 1947 to September 2019), Embase, Embase Classic (from 1947 to September 2019), and the Cochrane Central Register of Controlled Trials to identify potential studies. In addition, we searched national guidelines for the management of dyspepsia, clinicaltrials.gov for unpublished trials, and supplementary data for potentially eligible studies (all up to September 2019). Conference proceedings (Digestive Disease Week, American College of Gastroenterology, United European Gastroenterology Week, and the Asian Pacific Digestive Week) between 2001 and 2019 were hand searched to identify studies published only in abstract form. Finally, we performed a recursive search by using the bibliographies of all obtained articles.

Eligible randomised controlled trials examined the effect of various management strategies for uninvestigated dyspepsia (prompt endoscopy, “test and treat,” “test and scope,” empirical acid suppression, or symptom based management) in adult participants (age ≥18 years). The definition of dyspepsia was broad and included any upper gastrointestinal symptoms referable to the gastroduodenum. We only considered randomised controlled trials to be eligible when they examined the effectiveness of one of the strategies of interest and compared it with at least one of the other strategies. Because dyspepsia is a chronic fluctuating condition,4 a minimum follow-up of 12 months was required. We extracted all endpoints at the final point of follow-up to ensure as much homogeneity as possible among individual trial results, and to avoid overestimating the effectiveness of one management strategy relative to another. Studies had to report a dichotomous assessment of symptom status at the final point of follow-up (box 1). The study protocol was published on the PROSPERO international prospective register of systematic reviews (registration number CRD42019132528).

Box 1

### Eligibility criteria

• Randomised controlled trials

• Uninvestigated dyspepsia, before first investigation

• Compared strategy of interest with at least one other strategy: prompt endoscopy, “test and scope,” “test and treat,” empirical acid suppression, or symptom based management

• Minimum follow-up duration of 12 months

• Dichotomous assessment of dyspeptic symptoms at minimum of 12 months

Two investigators (LHE and ACF) conducted the literature search independently from each other. We report the search strategy in the supplementary materials. There were no language restrictions. Two investigators (LHE and ACF) evaluated all abstracts identified by the search for eligibility, again independently from each other. We obtained all potentially relevant papers and evaluated them in more detail by using predesigned forms to assess eligibility independently, according to the predefined criteria. We translated foreign language papers if required. Disagreements between investigators were resolved by discussion.

### Outcome assessment

We assessed the effectiveness of all five management strategies of uninvestigated dyspepsia by comparing the probability of being symptomatic at the final point of follow-up. Additionally, because individual trials reported several other secondary endpoints, we were able to assess the likelihood of participants receiving endoscopy in each treatment arm, and dissatisfaction with management. Finally, we recorded rates of upper gastrointestinal cancer detection.

### Data extraction

Two investigators (LHE and ACF) independently extracted all data onto a Microsoft Excel spreadsheet (XP professional edition) as dichotomous outcomes (symptomatic or asymptomatic at final point of follow-up). For each included trial, we also extracted the following data, when available: country of origin, setting, duration of follow-up, age range of included participants, proportion of female participants, proportion of participants with H pylori infection, and exact management strategy used. Data were extracted as intention to treat analyses, with dropouts assumed to be treatment failures (that is, symptomatic at final point of follow-up), by using the total number of participants randomised to each treatment arm as the denominator, wherever trial reporting allowed. Given the duration of follow-up in individual trials, we also performed a sensitivity analysis by using a per protocol analysis that included all participants with reported evaluable data at the final point of follow-up.

### Quality assessment and risk of bias

This assessment was performed at the study level by two investigators (LHE and ACF) independently by using the Cochrane risk of bias tool.24 Disagreements were resolved by discussion. We recorded the methods used to generate the randomisation schedule and conceal treatment allocation. We also noted whether blinding was implemented for participants, personnel, and outcomes assessment, whether there was evidence of incomplete outcomes data, and whether there was evidence of selective reporting of outcomes.

### Data synthesis and statistical analysis

We performed a network meta-analysis by using the frequentist model with the statistical package “netmeta” (version 0.9-0, https://cran.r-project.org/web/packages/netmeta/index.html) in R (version 3.4.2). Firstly, we performed a pairwise meta-analysis of the raw data (supplementary figs 1-3) to convert them from arm based format to contrast based format, and to generate the treatment effect and standard error of the treatment effect for each pairwise treatment comparison. Subsequently, we used these data to conduct a network meta-analysis by using netmeta, which assumes a common τ² for all pairwise comparisons. The estimate of τ² is based on the generalised DerSimonian-Laird method.25 Uncertainty is not accounted for fully in this model because the distribution of parameters such as the between study variance is not assumed. In multiarm studies, all pairwise comparisons are considered, not only those with a common comparator, but are downweighted.25 We reported the network meta-analysis according to the PRISMA (preferred reporting items for systematic reviews and meta-analyses) extension statement for network meta-analyses.26 Network meta-analysis results usually give a more precise estimate compared with results from standard, pairwise analyses,2728 and can rank management strategies to inform clinical decisions.29
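The step from arm level counts to contrast level effects, and the estimation of τ², can be illustrated with a minimal Python sketch. This is not the authors' netmeta code: the trial counts are invented, the function names are hypothetical, and the code shows the classic DerSimonian-Laird estimator for a single pairwise comparison, whereas netmeta applies a generalised version across the whole network.

```python
import math

def log_rr_and_se(events_a, total_a, events_b, total_b):
    """Contrast level effect: log relative risk (A v B) and its standard error."""
    log_rr = math.log((events_a / total_a) / (events_b / total_b))
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    return log_rr, se

def dersimonian_laird(effects, ses):
    """Return (tau2, pooled effect, pooled SE) under a random effects model."""
    w = [1 / s ** 2 for s in ses]                      # inverse variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                      # truncated at zero
    w_re = [1 / (s ** 2 + tau2) for s in ses]          # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return tau2, pooled, math.sqrt(1 / sum(w_re))

# Hypothetical arm level data: (symptomatic, randomised) in arm A, then in arm B.
trials = [(120, 200, 135, 200), (80, 150, 90, 150), (60, 100, 58, 100)]
contrasts = [log_rr_and_se(*t) for t in trials]
tau2, pooled, pooled_se = dersimonian_laird(
    [c[0] for c in contrasts], [c[1] for c in contrasts]
)
rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
```

The pooled relative risk and its 95% confidence interval are obtained by back-transforming from the log scale, which is the usual convention for ratio measures.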

We examined the symmetry and geometry of the evidence by producing a network plot with node and connection size corresponding to the number of study participants and number of studies, respectively. We produced comparison adjusted funnel plots to explore publication bias or other small study effects for all available comparisons by using Stata (version 14, Stata, College Station, TX). A comparison adjusted funnel plot is a scatterplot of effect size versus precision, measured through the inverse of the standard error. Symmetry around the effect estimate line indicates the absence of publication bias, or small study effects.30 We produced a pooled relative risk with 95% confidence interval to summarise the effectiveness of each management strategy tested by using a random effects model as a conservative estimate. We used the relative risk of remaining symptomatic at the final point of follow-up; when the relative risk is less than one and the 95% confidence interval does not cross one, there is a significant benefit of one management strategy over another. Because direct comparisons were available between most of the management strategies, we were able to perform consistency modelling to check the agreement between direct and indirect evidence.31
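The interpretation rule for a relative risk and its 95% confidence interval can be written as a short check. The sketch below is illustrative only; the function names are hypothetical, and the two estimates are those reported for “test and treat” in the abstract of this paper.

```python
def excludes_one(lower, upper):
    """True when a 95% confidence interval for a relative risk does not cross one."""
    return upper < 1.0 or lower > 1.0

def describe(rr, lower, upper):
    """Classify a relative risk estimate by using its confidence interval."""
    if not excludes_one(lower, upper):
        return "no significant difference"
    return "significantly lower risk" if rr < 1.0 else "significantly higher risk"

# Relative risk of remaining symptomatic with "test and treat" (abstract).
symptoms_test_and_treat = describe(0.89, 0.78, 1.02)

# Relative risk of receiving endoscopy, "test and treat" v prompt endoscopy (abstract).
endoscopy_test_and_treat = describe(0.23, 0.17, 0.31)
```

Under this rule, the symptom estimate is not significant because its interval includes one, whereas the endoscopy estimate is.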

Many meta-analyses use the I² statistic to measure heterogeneity, which ranges between 0% and 100%.32 This statistic is easy to interpret and does not vary with the number of studies. However, the I² value can increase with the number of patients included in the meta-analysis.33 Therefore, we assessed global statistical heterogeneity across all comparisons using the τ² measure from the netmeta statistical package. Estimates of τ² of approximately 0.04, 0.16, and 0.36 are considered to represent a low, moderate, and high degree of heterogeneity, respectively.34 We assessed inconsistency in the network analysis by comparing direct and indirect evidence, when available, by producing a network heat plot.3135 These plots have grey squares, which represent the size of the contribution of the direct estimate in columns, compared with the network estimate in rows.35 The coloured squares around these represent the degree of inconsistency, with red squares indicating “hotspots” of inconsistency. We planned to remove studies that introduced any red “hotspots” and to repeat the analyses to investigate sources of potential inconsistency. We also applied the χ² test of the Q statistic to test for inconsistency, under the assumption of a full design by treatment interaction random effects model.3536 Finally, we tested for local inconsistency by splitting the network estimates into the contribution of direct and indirect evidence, and looking for any statistically significant differences.
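The two heterogeneity measures mentioned above can be sketched in Python. The I² formula is the standard one based on Cochran's Q and its degrees of freedom. The rule of assigning a τ² estimate to its nearest benchmark is an assumption made for illustration, not a method described by the authors, and the function names are hypothetical.

```python
def i_squared(q, df):
    """I² (%) from Cochran's Q statistic and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# Benchmarks for tau² quoted in the text.
BENCHMARKS = {"low": 0.04, "moderate": 0.16, "high": 0.36}

def nearest_benchmark(tau2):
    """Label a tau² estimate by its closest benchmark (illustrative rule only)."""
    return min(BENCHMARKS, key=lambda label: abs(BENCHMARKS[label] - tau2))

# tau² values reported in the Results section of this paper.
labels = {tau2: nearest_benchmark(tau2) for tau2 in (0.007, 0.13, 0.16)}
```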

We ranked management strategies according to their P score, which is between 0 and 1. P scores are based solely on the point estimates and standard errors of the network estimates, and measure the mean extent of certainty that one management strategy is better than another, averaged over all competing strategies.37 Higher scores indicate a greater probability of the strategy being ranked as best,37 but the magnitude of the P score should be considered in addition to the rank. Because the mean P score is always 0.5, individual strategies that cluster around this score are likely to be of similar effectiveness. However, when interpreting the results, it is also important to take into account the relative risk and corresponding 95% confidence interval for each comparison, rather than relying on rankings alone.38 In our primary analysis, we pooled data for the risk of being symptomatic at the final point of follow-up in each study for all included randomised controlled trials by using an intention to treat analysis. We also performed a per protocol analysis, and conducted analyses of the likelihood of receiving endoscopy, dissatisfaction with management among participants, and rates of upper gastrointestinal cancer.
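A P score can be computed from the network estimates and their standard errors alone. The following sketch assumes the usual definition: for each strategy, average over all competitors the one sided normal probability that the strategy is better. All inputs are hypothetical, including the common standard error, and a lower relative risk is treated as better because the outcome is remaining symptomatic.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_scores(names, log_rr, se):
    """P score per strategy when a lower relative risk is better.

    log_rr[i][j] is the network estimate of the log relative risk of
    strategy i versus strategy j, and se[i][j] is its standard error.
    """
    scores = {}
    for i, name in enumerate(names):
        # Certainty that strategy i is better than each competitor j.
        certainty = [
            normal_cdf(-log_rr[i][j] / se[i][j])
            for j in range(len(names))
            if j != i
        ]
        scores[name] = sum(certainty) / len(certainty)
    return scores

# Hypothetical inputs, not taken from the paper.
names = ["A", "B", "C"]
theta = [0.0, -0.10, 0.05]  # log relative risk of each strategy v a reference
log_rr = [[ti - tj for tj in theta] for ti in theta]
se = [[0.08] * 3 for _ in range(3)]  # assumed common standard error

scores = p_scores(names, log_rr, se)
mean_score = sum(scores.values()) / len(scores)
```

Because the certainty that A is better than B and the certainty that B is better than A sum to one, the P scores always average 0.5, as noted in the text.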

We also compared the relative effectiveness of all five management strategies by using the “NetMetaXL” tool running in WinBUGS (version 1.4, Imperial College and MRC, London),39 which uses Bayesian methods. We used a random effects model with vague (uninformative) priors to achieve a conservative estimate of relative efficacy. Strategies were ranked according to their surface under the cumulative ranking curve value, which is comparable to the P score used in the frequentist model of our primary analyses.37 There were no differences in rankings between the frequentist and Bayesian approaches, and therefore, for clarity, we only report the frequentist model in this paper, which is consistent with our approach for reporting previously published network meta-analyses.4041424344

Because one of the studies was a cluster randomised trial,45 with patients assigned to treatment strategy by primary care practice, rather than randomised individually, we used the cluster size and the intra cluster correlation coefficient to reduce the size of the trial to its “effective sample size,” which was 440 participants (233 “test and treat” and 207 empirical acid suppression), before any data pooling was carried out.46 If clustering is ignored, a “unit of analysis error” can occur,47 which will overstate the precision of the estimated effect of the intervention in the study, and also make the study’s weight in the meta-analysis artificially high.
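The usual way to obtain an effective sample size is to divide each arm by the design effect, 1 + (m − 1) × ICC, where m is the mean cluster size and ICC is the intra cluster correlation coefficient. The sketch below applies that formula to hypothetical inputs; the paper reports only the resulting effective sample size, not the cluster size or coefficient used, so the numbers here are assumptions and are not intended to reproduce the 440 participants.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation caused by randomising clusters rather than individuals."""
    return 1 + (mean_cluster_size - 1) * icc

def effective_sample_size(n, mean_cluster_size, icc):
    """Number of individually randomised participants carrying the same information."""
    return n / design_effect(mean_cluster_size, icc)

# Hypothetical inputs, not taken from the cluster randomised trial.
arm_sizes = {"test and treat": 300, "empirical acid suppression": 270}
mean_cluster_size = 11
icc = 0.03

effective = {
    arm: round(effective_sample_size(n, mean_cluster_size, icc))
    for arm, n in arm_sizes.items()
}
```

Event counts in each arm are normally scaled down by the same factor so that the observed proportions are preserved.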

### Patient and public involvement

This was a network meta-analysis of previously published randomised controlled trials. It was not possible for us to involve patients or the public in defining the research question, the design, or the evaluation and discussion of our work. We will disseminate our findings in lay terms through the national charity for people living with digestive diseases, “Guts UK.”

## Results

The search strategy generated 8781 citations, 59 of which we retrieved for further assessment because they appeared to be relevant (supplementary fig 4). Of these, 44 were excluded for various reasons, which left 15 eligible randomised controlled trials that comprised 6162 participants. Fourteen trials were fully published,1314151617184548495051525354 and data from another trial were available from a previous individual patient data meta-analysis conducted by our group.19 Agreement between investigators for trial eligibility was excellent (κ statistic=0.91). Supplementary table 1 reports risk of bias items for all included trials. Because the trials were all pragmatic, with blinding of participants impossible because of the differences in the strategies used, none was at low risk of bias.

Table 2 presents detailed characteristics of individual randomised controlled trials and the comparisons made. Six randomised controlled trials compared prompt endoscopy with “test and treat”171948495051; three “test and treat” with empirical acid suppression144552; two prompt endoscopy with empirical acid suppression1353; one prompt endoscopy with symptom based management15; one “test and scope” with symptom based management16; one prompt endoscopy with empirical acid suppression or symptom based management54; and one prompt endoscopy with “test and scope,” “test and treat,” or empirical acid suppression.18 Direct evidence was therefore available for nine of the 10 possible comparisons. All trials were of 12 months’ duration, with the exception of two randomised controlled trials in which the final point of follow-up was 18 months.1516
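The count of direct comparisons follows from the trial designs listed above, because a multiarm trial contributes every pair of its arms. A short Python sketch of that enumeration, using only the designs described in this paragraph (variable names are illustrative):

```python
from itertools import combinations

STRATEGIES = [
    "prompt endoscopy",
    "test and scope",
    "test and treat",
    "empirical acid suppression",
    "symptom based management",
]

# (number of trials, arms compared), as described in the text.
TRIAL_DESIGNS = [
    (6, ("prompt endoscopy", "test and treat")),
    (3, ("test and treat", "empirical acid suppression")),
    (2, ("prompt endoscopy", "empirical acid suppression")),
    (1, ("prompt endoscopy", "symptom based management")),
    (1, ("test and scope", "symptom based management")),
    (1, ("prompt endoscopy", "empirical acid suppression", "symptom based management")),
    (1, ("prompt endoscopy", "test and scope", "test and treat", "empirical acid suppression")),
]

possible = {frozenset(pair) for pair in combinations(STRATEGIES, 2)}
direct = {
    frozenset(pair)
    for _, arms in TRIAL_DESIGNS
    for pair in combinations(arms, 2)
}
missing = possible - direct
n_trials = sum(count for count, _ in TRIAL_DESIGNS)
```

The only pair without direct evidence is “test and treat” versus symptom based management, which the network can therefore estimate only indirectly.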

Table 2

Characteristics of randomised controlled trials of management strategies for uninvestigated dyspepsia

### Effectiveness

#### Intention to treat analysis

All 15 randomised controlled trials provided dichotomous data for likelihood of remaining symptomatic at the final point of follow-up.131415161718194548495051525354 In these 15 trials, 1942 participants were randomised to prompt endoscopy, 484 to “test and scope,” 1938 to “test and treat,” 1329 to empirical acid suppression, and 469 to symptom based management. Figure 1 presents the network plot. When data were pooled, there was little observed heterogeneity (τ²=0.007), and no evidence of publication bias or other small study effects (supplementary fig 5). Of the five strategies, “test and treat” was ranked first (relative risk of remaining symptomatic 0.89, 95% confidence interval 0.78 to 1.02, P score 0.79; fig 2). The network heat plot had no red “hotspots” of inconsistency (supplementary fig 6), and there was no evidence of inconsistency under the full design by treatment interaction model after applying the χ² test of the Q statistic (1.91, P=0.93). The netsplit analysis did not identify any significant differences between the direct and indirect treatment effect estimates for any of the treatment comparisons (supplementary table 2). None of the strategies was significantly less effective than “test and treat,” or more effective than each other, on either direct or indirect comparison (fig 3). Prompt endoscopy was ranked second, but performed similarly to “test and treat” (relative risk of remaining symptomatic 0.90, 95% confidence interval 0.80 to 1.02, P score 0.71). This means that the probability of “test and treat” or prompt endoscopy being the most effective strategy when all five management strategies, including symptom based management, were compared with each other was 79% and 71%, respectively. In contrast, the probability of “test and scope,” empirical acid suppression, or symptom based management being the most effective strategy was 57%, 30%, and 12%, respectively.
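The arm sizes and the five ranking values reported in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only figures quoted above; the variable names are illustrative.

```python
# Participants randomised to each strategy in the intention to treat analysis.
itt_arms = {
    "prompt endoscopy": 1942,
    "test and scope": 484,
    "test and treat": 1938,
    "empirical acid suppression": 1329,
    "symptom based management": 469,
}

# Ranking values reported for each strategy (0.79 and 0.71 are given as P scores).
ranking_values = {
    "test and treat": 0.79,
    "prompt endoscopy": 0.71,
    "test and scope": 0.57,
    "empirical acid suppression": 0.30,
    "symptom based management": 0.12,
}

total_participants = sum(itt_arms.values())
mean_value = sum(ranking_values.values()) / len(ranking_values)
ranking = sorted(ranking_values, key=ranking_values.get, reverse=True)
```

The arm sizes sum to the 6162 participants reported for the review, and the five values average approximately 0.5, as expected for a set of P scores.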

Fig 1

Network plot for likelihood of remaining symptomatic according to intention to treat analysis at final point of follow-up

Fig 2

Forest plot for likelihood of remaining symptomatic according to intention to treat analysis at final point of follow-up. P score is probability of each treatment being ranked as best in network analysis. Higher score indicates greater probability of being ranked first

Fig 3

Summary treatment effects from network meta-analysis for likelihood of remaining symptomatic according to intention to treat analysis at final point of follow-up. Comparisons, column versus row, should be read from left to right, and are ordered relative to overall effectiveness. Treatment in top left position is ranked as best after network meta-analysis of direct and indirect effects. Direct comparisons are provided above strategy labels, and indirect comparisons are below. Values are relative risk (95% confidence interval). NA=not applicable, no randomised controlled trials making direct comparisons

Two of the trials of “test and treat” versus prompt endoscopy recruited only participants with H pylori infection,1948 and one of the trials of prompt endoscopy versus empirical acid suppression used ranitidine,13 rather than a proton pump inhibitor. Therefore, we excluded these three trials in a retrospective sensitivity analysis so as not to overestimate the effectiveness of “test and treat,” or underestimate the effectiveness of empirical acid suppression. When data were pooled, there was little observed heterogeneity (τ²=0.007). “Test and treat” was ranked first (relative risk 0.89, 95% confidence interval 0.77 to 1.02, P score 0.82) and prompt endoscopy second (0.90, 0.79 to 1.02, P score 0.70). When we excluded the two trials of 18 months’ duration, the overall results were not affected1516; “test and treat” was still ranked first and prompt endoscopy second.

#### Per protocol analysis

All 15 randomised controlled trials provided dichotomous data for likelihood of remaining symptomatic at the final point of follow-up according to a per protocol analysis.131415161718194548495051525354 In this analysis, there were data on 5154 participants, of whom 1667 were randomised to prompt endoscopy, 326 to “test and scope,” 1689 to “test and treat,” 1150 to empirical acid suppression, and 322 to symptom based management. Supplementary fig 7 presents the network plot. Again, when data were pooled, there was little observed heterogeneity (τ²=0.009), and no evidence of publication bias or other small study effects (supplementary fig 8). There were no red “hotspots” of inconsistency on the network heat plot (supplementary fig 9), with no evidence of inconsistency under the full design by treatment interaction model after applying the χ² test of the Q statistic (1.28, P=0.97). The netsplit analysis did not identify any significant differences between the direct and indirect treatment effect estimates for any of the treatment comparisons (supplementary table 3). Once again, “test and treat” was ranked first (relative risk 0.87, 95% confidence interval 0.74 to 1.03, P score 0.79; supplementary fig 10), but was not superior to any of the other four strategies, and none of the strategies was more effective than any of the others on direct or indirect comparison (supplementary table 4). The P scores for prompt endoscopy, “test and scope,” empirical acid suppression, or symptom based management were 0.69, 0.63, 0.26, and 0.13, respectively. As before, when we excluded the three aforementioned trials in a retrospective sensitivity analysis,131948 there was little observed heterogeneity (τ²=0.007), and “test and treat” was ranked first (relative risk 0.87, 95% confidence interval 0.73 to 1.02, P score 0.81), with prompt endoscopy second (0.88, 0.76 to 1.03, P score 0.68). Again, excluding the two trials of 18 months’ duration did not affect the overall results1516; “test and treat” was still ranked first and prompt endoscopy second.

### Rates of endoscopy

Fourteen randomised controlled trials that comprised 5897 participants provided data on the number of participants in each arm undergoing endoscopy.1314151617181945484950515253 Supplementary fig 11 presents the network plot. When data were pooled, there was a moderate level of statistical heterogeneity (τ²=0.16), but no evidence of publication bias or other small study effects (supplementary fig 12). The network heat plot had no red “hotspots” of inconsistency (supplementary fig 13), and there was no evidence of inconsistency under the full design by treatment interaction model after we applied the χ² test of the Q statistic (2.56, P=0.63). The netsplit analysis did not identify any significant differences between the direct and indirect treatment effect estimates for any of the treatment comparisons (supplementary table 5). Of the five strategies, “test and treat” was ranked first (relative risk of receiving endoscopy 0.23, 95% confidence interval 0.17 to 0.31, P score 0.98; fig 4). When we performed an indirect comparison, we found participants allocated to “test and treat” were significantly less likely to receive endoscopy than those in any of the other management strategies, except symptom based management. Participants assigned to all four other strategies were significantly less likely to receive endoscopy than those randomised to prompt endoscopy (fig 5). When we performed a direct comparison, we found participants randomised to “test and treat” or empirical acid suppression were significantly less likely to receive endoscopy than those assigned to prompt endoscopy.

Fig 4

Forest plot for likelihood of receiving endoscopy. P score is probability of each treatment being ranked as best in network analysis. Higher score indicates greater probability of being ranked first

Fig 5

Summary treatment effects from network meta-analysis for likelihood of receiving endoscopy. Comparisons, column versus row, should be read from left to right, and are ordered relative to overall effectiveness. Treatment in top left position is ranked as best after network meta-analysis of direct and indirect effects. Orange boxes indicate significant differences. Direct comparisons are provided above strategy labels, and indirect comparisons are below. Values are relative risk (95% confidence interval). NA=not applicable, no randomised controlled trials making direct comparisons

### Participant dissatisfaction with management

Only six trials that comprised 2818 participants reported rates of satisfaction with management according to strategy131845495051; no randomised controlled trials reported on satisfaction with symptom based management. Supplementary fig 14 presents the network plot. The term “risk” has negative connotations; therefore, in this analysis, we chose to extract data as rates of dissatisfaction with management, such that the best performing strategy has the lowest risk of dissatisfaction (rather than the highest risk of being satisfied). When data were pooled, there was a moderate level of statistical heterogeneity (τ²=0.13), and too few randomised controlled trials to assess for evidence of publication bias, or other small study effects. Of the four strategies, prompt endoscopy was ranked first (relative risk of being dissatisfied 0.58, 95% confidence interval 0.37 to 0.91, P score 0.95; supplementary fig 15). Participants allocated to prompt endoscopy were significantly less likely to be dissatisfied with management compared with participants randomised to “test and treat” or empirical acid suppression, on indirect comparison, and with empirical acid suppression on direct comparison (supplementary table 6). The netsplit analysis did not identify any significant differences between the direct and indirect treatment effect estimates for any of the treatment comparisons (supplementary table 7). However, the network heat plot revealed a red “hotspot” of potential inconsistency (supplementary fig 16), with evidence of inconsistency under the full design by treatment interaction model after applying the χ² test of the Q statistic (26.07, P<0.001). This was driven by one early study of prompt endoscopy versus empirical acid suppression,13 which showed substantially higher rates of dissatisfaction with empirical acid suppression. Rerunning the network without this trial resolved the inconsistency (Q statistic 1.73, P=0.42) and reduced heterogeneity (τ²=0.002), but did not change the ranking of prompt endoscopy (relative risk 0.85, 95% confidence interval 0.70 to 1.02, P score 0.97).

### Rates of upper gastrointestinal cancer detection

Eleven randomised controlled trials reported upper gastrointestinal cancer detection rates among 5028 participants.13 14 15 16 17 18 48 49 50 51 52 In total, 20 (0.40%) cancers were detected: 11 (0.67%) among 1644 participants undergoing prompt endoscopy; four (0.24%) among 1672 participants allocated to “test and treat”; two (0.41%) among 484 participants assigned to “test and scope”; two (0.24%) among 849 participants randomised to empirical acid suppression; and one (0.26%) among 379 participants given symptom based management. Cancer location and type were provided for 16 participants; gastric adenocarcinoma occurred in 12 people, gastric lymphoma in two people, and oesophageal carcinoma in two people. Both participants with oesophageal carcinoma had previously reported dysphagia, and arguably, were recruited inappropriately to the relevant trial.18
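
As a simple arithmetic check, the crude detection rates quoted above can be recomputed from the reported counts. The following sketch uses only the numbers given in the text. The variable names are illustrative, and the rates are unadjusted proportions rather than pooled estimates from a meta-analysis model.

```python
# (cancers detected, participants) per strategy, as reported in the text
counts = {
    "prompt endoscopy": (11, 1644),
    "test and treat": (4, 1672),
    "test and scope": (2, 484),
    "empirical acid suppression": (2, 849),
    "symptom based management": (1, 379),
}

total_cancers = sum(cancers for cancers, _ in counts.values())
total_participants = sum(n for _, n in counts.values())

# Crude detection rate per strategy, expressed as a percentage
rates = {name: 100 * cancers / n for name, (cancers, n) in counts.items()}
overall_rate = 100 * total_cancers / total_participants

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"overall: {total_cancers}/{total_participants} = {overall_rate:.2f}%")
```

The per strategy counts sum to the 20 cancers and 5028 participants stated in the text.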

## Discussion

### Principal findings

This systematic review and network meta-analysis has shown that “test and treat” might be the most effective first line strategy for the management of uninvestigated dyspepsia in primary care, in terms of effect on symptoms, although prompt endoscopy performed similarly in this respect. However, no strategy was significantly less effective than “test and treat,” or more effective than another strategy, on either direct or indirect comparison. “Test and treat” was significantly more likely to reduce use of endoscopy compared with all strategies, other than symptom based management. Despite this, rates of dissatisfaction with management were significantly lower among participants allocated to prompt endoscopy compared with “test and treat” or empirical acid suppression on indirect comparison. Finally, detection rates of upper gastrointestinal cancer in these trials were extremely low.

### Strengths and limitations of the study

The network meta-analysis allowed us to make indirect comparisons among over 6000 participants in 15 randomised controlled trials. The trials themselves were pragmatic and recruited participants from primary care, or on first referral to secondary care, which meant the results of our study are likely to be generalisable to other patients who present with dyspepsia in this setting. We used the most stringent endpoint for effect on symptoms in all trials, and only classified participants who were completely asymptomatic as having reached the endpoint of interest. We used an intention to treat analysis, with all trial dropouts assumed to be symptomatic. Because of the length of follow-up in individual randomised controlled trials, we performed a sensitivity analysis by using a per protocol analysis. We also excluded two randomised controlled trials that examined effectiveness of prompt endoscopy versus “test and treat” only in participants who were H pylori positive19 48 and one that used empirical ranitidine as an acid suppressant13 in a separate retrospective sensitivity analysis. Our findings remained unchanged in this analysis. Finally, we produced network heat plots and identified inconsistency in one of our analyses, but this was resolved when we excluded one study that reported a large difference in dissatisfaction rates between prompt endoscopy and empirical acid suppression.

Our study had several limitations. We did not have access to individual patient data for the network meta-analysis, which meant that we were unable to study the effects of the various management strategies on other dyspepsia related resource use, or total costs of managing dyspepsia. There were also differences between individual trials in the population studied, study setting, the way the intervention was applied, duration of follow-up, and endpoint used to define symptom response; therefore, it might not be appropriate to combine data from these trials in a meta-analysis. However, we only classed participants who were entirely asymptomatic as having reached the endpoint of interest, and we performed sensitivity analyses based on some of these study characteristics; our results were unchanged. These differences between trials could explain the moderate amounts of heterogeneity in some of our analyses.

Additionally, only six randomised controlled trials contributed data to the analysis of dissatisfaction with management,13 18 45 49 50 51 and four of these compared prompt endoscopy with “test and treat,”18 49 50 51 which meant that the findings from this analysis might not be as robust for the other strategies. We were not able to examine the effect of these management strategies on quality of life in the network meta-analysis because the included studies used a variety of instruments, disease specific and generic; they also reported these analyses in a multitude of ways, which precluded pooling of the data. Because most studies were conducted in western populations, with only one randomised controlled trial conducted in Malaysia,51 our findings cannot be extrapolated to the Far East where the incidence of gastric cancer is higher, and prompt endoscopy might therefore be more appropriate. Finally, all of the included randomised controlled trials were at high risk of bias because of their pragmatic design, which meant that blinding of participants was not possible.

### Comparison with other studies

We believe this network meta-analysis is an advance over previous meta-analyses in this field for several reasons. Our meta-analysis produces a credible ranking system for each strategy, rather than relying on summary relative risks of comparative effectiveness of one strategy over another. This approach is clinically useful given that in previous individual patient data meta-analyses there was only a small, although statistically significant, difference in the relative risk of remaining symptomatic, which favoured prompt endoscopy over “test and treat.”19 Additionally, no difference was found between “test and treat” and empirical acid suppression.22 Furthermore, more randomised controlled trials have been included in our analysis than in the aforementioned individual patient data meta-analyses,19 22 and a previous trial based meta-analysis.55 Our meta-analysis was also able to make indirect and direct comparisons by using trial data, which led to a change in the strategy that could be the most effective for first line management of uninvestigated dyspepsia. “Test and treat” was ranked above prompt endoscopy, although P scores were similar. Additionally, the equipoise between “test and treat” and empirical acid suppression was no longer seen, with a P score for “test and treat” of 0.79 compared with 0.30 for empirical acid suppression. Although National Institute for Health and Care Excellence guidance states that either “test and treat” or empirical acid suppression can be used first line for uninvestigated dyspepsia,21 the “test and treat” strategy was recommended first line for patients younger than 60 by the more recent American College of Gastroenterology and Canadian Association of Gastroenterology joint practice guideline on dyspepsia.20 The results of our network meta-analysis support the latter recommendation. Finally, a previous trial based meta-analysis combined participants in the empirical acid suppression and symptom based management arms.55 Because those receiving symptom based management did not receive standardised proton pump inhibitor dosing in these randomised controlled trials, we believe this is technically incorrect, and might have led to an underestimate of the effectiveness of empirical acid suppression.

Dyspepsia, however defined, is a frequent reason for consultation with primary care providers and gastroenterologists. Dyspeptic symptoms can cause substantial anxiety for patients who might fear that they have a serious underlying condition to account for their symptoms. This anxiety is despite the fact that upper gastrointestinal malignancy is identified at endoscopy in less than 1% of patients.56 Of note, the rate of upper gastrointestinal malignancy in this meta-analysis was only around 0.4%, which suggests that, among 1000 patients presenting to a primary care provider with uninvestigated dyspepsia, 996 would be cancer free on endoscopy. This makes the strategy of prompt endoscopy for the evaluation of uninvestigated dyspepsia highly questionable, at least in the absence of potentially important alarm features that, arguably, would merit endoscopy in any case. A previous study in the USA estimated that the cost of detecting one upper gastrointestinal cancer in patients aged 50 or older with dyspepsia without alarm features in primary care was over $80 000 (£62 000; €72 300).57 Prompt endoscopy might be justifiable on the basis of providing reassurance to patients about the absence of a sinister underlying cause of their symptoms, but studies suggest this effect is relatively short lived.58 Although “test and treat” was ranked first in terms of effectiveness, and significantly limited the use of endoscopy, we found lower levels of dissatisfaction with management among participants randomised to prompt endoscopy. Given the lack of blinding in all of the included studies, this could relate to patients’ previous expectations of management, which were not met if assigned to a non-invasive management strategy; patients might prefer endoscopy for what they consider to be a more thorough and appropriate evaluation of their symptoms. Furthermore, the impact of negative findings at endoscopy probably also influences patients’ satisfaction with this approach. 
We must weigh these potential benefits of endoscopy against its substantial costs and the risk of adverse events, albeit small. Because endoscopy is the principal driver of overall costs in the management of dyspepsia, it cannot be supported based on cost effectiveness. This consideration is underlined by a previous individual patient data meta-analysis of prompt endoscopy versus “test and treat,”19 which estimated that prompt endoscopy was only cost effective if the willingness to pay per patient cured of their dyspepsia was $180 000. Endoscopy can identify H pylori infection, although it is not essential for that purpose because there are widely available, cheaper, non-invasive tests for active infection with excellent performance characteristics. The American Gastroenterological Association has nonetheless recommended routine collection of gastric biopsies at endoscopy when performed in patients with dyspepsia to document the presence or absence of H pylori infection59; however, treatment of the infection leads to sustained symptom improvement in only a minority of patients.60

### Conclusions and policy implications

The strategy of “test and treat” has proved popular in many countries. H pylori infection is usually asymptomatic, but it can lead to dyspepsia even in the absence of peptic ulcer disease.61 The non-invasive detection of H pylori infection with a reliable test, such as the urea breath test or faecal antigen test, should lead automatically to treatment for the infection. The “test and treat” strategy would detect most patients with dyspepsia and underlying peptic ulcer disease, although it would not identify them individually; they would certainly benefit from eradication of the infection. However, most patients with H pylori infection would not have peptic ulcer disease and many would fulfil diagnostic criteria for functional dyspepsia. Eradication of the infection would produce sustained improvement in only a minority of these patients, but it would remove a potentially serious cause of disease in the remainder. Population screening and treatment for H pylori appears to reduce future dyspepsia related costs in the West,62 and also reduces incidence of gastric cancer in high risk populations,63 so there are probably other benefits from more widespread use of “test and treat.” That said, many of the trials included in the network meta-analysis were conducted more than 15 years ago, and the prevalence of H pylori infection might have declined in Western populations during this time. A simulation model of the cost effectiveness of management strategies for uninvestigated dyspepsia in the USA suggested that “test and treat” was unlikely to remain cost effective below a prevalence of infection of 20%, although the confidence intervals were wide.64

Symptom based management was ranked the lowest of all the strategies when considering effectiveness. Management of dyspepsia with drug treatments is unsatisfactory and often lacks an adequate evidence base because the underlying causes of symptoms are poorly understood. This makes targeted drug interventions empirical at best. Patients with dyspepsia might be treated with a variety of drugs, depending on local availability and approval, physicians’ personal experience, and to some extent, on assessment of an individual patient’s symptom profile. Recent guidelines recommend the use of empirical proton pump inhibitor treatment for patients younger than 60 in whom “test and treat” is unsuccessful and in those without H pylori infection.20 Although acid suppression with a proton pump inhibitor might be effective for some patients,65 its long term efficacy is unclear and the optimal duration of treatment is not defined. In patients whose dyspeptic symptoms do not respond to a proton pump inhibitor, there is no value in continuing with this treatment. Furthermore, recent concerns about the long term safety of these drugs, although often based on weak evidence,66 could have altered perceptions of their appropriateness for the long term management of dyspepsia. Additional drug interventions that could be used for the management of dyspepsia include drugs with presumed prokinetic effects67 and neuromodulators, including tricyclic antidepressants.68 The role of prokinetic agents is limited because of their lack of availability in many countries. Neuromodulators have an important role in the management of dyspepsia and other functional gastrointestinal disorders.68 69 70 However, the decision to use any of these drugs, and the order in which they might be tried, is based on choices made by individual physicians and patients, and to some extent is influenced by the factors listed here. Therefore, it is perhaps unsurprising that this largely empirical strategy was the least effective.

In summary, dyspepsia continues to be a highly prevalent condition that can influence quality of life profoundly and accounts for major healthcare expenditures. Many different management strategies have been studied in individual randomised controlled trials. This network meta-analysis provides additional support for the so called “test and treat” approach in management. This strategy, recently recommended in national guidelines,20 was consistently associated with the lowest chance of remaining symptomatic and with the lowest use of endoscopy. Therefore, it is probably of benefit in reducing overall costs, at least in some healthcare delivery models. However, despite the low diagnostic yield of endoscopy in detecting upper gastrointestinal tract malignancy, it might be the strategy most preferred by patients. Management of patients with dyspepsia should continue to be based on best evidence, but should also take into account the nuances of the individual patient within the confines of the healthcare setting.

#### What is already known on this topic

• Dyspepsia is a highly prevalent and costly condition

• Many management approaches have been compared in pragmatic randomised controlled trials, and summarised in individual patient data meta-analyses, but there is equipoise between strategies

• Guidelines disagree about which approach should be used for the initial management of uninvestigated dyspepsia

#### What this study adds

• This network meta-analysis found “test and treat” was ranked first, although it performed similarly to prompt endoscopy and was not superior to any of the other strategies

• “Test and treat” led to fewer endoscopies than all other strategies except symptom based management

• Participants showed a preference for prompt endoscopy as a management strategy for their symptoms

• Wider application of a “test and treat” strategy for dyspepsia at the primary care level, which is recommended in recent national guidelines, should be encouraged

## Footnotes

• Contributors: LHE, CJB, CWH, and ACF conceived and drafted the study. LHE, CJB, CWH, and ACF analysed and interpreted the data. ACF drafted the manuscript. LHE and CJB contributed equally to the manuscript and are joint first authors. All authors have approved the final draft of the manuscript. ACF is guarantor. ACF accepts full responsibility for the work and the conduct of the study, had access to the data, and controlled the decision to publish. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

• Funding: No funding given.

• Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

• Ethical approval: Ethical approval for this evidence synthesis was not required.

• Data sharing: No additional data available.

• The lead author (ACF) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as originally planned (and, if relevant, registered) have been explained.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
