Minimal access surgery compared with medical management for gastro-oesophageal reflux disease: five year follow-up of a randomised controlled trial (REFLUX)
BMJ 2013; 346 doi: https://doi.org/10.1136/bmj.f1908 (Published 18 April 2013) Cite this as: BMJ 2013;346:f1908
All rapid responses
Dear sir
It was with great interest that we read the recent article in the BMJ (Grant et al, BMJ 2013;346:f1908) from the REFLUX trial consortium about the five-year follow-up results of the pragmatic trial on the long-term management of GORD. As representatives of the LOTUS trial group, we would like to raise a number of issues relevant to the scientific evaluation of these different therapeutic strategies in GORD patients. It might be useful to BMJ readers for us to recap briefly the main results of the LOTUS trial, which was a strictly controlled comparison of laparoscopic antireflux surgery and optimized PPI therapy. In summary, we found no clinically relevant difference between the two strategies (Galmiche et al, JAMA 2011;305(19):1969-1977).
Firstly, we would like to focus on trial design. We understand that the pragmatic design of the REFLUX trial aimed to be more representative of the general practice situation than more controlled scientific trials conducted in secondary or tertiary referral centers. However, the pragmatic design introduces some potential uncontrolled bias, which needs to be fully acknowledged before accepting generalization of the reported findings. For example, in the LOTUS study, the GORD patients were PPI responders with chronic, well-documented GORD, whilst in the REFLUX trial they were probably a mix of PPI responders and partial or non-responders. Was antisecretory drug therapy stopped for a run-in period in the REFLUX study and, if so, for how long, before the baseline symptoms were assessed, endoscopic severity was scored and the amount of acid reflux into the oesophagus was measured? This is necessary to confirm that these so-called “typical GORD patients” do in fact need potent antisecretory drugs, as data are accumulating that show many do not require such drugs at all or only require them intermittently. This question can to some extent be addressed using a structured and well-designed long run-in period before randomization.
When offering medical and surgical therapies for GORD, both therapies have to be realistic and ethically defensible alternatives for all candidate patients. The question then arises as to what to do with those GORD patients who were randomized to surgery but never had surgery. A somewhat fundamentalist (but scientifically most valid) analytical approach in the ITT analysis is to score them as failures for that particular strategy. This can be particularly relevant if the patient actually refused to have the allocated therapy (the specific reasons are not relevant for this kind of analysis). Such an analysis would radically change the outcome profile and open the way to a modified conclusion. Since the ultimate goal is to compare treatment strategies, if one therapeutic strategy cannot, for a variety of reasons, be accepted even among those initially opting for it, this is of critical importance to the analysis and needs to be comprehensively discussed.
In the paper you have presented very rough estimates concerning the background characteristics and differences between those who initially preferred surgery or medical therapy, and those who ultimately accepted that treatment at randomization. It is possible that there are substantial differences in personalities and psychological characteristics between these different patient groups that may have a major impact on the clinical and quality of life related outcome variables.
Secondly, we would like to focus on details of the treatments received in the LOTUS and REFLUX trials and how treatment outcomes were measured. In the LOTUS trial the issue of standardization was further strengthened by the acceptance of only one type of fundoplication. This was not a preemptive opinion that a total fundoplication is superior to the posterior partial wrap, but rather an attempt to control as many factors as possible in a complex trial. In some countries and institutions, the partial fundoplication is not considered to offer as good and durable reflux control as the Nissen. How was the quality of surgery documented in your trial? Did you also collect information about the exact procedures and outcome of antireflux surgery from the respective participating centres?
Another important point pertaining to your treatment methodology relates to standardization of medical therapy. Did you set up strict rules for dosing and timing of PPIs in order to optimize this therapeutic concept and, if not, how should this be interpreted with regard to its inferiority, as compared to surgery, detected in the REFLUX trial? As was shown in the REFLUX trial (and previously demonstrated in the SOPRAN study by Lundell et al, Clin Gastroenterol Hepatol 2009;7(12):1292-1298), there is a cumulative increase in the number of surgical cases who, over time, require antisecretory drugs to control reflux symptoms. How were these patients scored in the trial (as failures? by quality of life?)? The most scientifically correct way is to score them according to the situation at the time of relapse (symptoms, QoL etc) and then allow these values to be carried forward and incorporated in the 5-year follow-up data.
Concerning assessment of treatment outcomes, quality of life is now well tested and validated, but for postfundoplication complaints, no single instrument is available that has been scientifically validated. Do you really believe that there is no difference between medical and surgical therapy concerning postfundoplication-related problems? Is this lack of difference just a reflection of the fact that blunt instruments have been used and/or data acquisition was suboptimal? Of course any sort of blinding is impossible to apply in trials like these, but one possibility to avoid bias is to involve both surgeons and gastroenterologists in the pre- as well as postoperative assessments (e.g. surgeons following medical patients and vice versa). How did you tackle this problem to avoid biased assessments?
We are indebted to the REFLUX trial group for their considerable achievement in conducting this study and for having published a paper which will be read with great interest by a large clinical audience. Our plea is that the many methodological difficulties and challenges harboured in the design and conduct of trials like these need to be carefully and critically discussed to facilitate a comprehensive and meaningful interpretation of the results.
On behalf of the LOTUS trial steering committee
Competing interests: No competing interests
Re: Minimal access surgery compared with medical management for gastro-oesophageal reflux disease: five year follow-up of a randomised controlled trial (REFLUX)
Dear Editor
Thank you for the opportunity to respond to Professor Lundell and colleagues from the LOTUS trial group. Your readers may wish to refer to a newly published full report of the REFLUX trial five-year follow-up (Grant AM et al Health Technology Assessment 2013; vol 17; issue 22); this includes a detailed discussion of the findings in the context of the other three comparable randomised trials, the largest of which is the LOTUS trial.
The issues raised by Professor Lundell and colleagues reflect design and related differences between the two trials. The LOTUS trial had an ‘explanatory’ design aiming to “estimate the efficacy of laparoscopic anti-reflux surgery and PPI treatment”; entry criteria (‘PPI responders’) were tightly defined, surgical and medical treatments strictly standardised, primary outcome was a clinical judgement of success or failure, and analysis based on those who received their allocated treatment (the 40 participants who were randomly allocated surgery who did not receive it were excluded). In contrast, the REFLUX trial design was pragmatic, aiming to compare two management policies as used in everyday practice, acknowledging that some participants would inevitably not have their allocated treatments. Eligibility was based on current practice, experienced surgeons chose their favoured approach to fundoplication and specialist gastroenterologists optimised medical management with most subsequent care being in general practice, and the primary outcome was a patient-reported disease-specific quality of life measure. Non-design differences were that the REFLUX trial was coordinated by an accredited trials unit, local recruitment was led by gastroenterologist/gastrointestinal surgeon partnerships, rather than gastrointestinal surgeons alone, and the trial was publicly funded rather than funded by a pharmaceutical company.
REFLUX trial analyses were based on the intention to treat principle (mirroring the comparison of the two policies). These are not biased as Lundell and colleagues seem to imply. However, the percentage of those randomly allocated surgery who actually had surgery was lower than would occur in normal practice, as judged by the parallel, non-randomised preference for surgery group, and secondary analyses adjusting for this showed larger differences favouring surgery. Secondary analyses also showed that those allocated medical management who later had surgery had relatively low REFLUX scores (more severe symptoms) at trial entry, whereas those allocated surgery who did not receive it had relatively high scores (less severe symptoms). This at least partially explains why the differences between the randomised groups, though still statistically significant, narrowed over time. To put this another way, analyses closer to an explanatory approach suggest greater symptomatic relief than was observed in REFLUX’s primary intention to treat analyses.
The LOTUS trial and other studies have demonstrated that some patients do experience post-fundoplication problems, such as dysphagia, flatulence and bloating. In the first year of follow-up in the REFLUX trial, 3 (1%) of those who had surgery subsequently had a dilatation of the wrap. This applied to only one patient in the subsequent four years and based on patients’ reports there were no differences between the trial groups in symptoms related to these problems. Possible explanations for the differences between REFLUX and LOTUS in these respects are that around 50% of those who had surgery in the REFLUX randomised comparison had partial fundoplication (compared with a total procedure for all in the LOTUS trial), and details of the actual procedure were left to the discretion of the surgeon, in contrast to the prescribed approach used in LOTUS.
One of our principal concerns about the LOTUS trial is that its primary outcome, ‘treatment failure’, was likely biased: it was not based on a common definition across the trial but was defined differently for each trial group; for example, it was permissible to double the PPI dose in the medical group without the patient being classified as a treatment failure whereas a ‘need’ for regular medication in the surgical group constituted a treatment failure. In contrast to the LOTUS trial, both these situations were treated as parts of the management policies being compared in the REFLUX trial. Using medication after surgery was considered to be a way of enhancing the benefits of surgery rather than equating to surgical failure. As Lundell and colleagues point out, the REFLUX and other studies have shown that using medication after surgery is common. However, this inevitably leads to a high ‘failure rate’ after surgery, according to the definition used in LOTUS. From a pragmatic clinical perspective, it seems anomalous to categorise patients whose symptoms are suppressed by medication and who feel well as ‘treatment failures’.
The principal outcomes in the REFLUX trial were all validated disease-specific (REFLUX score – see Macran S et al Qual Life Res 2007; 16: 331-43) or generic (SF-36 and EQ-5D) patient reported health-related quality of life measures that could be applied uniformly across the study groups and time points to provide a common currency. All were consistent in suggesting greater benefit from the surgery policy. The LOTUS trial did include a disease-specific quality of life measure (the QOLRAD) as a secondary measure. The five-year results are only reported in an e-table without formal statistical analysis (Galmiche et al JAMA 2011; 305(19): 1969-1977 eTable 2). Our reading is that at all time points (1, 3 and 5 years) and for all four dimensions (vitality, food and drink, sleep, and physical/social) scores are statistically significantly higher (indicating greater improvement) in the surgical group. These results are not commented on in the LOTUS report. Yet, if we have interpreted the table correctly, they indicate that analysing LOTUS from more of a patient-centred pragmatic perspective gives results favouring surgery similar to the REFLUX trial.
Yours etc.
Adrian Grant (a.grant@abdn.ac.uk) on behalf of the REFLUX trial group.
Competing interests: No competing interests