Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management)
BMJ 2013; 347 doi: https://doi.org/10.1136/bmj.f5806 (Published 10 October 2013) Cite this as: BMJ 2013;347:f5806
- Paul Little, general practitioner and professor of primary care research1,
- F D Richard Hobbs, professor23,
- Michael Moore, general practitioner and reader in primary care1,
- David Mant, emeritus professor2,
- Ian Williamson, general practitioner and senior lecturer in primary care1,
- Cliodna McNulty, consultant microbiologist4,
- Ying Edith Cheng, study statistician1,
- Geraldine Leydon, social scientist, principal research fellow1,
- Richard McManus, general practitioner and professor of primary care23,
- Joanne Kelly, senior trial manager1,
- Jane Barnett, senior trial manager1,
- Paul Glasziou, professor of evidence based medicine5,
- Mark Mullee, lead study statistician, director research design service1
- on behalf of the PRISM investigators
- 1University of Southampton Medical School, Aldermoore Health Centre, Southampton SO16 5ST, UK
- 2Department of Primary Care Health Services, University of Oxford, Oxford, UK
- 3University of Birmingham, Birmingham, UK
- 4Health Protection Agency-Primary Care Unit, Microbiology Department, Gloucestershire Royal Hospital, Gloucester GL1 3NN, UK
- 5Faculty of Health Science and Medicine, Bond University, Gold Coast, QLD 4229, Australia
- Correspondence to: P Little
- Accepted 30 August 2013
Objective To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing.
Design Open adaptive pragmatic parallel group randomised controlled trial.
Setting Primary care in United Kingdom.
Patients Patients aged ≥3 with acute sore throat.
Intervention An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631). The features of score 2 (acronym FeverPAIN) were: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza.
Outcomes Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics.
Results For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (−0.33, 95% confidence interval −0.64 to −0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (−0.30, −0.61 to −0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the antigen test group (58/164) was 27% lower (0.73, 0.52 to 0.98; P=0.03). There were no significant differences in complications or reconsultations.
Conclusion Targeted use of antibiotics for acute sore throat with a clinical score improves reported symptoms and reduces antibiotic use. Antigen tests used according to a clinical score provide similar benefits but with no clear advantages over a clinical score alone.
Trial registration ISRCTN32027234
Most patients presenting with acute sore throat still receive antibiotics,1 despite a Cochrane review documenting only modest symptomatic benefit.2 The review also documents that, although antibiotics probably prevent complications, such complications are rare.2 This is supported by recent ecological data3 and routine datasets,1 4 which confirm that complications are not common in routine practice.
Sore throat is one of the respiratory infections for which there are several reasonable diagnostic strategies for targeting antibiotics: rapid streptococcal antigen detection tests (RADTs) are one of the commonest near patient tests in clinical use internationally, and clinical scores to predict streptococcal infection are also widely advocated and used either alone or in combination with the antigen test.5 6 7 8 Use of clinical scores such as the Centor criteria (which were designed to predict the presence of Lancefield group A β-haemolytic streptococci) or antigen tests has the potential to better target antibiotics, prevent progression of the illness and complications, improve symptom control, and reduce overall antibiotic use compared with empirical management strategies such as delayed prescribing or no offer of antibiotics.9 There is, however, a paucity of evidence for clinical scores for most of these outcomes: one Canadian trial suggested that rapid antigen tests but not the Centor criteria modified antibiotic prescribing, but the trial was small and did not report on important patient outcomes such as symptom control or progression of illness.10 Further evidence is needed to confirm whether the use of rapid antigen tests or clinical scores can modify antibiotic use and patient outcomes.
We previously performed in vitro and diagnostic phases of this project to provide evidence for choosing a valid and widely available rapid antigen test.11 We showed that Lancefield groups C and G streptococci were presenting in a similar manner to group A streptococci and developed a clinical score to predict the presence of Lancefield Group A, C, and G streptococci.
We compared three strategies for limiting or targeting antibiotic use in patients with sore throat: delayed antibiotic prescribing, the use of a clinical score designed to identify streptococcal infection, and the targeted use of rapid antigen tests according to the clinical score. The trial was also adapted after agreement from the funders and ethics committee. At the start of the trial we used score 1 (n=1129). During the trial, however, when a more consistent score became available based on separate diagnostic studies, we used the second score (score 2 (acronym FeverPAIN), n=631). We have presented the results for score 2 here, with the results for score 1 in appendix 1. We did not use analysis of the results for score 1 in making the decision to adapt the trial.
Rationale for changing the clinical scores during the trial: separate diagnostic studies
We used two diagnostic studies in patients not involved in the trial (combined n=1107) to develop the clinical scores to predict streptococcal infection with Lancefield C, G, and A groups. The initial plan was to follow a traditional “sequential” approach of developing a score in one cohort, then validating it in a second cohort to get round the problem of overfitting. The first part of the trial used a clinical score developed from the first diagnostic study (score 1; area under the receiver operating characteristic curve (AUC) of 0.76). The second diagnostic study, however, showed considerably reduced discrimination for score 1 (AUC 0.65) because of inconsistent performance of the variables making up score 1. We therefore used a modified approach to generate a second score (score 2; acronym FeverPAIN), using data from both diagnostic studies and corrected for the problem of overfitting by using bootstrapping techniques. Variables were included in the second score only if they were significant in univariate analysis of both diagnostic studies and in the multivariate analysis of at least one of the two diagnostic studies.
Score 2 had moderate discrimination in both cohorts (AUC first cohort 0.74, second 0.71), better than the Centor criteria (0.72 and 0.65, respectively). Unlike the Centor criteria, score 2 (FeverPAIN) performed well in identifying a substantial number of participants at low risk of streptococcal infection. Although the second score had internal validation with bootstrapping of the estimates, ideally further external validation is needed. The features of score 2 were: fever during previous 24 hours; purulence; attend rapidly (within three days); inflamed tonsils; no cough/coryza (acronym FeverPAIN; see appendix 1 for further details).
Health professionals, mainly general practitioners but also triage practice nurses, recruited patients presenting with acute sore throat in general practices in south and central England.
Included patients were people aged ≥3 presenting with acute sore throat (two weeks or less of sore throat) and an abnormal looking throat (that is, erythema and/or pus, as in our previous studies in primary care12). Exclusion criteria were non-infective causes of sore throat (such as aphthous ulceration, candida, drugs) and inability of patient or parent/guardian to consent (such as dementia, uncontrolled psychosis).
Baseline clinical measure
The recruiting health professional completed clinical details at baseline:
Temperature (using TempaDot thermometers)
The presence and severity of baseline symptoms (sore throat, difficulty swallowing, fever during the illness, runny nose, cough, feeling unwell, diarrhoea, vomiting, headache, muscle aches, abdominal pain, sleep disturbance, interference with normal activities) on 4 point Likert scales (none, a slight problem, a moderately bad problem, a bad problem), and the presence of signs (pus, nodes, tender nodes, raised temperature).13 14 15 16
Clinicians were asked to complete non-recruitment logs, but, because of time pressures in acute clinics, there was poor compliance. Clinicians were also asked to document the commonest reasons why patients were not approached and why they declined in an end of study questionnaire.
After the baseline assessment, patients were individually randomised with a web based computer randomisation service to one of three groups (see below). Randomisation used permuted block sizes of 3, 6, 9, and 12, which were also randomly chosen. Originally the protocol specified stratification by clinician’s belief in the likelihood of bacterial infection, but after discussion with the funder this was omitted.
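The randomisation scheme described above can be illustrated with a minimal sketch. The trial used a web based service whose implementation is not described here, so everything below is an assumption made for illustration: the arm labels, the function name `allocation_sequence`, the `seed` parameter, and equal (1:1:1) allocation within each block.

```python
import random

# Hypothetical arm labels for the three trial groups.
ARMS = ("delayed", "clinical_score", "antigen_test")
# Block sizes named in the text; each is a multiple of the number of arms.
BLOCK_SIZES = (3, 6, 9, 12)


def allocation_sequence(n_patients, seed=None):
    """Return a list of arm labels built from randomly sized permuted blocks."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_patients:
        size = rng.choice(BLOCK_SIZES)             # block size chosen at random
        block = list(ARMS) * (size // len(ARMS))   # equal numbers per arm (assumed)
        rng.shuffle(block)                         # permute order within the block
        sequence.extend(block)
    # The final block may be cut short, so arms can differ by a few patients.
    return sequence[:n_patients]


if __name__ == "__main__":
    seq = allocation_sequence(12, seed=2013)
    print(seq)
```

Varying the block size at random makes the next allocation harder to predict than with a fixed block size, while still keeping group sizes close to balanced.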
All patients were advised to use regular analgesia (paracetamol or ibuprofen, or both).
Delayed antibiotics (control)
A prescription was prepared and left in reception, with advice to the patient to collect the prescription after three to five days if symptoms were not starting to settle or were getting considerably worse.17 This strategy previously resulted in similar rates of antibiotic use and beliefs compared with no offer of antibiotics and reduced reconsultation more effectively.18 19 It has been incorporated into routine practice in the United Kingdom with no increase in complications of sore throat.20
Clinical score group
The clinical score (FeverPAIN) was applied, and antibiotics were not offered to those with low scores (0/1). Immediate antibiotics were offered for those with high scores (≥4, an estimated 63% streptococci based on the diagnostic studies) and delayed antibiotics for those with intermediate scores (2 or 3, 39% streptococci).
Rapid antigen detection test group
The clinical score was used in all patients randomised to the rapid antigen test group. Those with low clinical scores (0/1) were not offered antibiotics or a rapid antigen test (<20% streptococci), those with a score of 2 (33% streptococci) were offered a delayed prescription, and those with higher scores (≥3, 55% streptococci) underwent a rapid antigen test on surgery premises. After the test, patients with negative results were not offered antibiotics. The IMI test pack RADT was used based on in vitro performance and ease of use.
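The management rules for the two score based groups can be summarised in a short sketch. It assumes one point for each FeverPAIN feature present (giving a score of 0 to 5), which the thresholds above imply but the text does not state outright. The class, function, and label names are hypothetical, and the action after a positive antigen test is an assumption, because the text states only that patients with negative results were not offered antibiotics.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """The five FeverPAIN features recorded at the consultation."""
    fever_past_24h: bool
    purulence: bool
    attends_within_3_days: bool
    inflamed_tonsils: bool
    no_cough_or_coryza: bool


def feverpain_score(p: Presentation) -> int:
    # Assumption: one point per feature present.
    return sum([p.fever_past_24h, p.purulence, p.attends_within_3_days,
                p.inflamed_tonsils, p.no_cough_or_coryza])


def clinical_score_strategy(score: int) -> str:
    """Clinical score group: 0/1 none, 2 or 3 delayed, >=4 immediate."""
    if score <= 1:
        return "no antibiotics"
    if score <= 3:
        return "delayed antibiotics"
    return "immediate antibiotics"


def antigen_test_strategy(score: int, test_positive=None) -> str:
    """Antigen test group: 0/1 none, 2 delayed, >=3 rapid antigen test."""
    if score <= 1:
        return "no antibiotics"
    if score == 2:
        return "delayed antibiotics"
    if test_positive is None:
        return "perform rapid antigen test"
    # Assumption: a positive result leads to an offer of antibiotics.
    return "offer antibiotics" if test_positive else "no antibiotics"


if __name__ == "__main__":
    patient = Presentation(True, True, True, False, False)
    s = feverpain_score(patient)
    print(s, clinical_score_strategy(s), antigen_test_strategy(s))
```

The two strategies differ only for scores of 3 or more, where the antigen test group replaces the delayed or immediate prescription with a test result.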
Patients were blind to the precise details of the groups being tested, but the open design made full blinding impossible. The research team who collected data (by phone or notes review) were blind to group as far as possible, but details of patient management were available in the notes. No changes in planned outcome measures were made after the start of the trial.
Patients completed a symptom diary each night until symptoms resolved or up to 14 nights.12 17 Each symptom was scored (0=no problem to 6=as bad as it could be): sore throat, difficulty swallowing, feeling unwell, fevers, sleep disturbance. Patients took their temperature with a disposable thermometer (TempaDot, 3M, Bracknell) as in previous studies.12 21 If a diary was not received by three weeks, a brief questionnaire was sent to document key outcomes, and then a telephone call if the brief questionnaire was not received.
Primary outcome: symptom severity—This was the mean score of sore throat and difficulty swallowing for the two to four days after the consultation, when patients rate their sore throat at its worst, and is internally reliable (Cronbach’s α=0.92).
Duration of illness—Before analysis the trial management team agreed that illness rated moderately bad or worse19 was more important in decision making for both patients and clinicians than the duration of milder symptoms until complete resolution.
Antibiotic use—Patients reported antibiotic use, which agrees well with the documented collection of delayed prescriptions.3
Side effects—Diarrhoea and skin rash were documented in the diary and from review of the notes.
Medicalising beliefs—Patients’ belief in the importance of seeing the doctor for future episodes was recorded on reliable Likert scales.17
Notes were reviewed to document subsequent episodes of infection, time to return for these episodes, complications, and economic data.12 The available follow-up time varied from one month to two years.
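As an illustration of the primary outcome defined above, the following sketch computes the mean of the sore throat and difficulty swallowing diary scores (each 0 to 6) over days two to four. The diary layout, the key names, and the choice to skip missing days are assumptions made for this example and are not taken from the trial's analysis plan.

```python
from statistics import mean


def primary_outcome(diary):
    """Mean of sore throat and difficulty swallowing scores for days 2 to 4.

    `diary` maps a day number after the consultation to a dict holding the
    two symptom scores (0 = no problem, 6 = as bad as it could be).
    """
    values = []
    for day in (2, 3, 4):
        entry = diary.get(day)
        if entry is None:
            continue  # assumption: days without a diary entry are skipped
        values.extend([entry["sore_throat"], entry["difficulty_swallowing"]])
    return mean(values) if values else None


if __name__ == "__main__":
    example = {
        2: {"sore_throat": 4, "difficulty_swallowing": 3},
        3: {"sore_throat": 3, "difficulty_swallowing": 2},
        4: {"sore_throat": 2, "difficulty_swallowing": 1},
    }
    print(primary_outcome(example))  # prints 2.5
```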
Sample size calculations—We used the NQUERY multiple group sample size programme for three groups. For a 0.33 standardised effect size between the rapid antigen test group and the other groups (assuming both control groups are 0.33 SD higher than the rapid antigen test group), we estimated that we needed a minimum of 132 per group (for α=0.05, β=0.2) or 495 allowing for 20% loss to follow-up (which was the target for the second phase of the trial). For α=0.01 and β=0.1, we would need 242 per group, or 909 patients in total, allowing for 20% loss to follow-up of diary information.17 19 The SD of 0.33 is equivalent to about half of the patients rating sore throat a mild rather than a moderately bad problem.19 (See appendix 2 for sample size calculations for other outcomes).
The trial management team (blind to study group) finalised the analysis plan before performing the analysis. Analysis of covariance was performed for the severity scores and Cox regression for the duration of symptoms rated moderately bad or worse. Proportional hazards assumptions for Cox regression were checked graphically and deemed appropriate. Logistic regression was used for dichotomised outcomes (such as beliefs and return to the surgery, adjusted for follow-up time). Odds ratios were converted to risk ratios.22 The models controlled for baseline severity (a strong predictor of outcome) and potential confounders (in this case fever during the past 24 hours). Intention to treat analysis was based on complete datasets, given the problem of imputing modest differences for rapidly changing symptomatic outcomes. Although a per protocol analysis was initially considered, given the pragmatic nature of the study it was difficult to operationalise what per protocol might mean, so no per protocol analysis was performed. Secondary analyses were also performed at the suggestion of the referees, with adjustment for practice as a covariate (that is, adjusting for confounding by practice) and also with practice as a cluster variable (that is, adjusting for clustering by practice). Selection bias was assessed by comparing clinical features with the previous diagnostic study and with a parallel observational cohort with the same clinical proforma (the MRC DESCARTE study, which recruited more than 10 000 patients). No interim analysis was performed, and no subgroups were specified in advance. The study team agreed in advance that if there were significant differences between the two scores (based on interaction terms in the models), we would present the score 2 results separately as the main results, with score 1 results documented in an appendix.
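The conversion of odds ratios to risk ratios mentioned above follows the method in reference 22, which is not reproduced in this text. As an illustration only, the sketch below uses one widely known conversion, RR = OR / (1 − P0 + P0 × OR), where P0 is the risk of the outcome in the reference group. Whether this is the exact method the authors applied is an assumption, and the function name and example values are hypothetical.

```python
def odds_ratio_to_risk_ratio(odds_ratio, baseline_risk):
    """Convert an odds ratio to an approximate risk ratio.

    Uses RR = OR / (1 - P0 + P0 * OR), with P0 the outcome risk in the
    reference (comparator) group. Assumed here for illustration; the
    paper's own method is the one given in its reference 22.
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if not 0 <= baseline_risk < 1:
        raise ValueError("baseline_risk must be in [0, 1)")
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


if __name__ == "__main__":
    # Hypothetical odds ratio of 0.5 with a comparator risk of 0.46.
    print(round(odds_ratio_to_risk_ratio(0.5, 0.46), 2))
```

When the outcome is common in the comparator group, the odds ratio lies further from 1 than the risk ratio, which is why such a conversion is useful for reporting; when the outcome is rare the two are nearly equal.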
Patients presenting in primary care were recruited from 23 October 2008 until 18 April 2011 from 48 practices overall, of which 46 practices recruited for the first part of the trial and 21 practices recruited for the second part of the trial (see figure for score 2 and appendix 1 for score 1). These practices recruited 1760 patients: 1129 in the first part of the trial and 631 in the second part (which reached the minimum sample size for the primary outcome but not the intended higher sample size). As there was clear evidence of differential effectiveness, with score 2 performing better than score 1 (see appendix 1 for score 1 results), we have presented the results from the second part of the trial only.
Most baseline characteristics of the groups were similar (table 1) except that fever reported in the past 24 hours was more common in the clinical score group. As fever modestly changed the estimates, all results control for fever in addition to baseline severity of sore throat and difficulty swallowing. Female patients were slightly less common in the clinical score group, but inclusion of sex in the model made no difference to the estimates and so results are presented without adjustment for sex.
Primary outcome: symptom severity (mean score of soreness and difficulty swallowing in days 2-4)
Compared with the control group, there were greater improvements in symptom severity for both the clinical score group (−0.33, 95% confidence interval −0.64 to −0.02) and the rapid antigen test group (−0.30, −0.61 to 0.004); that is, about one third of a point, equivalent to one person in three rating sore throat and difficulty swallowing a slight rather than a moderately bad problem (table 2).
Duration of moderately bad symptoms
In the delayed prescribing (control) group, symptoms rated moderately bad or worse lasted a median of 5.0 days. Compared with the delayed antibiotics group, symptom resolution was significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63), equivalent to saving a day of moderately bad symptoms. Resolution was faster in the antigen test group but not significantly (table 2).
Use of antibiotics
Of the patients in the delayed prescribing group, 46% (75/164) reported using antibiotics. The other two groups had a lower use of antibiotics: compared with the delayed prescribing group there was an estimated 29% relative reduction in the clinical score group (risk ratio 0.71, 0.50 to 0.95) and a 27% relative reduction in the antigen test group (0.73, 0.52 to 0.98; table 2).
Belief in need to see doctor in future
There was a trivial difference in belief in the need to see a doctor, treated either as a continuous variable or dichotomised (table 2).
Return to the surgery
There were no significant differences in return to the surgery during the following month or the subsequent follow-up.
There were no suppurative complications (otitis media, sinusitis, quinsy, or cellulitis) in either phase of the trial. Fewer than 1% of patients returned with either skin rash or diarrhoea within a month of the index consultation in any group for score 2 (delayed prescribing 0/207, clinical score 2/210, antigen test 1/211), with similar findings for score 1 (delayed prescribing 5/374; clinical score 0/380; antigen test 1/359).
Compliance with prescribing strategy
Table 1 also shows the prescribing strategy used at the baseline consultation and shows that groups were well differentiated. As this was a pragmatic trial, clinicians were asked to use the intended strategy when this could be agreed with the patient but were given flexibility to negotiate other strategies, as would happen in practice. The intended strategy occurred in 83% of consultations (520/629): 79% (162/205) in the delayed group, 85% (179/211) in the clinical score group, and 84% (179/213) in the antigen test group. Compliance with the intended strategies was also good for score 1 (see appendix 1, table B).
Selection and attrition bias, practice effects
There was no evidence of clinical selection bias when we compared the patients in the two parts of the trial (appendix 3). Although the trial patients presented with slightly fewer streptococcal features compared with observational cohorts, when we selected practices recruiting patients with higher streptococcal scores the estimates of effect were larger, which suggests the trial results are possibly conservative.
Adjustment for practice as a covariate (that is, for potential confounding) provided similar estimates and inferences, whereas adjustment for clustering by practice resulted in slightly different inferences (table 3).
Our results suggest that across a range of practitioners and practices, use of either a simple clinical score or a clinical score with a rapid antigen test is likely to moderately improve symptom control and reduce antibiotic use. Use of the clinical score combined with targeted use of a rapid antigen test provided similar benefits but with no clear advantages compared with use of a clinical score alone.
Main results in context of previous literature
Although the effect on symptom severity in the antigen test group did not quite reach either the minimum clinical difference specified in advance or statistical significance, the effect in both antigen test and clinical score groups was similar: a difference of one person in three rating sore throat a slight problem rather than a moderately bad problem. Both interventions also reduced antibiotic use. Compared with use of the clinical score alone, however, there was no evidence either for symptom management or antibiotic use to justify the increased time (five minutes) and costs of using rapid antigen tests. The limited additional value of a rapid antigen test might be because the diagnostic advantage of such tests in identifying group A streptococci is in part matched by the disadvantage of not identifying group C and G streptococci, which provide similar symptom burdens to group A organisms. The previous small trial of rapid antigen tests10 showed that using the Centor score23 24 on its own did not modify antibiotic use, but symptomatic outcomes were not reported. The difference between these trials could be that in our trial we used a more reliable score: we have shown that individual items from the Centor score and score 1 did not perform optimally in our two previous diagnostic cohorts in terms of identifying patients with low likelihood of streptococcal infection. It is possible that a more liberal use of rapid antigen tests (for example, for patients with a FeverPAIN score of 2 or more rather than ≥3) would result in less use of antibiotics, but our initial health economic modelling suggested that using rapid antigen tests for those with lower risks of streptococcal infection would be less efficient, and interviews with practitioners suggested that more widespread use of near patient tests would be unacceptable. It is unclear why symptom management should be significantly better with score 2 than score 1.
Possibly the particular combinations of more florid symptoms in score 2 (such as fever and pus, which are not in score 1) are more important in determining symptom burden, and/or better in determining symptom response to antibiotics because of microbiological or patient factors (such as the differential nature of organisms on the surface and in the crypts of the tonsils or the relation between symptoms, the immune response, and prognosis25 26). Possibly score 2 has greater clinical face validity for clinicians or patients, which could perhaps facilitate stronger advocacy and potentially better adherence to the proposed prescribing strategy, although our relatively crude data concerning what health professionals did (but acknowledging that we do not know how they did it) suggest health professionals complied reasonably well with the proposed prescribing strategies.
The rate of antibiotic use reported in the current trial with delayed prescribing (>40%) was significantly higher than our previous research,5 17 but the number of features associated with streptococcal infection was also higher (for example, 15% had purulent tonsils in the previous trial compared with 26% in this trial), so it could be that more patients with milder sore throats are now self caring rather than consulting their general practitioner compared with 15 years ago. The previous trial also recruited predominantly in deprived inner city settings, whereas the current trial had a wider range of practices.
We could not show any difference in belief in the need to see a doctor or reconsultations either within a month or with longer follow-up—that is, no apparent “medicalising” effect of the rapid antigen test strategy. This compares with a dramatic medicalising effect of prescribing antibiotics.12 17 The lack of an effect with rapid antigen testing in a trial setting where such tests are not used routinely, however, might mean that it is difficult to show medicalisation in the short term. Longer term implementation studies or possibly international comparison studies might be needed.
Strengths and weaknesses
To our knowledge this is the first randomised trial to assess the impact of rapid antigen detection tests and clinical scoring methods on both symptom control and antibiotic use for acute sore throat, one of the commonest respiratory infections managed in clinical practice. The FeverPAIN phase of the trial provided limited power to assess dichotomous outcomes but had adequate power for symptomatic outcomes. Groups were slightly unbalanced for fever, but we documented estimates adjusted for fever, and, in other respects, trial groups were similar. Symptomatic outcomes and antibiotic use changed significantly in the expected direction for both intervention groups, suggesting chance is an unlikely explanation. The trial was designed and analysed as an individually randomised trial, but practice variation probably makes a modest difference: adjustment for practice as a covariate did not alter the inferences, and adjustment for clustering by practice led to slightly reduced significance for symptom severity in the clinical score group but slightly increased significance for symptom resolution. Information about non-recruitment was poor, as would be expected for a trial recruiting acutely unwell patients during clinics at the busiest times of year. Streptococcal scores were slightly lower in the trial compared with previous observational studies, but differences were modest (15% lower), and exclusion of practices with low scores increased the estimates, which suggests the trial results are conservative. This was a pragmatic trial so clinicians could negotiate management, as happens in everyday practice. Nevertheless compliance with the intended intervention was good, and the results for score 2 cannot be explained by greater compliance as compliance for score 1 was as good. Clinicians used management prompts for the clinical score based on history and examination, so it is unclear whether clinicians not using such prompts would achieve similar results.
Implications for practice and future research
Clinicians can consider using a clinical score to target antibiotic use for acute sore throat, which is likely to reduce antibiotic use and improve symptom control. There is no clear advantage in additional use of a rapid antigen detection test. As two examination components are required for the clinical score the validity for telephone triage is unclear. Although previous duration of illness (rapid attendance in three days or less) could reflect health system factors, the same variable was equally important in a different health system with different expectations.27 As rapid attendance probably reflects the speed of progression of symptoms, it would also be helpful to compare other methods of operationalising this variable in other settings.
What is already known on this topic
Antibiotics are still prescribed for most patients attending primary care with acute sore throat
Rapid antigen detection tests and clinical scores are commonly used to target antibiotic use, but there is little robust trial evidence to support their use
What this study adds
Compared with empirical delayed antibiotic prescribing for acute sore throat, use of a clinical score improves reported symptoms and reduces antibiotic use
Use of the clinical score combined with targeted use of a rapid antigen test provides similar benefits but with no clear advantages compared with use of a clinical score alone
University of Southampton: Paul Little, Ian Williamson, Mike Moore, Mark Mullee, Man Ying Edith Cheng, James Raftery, David Turner, Jo Kelly, Jane Barnett, Karen Middleton, Lisa McDermott, Geraldine Leydon, Beth Stuart; University of Oxford: Richard Hobbs, Richard McManus, David Mant, Paul Glasziou, Sue Smith, Diane Coulson; University of Birmingham: Razia Meer-Baloch; Health Protection Agency: Cliodna McNulty, Peter Hawtin.
We are grateful to all the patients and healthcare professionals who have contributed their time and effort and helpful insights to make PRISM possible.
Contributors: Razia Meer-Baloch (senior trial manager) provided day to day coordination of the Birmingham study centre and commented on drafts of the paper. YEC developed the protocol and contributed to the quantitative analysis and drafting of the paper. PG developed the protocol for funding, contributed to management of the study, and commented on the paper. FDRH developed the protocol for funding, contributed to management of all studies, supervised the Birmingham study centre, and contributed to the drafting of the paper. JK and JB developed the protocol, provided day to day overall management of the study, coordinated recruitment in the lead study centre and coordination of other centres, and commented on drafts of the paper. GL developed the protocol for funding, contributed to management, and commented on drafts of the paper. PL had the original idea for the protocol, led protocol development and the funding application, supervised the running of the lead study centre and coordination of centres, contributed to the analysis, led the drafting of the paper, and is guarantor. RMcM developed the protocol for funding, contributed to management of all studies, supervised the Birmingham Network, and contributed to the drafting of the paper. DM (now emeritus professor of general practice) developed the protocol for funding, supervised the running of clinical studies in the Oxford centre, and contributed to the analysis and the drafting of the paper. Lisa McDermott (social scientist; research assistant, University of Southampton) developed the protocol and commented on drafts of the paper. CMcN and Gemma Lasseter developed the protocol and contributed to the management and write up of the study. Peter Hawtin (consultant microbiologist for the HPA) developed the protocol for funding and contributed to the design and running of the in vitro and diagnostic phases of the study. Karen Middleton (data manager, University of Southampton) provided administrative support, developed data management protocols, coordinated data entry, and commented on drafts of the paper. M Moore developed the protocol for funding, contributed to the management of the study, and contributed to the analysis and to the drafting of the paper. M Mullee developed the protocol for funding, contributed to study management, supervised data management, led the quantitative analysis, and contributed to the drafting of the paper. IW developed the protocol for funding and contributed to the management of the study and the drafting of the paper. Sue Smith, Mary Selwood, and Diane Coulson (trial managers, University of Oxford) provided day to day coordination of the Oxford study centre and commented on drafts of the paper. James Raftery, Rafael Pinedo-Villaneuva, and David Turner (Health Economics, University of Southampton) developed the protocol and led protocol development for the health economic analysis.
Funding: This project was funded by the National Institute for Health Research Health Technology Assessment (HTA) Programme (project number 05/10/01) and will be published in full in the Health Technology Assessment journal. Further information is available at www.nets.nihr.ac.uk/projects/hta/051001. This report presents independent research commissioned by the National Institute for Health Research (NIHR). The views and opinions expressed by authors in this publication are those of the authors and do not necessarily reflect those of the NHS, the NIHR, NETSCC, the HTA programme or the Department of Health.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: The study was approved by a multicentre research ethics committee (number 06/MRE06/17), and all participants gave written informed consent.
Declaration of transparency: PL affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing: We are happy to share data and collaborate with other investigators as appropriate (for example, in larger merged individual patient data studies).
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.