Research

Accuracy of the “traffic light” clinical decision rule for serious bacterial infections in young children with fever: a retrospective cohort study

BMJ 2013; 346 doi: http://dx.doi.org/10.1136/bmj.f866 (Published 13 February 2013) Cite this as: BMJ 2013;346:f866
  1. Sukanya De, PhD student (1),
  2. Gabrielle J Williams, clinical researcher (1),
  3. Andrew Hayen, associate professor of biostatistics (1, 2),
  4. Petra Macaskill, professor of biostatistics (1),
  5. Mary McCaskill, medical director (3),
  6. David Isaacs, senior staff specialist (4),
  7. Jonathan C Craig, senior staff specialist (1, 5)
  1. Screening and Test Evaluation Program, School of Public Health, University of Sydney, Sydney 2006, Australia
  2. School of Public Health and Community Medicine, University of New South Wales, Sydney 2052
  3. Department of Emergency Medicine, The Children’s Hospital at Westmead, Sydney 2145
  4. Department of Infectious Diseases and Microbiology, The Children’s Hospital at Westmead
  5. Department of Nephrology, The Children’s Hospital at Westmead
  Correspondence to: S De sukanya.de{at}health.nsw.gov.au
  • Accepted 5 February 2013

Abstract

Objectives To determine the accuracy of a clinical decision rule (the traffic light system developed by the National Institute for Health and Clinical Excellence (NICE)) for detecting three common serious bacterial infections (urinary tract infection, pneumonia, and bacteraemia) in young febrile children.

Design Retrospective analysis of data from a two year prospective cohort study.

Setting A paediatric emergency department.

Participants 15 781 cases of children under 5 years of age presenting with a febrile illness.

Main outcome measures Clinical features were used to categorise each febrile episode as low, intermediate, or high probability of serious bacterial infection (green, amber, and red zones of the traffic light system); these results were checked (using standard radiological and microbiological tests) for each of the infections of interest and for any serious bacterial infection.

Results After combination of the intermediate and high risk categories, the NICE traffic light system had a test sensitivity of 85.8% (95% confidence interval 83.6% to 87.7%) and specificity of 28.5% (27.8% to 29.3%) for the detection of any serious bacterial infection. Of the 1140 cases of serious bacterial infection, 157 (13.8%) were test negative (in the green zone), and, of these, 108 (68.8%) were urinary tract infections. Adding urine analysis (leucocyte esterase or nitrite positive), reported in 3653 (23.1%) episodes, to the traffic light system improved the test performance: sensitivity 92.1% (89.3% to 94.1%), specificity 22.3% (20.9% to 23.8%), and relative positive likelihood ratio 1.10 (1.06 to 1.14).

Conclusion The NICE traffic light system failed to identify a substantial proportion of serious bacterial infections, particularly urinary tract infections. The addition of urine analysis significantly improved test sensitivity, making the traffic light system a more useful triage tool for the detection of serious bacterial infections in young febrile children.

Introduction

Febrile illnesses are one of the most common reasons for young children to present to primary care practitioners and may account for up to a third of presentations to emergency departments.1 2 3 Depending on the setting, about 5–25% of fever episodes in young children are due to serious bacterial infections.4 5 6 7 8 9 10 If not detected and managed in a timely manner, such infections may lead to complications, long term disability, and even death.11 12 Young children and infants with serious bacterial infection may manifest few, if any, localising signs of systemic infection.13 14 15 A key challenge for physicians in the clinical evaluation of febrile young children is being able to correctly triage febrile illnesses, identifying those likely to be due to serious bacterial infections in a timely manner while at the same time avoiding over-investigation and overmedication of children, most of whom will have self limiting viral illnesses.

Several clinical criteria and decision tools have been developed to assist clinicians in identifying which febrile children have a serious illness. Unfortunately these have either not been externally validated in independent datasets, do not perform consistently, have insufficient accuracy, or apply only to a limited age range.8 16 17 18 19 20 Recently the UK National Institute for Health and Clinical Excellence (NICE) published a guideline which provides a traffic light system for the initial assessment and management of young children with fever.21 The NICE traffic light system was designed for young children under 5 years of age and was intended for a range of settings (general practice, paediatric specialists, or remote assistance by health professionals), and is a colour coded checklist of symptoms and signs (see online supplementary table 1 on bmj.com). Children whose clinical features fall within the green zone are considered to be at low risk of serious illness, while those in the amber and red zones are at intermediate and high risk respectively. The NICE guidance recommends that further investigations be directed according to the level of risk (see online supplementary table 2 on bmj.com). Although it has been widely promulgated, the accuracy of this system for the detection of serious bacterial infections has not been validated to date.

The aims of our study were to determine the test performance of the NICE traffic light system for the detection of three of the most common serious bacterial infections in young febrile children—urinary tract infection, pneumonia, and bacteraemia—and to assess whether the addition of urine analysis, a near patient test with good performance characteristics,22 improves the performance of the NICE traffic light system. For this study we used data collected prospectively for the Febrile Evaluation of Children in the Emergency Room (FEVER) study.23 The FEVER study (conducted between July 2004 and June 2006) preceded the NICE fever guideline (published in May 2007).

Methods

Study design and setting

Details of the main FEVER study are reported elsewhere.23 We incorporated the Standard for Reporting of Diagnostic Accuracy (STARD) guidelines for study reporting.24

Recruitment

Consecutive children under 5 years old who presented with a febrile illness to the emergency department of the Children’s Hospital at Westmead between 1 July 2004 and 30 June 2006 were eligible. Febrile illness was defined as any illness that met one or more of the following criteria: a measured axillary temperature ≥38.0°C; parental report of a temperature ≥38.0°C measured at home within the previous 24 hours; parental report that the child “felt hot” in the previous 24 hours; or a presenting problem related to fever (10th revision of the international classification of diseases, Australian modification codes R50, R50.0, R50.1, R50.9, and R56.0), as determined by a triage nurse.

The unit of analysis was an instance of febrile illness. In the case of multiple presentations with the same illness, history and clinical evaluation data from the first visit only were used. Case definition for “same illness” was if the child presented within 24 hours of a previous visit or if the fever had persisted between visits to the emergency department without a fever-free period of at least 24 hours.
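The "same illness" rule above can be expressed as a short grouping routine. The following Python sketch is illustrative only: the `Visit` record and its `fever_free_hours` field are hypothetical names, and treating a gap of exactly 24 hours as "within 24 hours" is an assumption. It does not reproduce the FEVER study's actual data processing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Visit:
    arrival: datetime
    # Longest fever-free period (hours) since the previous visit.
    # Hypothetical field; ignored for a child's first visit.
    fever_free_hours: float


def group_into_episodes(visits: List[Visit]) -> List[List[Visit]]:
    """Group one child's visits into febrile illness episodes."""
    episodes: List[List[Visit]] = []
    for visit in sorted(visits, key=lambda v: v.arrival):
        if episodes:
            previous = episodes[-1][-1]
            # Assumption: a gap of exactly 24 hours counts as "within 24 hours".
            within_24h = visit.arrival - previous.arrival <= timedelta(hours=24)
            # Fever persisted if there was no fever-free period of at least 24 hours.
            fever_persisted = visit.fever_free_hours < 24
            if within_24h or fever_persisted:
                episodes[-1].append(visit)
                continue
        episodes.append([visit])
    return episodes
```

Under this sketch, the first element of each episode is the visit whose history and clinical evaluation data would be used.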

Study exclusion criteria

Children transferred from another hospital, those with malignancy, and transplant recipients were excluded.

Data collection

Each child was triaged at presentation to the emergency department using the Australasian triage scale. This scale consists of five categories based on the level of clinical urgency, with cases assigned to category one being the most urgent and cases assigned to category five being the least urgent.25 Clinical information for each illness episode was recorded by the examining physician (emergency department physician, paediatric trainee, or emergency medicine trainee) into a mandatory template within the hospital’s electronic record keeping system. This ensured a standardised assessment with entry of 40 clinical features (symptoms and signs) and relevant background medical information of the children for each episode of febrile illness. The template was completed after the initial physician assessment but before test results were obtained. Any tests ordered (including urine analysis) were at the discretion of the treating physician. Test results, emergency department diagnoses, and admission details were electronically linked to the research database.

Outcome definitions

The primary outcomes of interest were the three most common serious bacterial infections in young children with fever—urinary tract infection, pneumonia, and bacteraemia. There were too few cases of other forms of serious bacterial infection, such as meningitis, osteomyelitis, and septic arthritis, to provide robust estimates of the performance of the traffic light system for these infections separately. We included these illnesses in a fourth category, “any serious bacterial infection,” in combination with the three common infections. In cases where multiple infections clustered together—for example, when a child developed concomitant urinary tract infection and bacteraemia—they were included in all relevant outcomes. Reference standard test criteria for the diagnosis of serious bacterial infection included culture positivity (for urinary tract infection and bacteraemia) or radiological criteria (for pneumonia) and are detailed elsewhere.23

Follow-up

All eligible children were followed up until they fulfilled the case definition for serious bacterial infection or until the fever had resolved for over 24 hours. Follow-up was undertaken at 10–14 days after the emergency department visit. Hospital records were reviewed, and parental reports (via telephone call follow-up) of resolution of fever, antibiotic use, and attendance at other healthcare facilities were obtained. Copies of test reports and chest x ray films from other healthcare facilities were obtained with parental consent.

Applying the NICE traffic light system

We matched the items comprising the NICE traffic light system to equivalent items within the FEVER study febrile template (see supplementary table 3 on bmj.com), which allowed us to apply the system retrospectively to each febrile episode in the FEVER study cohort. Accordingly, each episode was categorised as having a low, intermediate, or high probability of serious bacterial infection based on whether their clinical features fitted in the green, amber, or red zones of the NICE traffic light system. This was then compared with the final diagnosis as determined by reference standard test results and follow-up.
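The zone assignment described above amounts to a precedence rule: any red feature places the episode in the red zone, otherwise any amber feature places it in the amber zone, otherwise it is green. A minimal Python sketch follows. The feature names are placeholders rather than the actual NICE items (those are listed in supplementary table 1), and the function name is hypothetical.

```python
from typing import Iterable, Set

# Placeholder item names; the real checklist is in supplementary table 1.
RED_ITEMS: Set[str] = {"red_item_1", "red_item_2"}
AMBER_ITEMS: Set[str] = {"amber_item_1", "amber_item_2"}


def traffic_light_zone(features: Iterable[str],
                       red: Set[str] = RED_ITEMS,
                       amber: Set[str] = AMBER_ITEMS) -> str:
    """Return 'red', 'amber', or 'green' for one febrile episode."""
    present = set(features)
    if present & red:       # one or more red zone features
        return "red"
    if present & amber:     # no red features, one or more amber features
        return "amber"
    return "green"          # neither red nor amber features recorded
```

For example, an episode with both `amber_item_1` and `red_item_2` recorded would be assigned to the red zone.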

Urine analysis

Urine analysis was done at presentation to the emergency department by the attending doctor or nurse in some children. The urine collection method was in accordance with local clinical practice and was not specified for the study. The urine analysis result was reported semi-quantitatively based on the test strip reference chart. For the purpose of our study, the presence of any level of nitrite or leucocyte esterase was considered as test positive.

Statistical analysis

For each of the four infection categories (urinary tract infection, bacteraemia, pneumonia, and any serious bacterial infection), sensitivity and specificity were calculated for two thresholds for test positivity (presence of one or more red zone features versus one or more red or amber zone features). These cut points are displayed graphically with a receiver operating characteristics curve. For the subgroup of children who had a urine analysis reported, we reassessed the test performance of the traffic light system with and without the urine analysis results. We investigated four thresholds for positive urine analysis: leucocyte esterase positive, nitrite positive, either leucocyte esterase or nitrite positive, and both leucocyte esterase and nitrite positive.
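The four urine analysis thresholds listed above can be written as simple Boolean rules. This Python sketch is for illustration only; the enum and function names are hypothetical, and each dipstick component is assumed to have already been reduced to positive or negative (any level counted as positive, as described under "Urine analysis").

```python
from enum import Enum


class UrineRule(Enum):
    LEUCOCYTE_ESTERASE = "leucocyte esterase positive"
    NITRITE = "nitrite positive"
    EITHER = "either leucocyte esterase or nitrite positive"
    BOTH = "both leucocyte esterase and nitrite positive"


def urine_positive(leucocyte_esterase: bool, nitrite: bool, rule: UrineRule) -> bool:
    """Apply one of the four positivity thresholds to a dipstick result."""
    if rule is UrineRule.LEUCOCYTE_ESTERASE:
        return leucocyte_esterase
    if rule is UrineRule.NITRITE:
        return nitrite
    if rule is UrineRule.EITHER:
        return leucocyte_esterase or nitrite
    return leucocyte_esterase and nitrite  # UrineRule.BOTH
```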

We compared the performance of the NICE traffic lights alone with that of the combination of the NICE traffic lights and urine analysis (test positive defined as presence of any one or more of the following: amber traffic light features, red traffic light features, leucocyte esterase positive in urine, nitrite positive in urine). Because of the expected trade-off in test sensitivity and specificity when adding a test, we assessed the incremental gain through the relative positive likelihood ratio, that is, the positive likelihood ratio of the combination of the NICE traffic lights and urine analysis divided by that of the NICE traffic lights alone.26 27 We obtained 95% confidence intervals for the relative positive likelihood ratios; if the relative positive likelihood ratio is greater than 1 and its confidence interval does not include 1, the combined test performs better than the NICE traffic light system on its own.26 An increase in the positive likelihood ratio would result in an increase in the positive predictive value of the test (that is, the probability of infection given a positive test result).
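As an illustration of these measures, the Python sketch below computes sensitivity, specificity, and the positive likelihood ratio from a 2×2 table, and the relative positive likelihood ratio as the ratio of two positive likelihood ratios. The counts are invented for the example and are not study data. The class and function names are hypothetical, and confidence intervals (obtained in the study using the methods in references 26 and 27) are not computed here.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # infection present, test positive
    fp: int  # infection absent, test positive
    fn: int  # infection present, test negative
    tn: int  # infection absent, test negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def positive_lr(self) -> float:
        # Positive likelihood ratio = sensitivity / (1 - specificity)
        return self.sensitivity / (1 - self.specificity)


def relative_positive_lr(combined: TwoByTwo, single: TwoByTwo) -> float:
    """Positive likelihood ratio of the combined test relative to the single test."""
    return combined.positive_lr / single.positive_lr


# Invented counts, for illustration only (not study data).
traffic_light_only = TwoByTwo(tp=80, fp=300, fn=20, tn=100)
with_urine_analysis = TwoByTwo(tp=90, fp=320, fn=10, tn=80)
ratio = relative_positive_lr(with_urine_analysis, traffic_light_only)
```

With these invented counts, sensitivity rises from 0.80 to 0.90 while specificity falls from 0.25 to 0.20, and the ratio is a little above 1, the pattern that would favour the combined test if the confidence interval excluded 1.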

Results

During the study period there were 19 889 visits by febrile children under 5 years of age. Patient flow in the FEVER study is detailed in figure 1. Overall, 15 781 eligible febrile illnesses were included in our analysis, of which 1120 (7.1%) were due to serious bacterial infection, with 1166 infections identified: urinary tract infection in 543 (3.4% of the 15 781), pneumonia in 533 (3.4%), bacteraemia in 64 (0.4%), osteomyelitis in 12 (0.08%), meningitis in 8 (0.05%), and septic arthritis in 6 (0.04%). Multiple infections per illness occurred uncommonly, with 44 of the 1120 (3.9%) due to two or more serious bacterial infections (two infections in 42 illnesses, and three infections in two illnesses).

Characteristics of included children and illnesses (table 1)

Over a quarter of the illness episodes were in children under 1 year old, and almost half were in children aged between 1 and 3 years. Hospitalisation was required in 1912 (12.1%) fever episodes. Four hundred (82.8%) of the 483 febrile illnesses that received a triage category of one or two were in the NICE red zone. Among those receiving a triage category of three, 2371/4894 (48.4%) were in the red zone, while 1042/6936 (15.0%) with a triage category of four and 253/3468 (7.3%) with a triage category of five were in the red zone. Viral infection, viral upper respiratory tract infection, and gastroenteritis were the most common clinical diagnoses.

Table 1

 Characteristics of the study population

A dipstick urine analysis was performed in 3653 (23.1%) of the 15 781 febrile illnesses. Urine collection method was recorded for only 2262 samples sent for urine culture. Of these 2262 samples, the in-out catheter method was used in 1022 (45.2%), a clean catch specimen in 691 (30.5%), mid-stream sample in 519 (22.9%), supra-pubic aspirate in 10 (0.4%), and a bag specimen in 20 (0.9%). The proportion of children who had urine analysis is detailed by patient characteristics in supplementary table 4 on bmj.com. Children who were more unwell (triage categories one and two), who required hospitalisation, or had higher temperatures at presentation were more likely to have a urine analysis. A greater proportion of children with a provisional diagnosis of fever with no focus had a urine analysis done than children with a provisional diagnosis of asthma, croup, bronchiolitis, pneumonia, or a focal bacterial infection.

Concordance between NICE traffic light system and the FEVER data fields

Nearly three quarters of the 43 items listed in the traffic light system had comparable fields in the FEVER study. Thirteen (30%) of the 43 symptoms and signs had an exact match in FEVER, while 19 (44%) symptoms and signs were captured by similarly phrased terminology in the FEVER dataset (supplementary table 3 on bmj.com). Eleven (26%) of the 43 symptoms and signs listed in the traffic light system did not have an equivalent field in the FEVER dataset. These mostly included items such as “a new lump,” “swelling of a limb or joint,” “non-weight bearing limb,” and “bile stained vomit.” Given that these are relatively specific clinical indicators for hernia, focal bone or joint pathology, and bowel obstruction respectively, it is unlikely that the absence of data relating to these would alter the performance of the traffic light system for the detection of urinary tract infections, bacteraemia, or pneumonia, which are the outcomes of interest for this study.

Test performance of NICE traffic light system for detection of bacteraemia, urinary tract infection, and pneumonia

Combining the intermediate and high risk traffic light categories (presence of features that were in the amber or red zones) gave a test sensitivity of 85.8% (95% confidence interval 83.6% to 87.7%) and specificity of 28.5% (27.8% to 29.3%) for the detection of any serious bacterial infection. The high risk category (presence of one or more feature in the red zone) had a sensitivity of 47.9% (44.9% to 50.8%) and specificity of 75.9% (75.2% to 76.6%). Among the serious bacterial infections, 40/533 (7.5%) cases of pneumonia, 9/64 (14.1%) cases of bacteraemia, and 108/543 (19.9%) cases of urinary tract infections were in the green zone and would be missed using the NICE traffic light system.
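The green zone counts reported above can be tallied as a quick arithmetic check. This Python sketch uses only figures quoted in this article; the variable names are illustrative.

```python
# (total cases, cases in the green zone) for each infection, as reported above.
cases = {
    "pneumonia": (533, 40),
    "bacteraemia": (64, 9),
    "urinary tract infection": (543, 108),
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


total_cases = sum(total for total, _ in cases.values())   # 1140
missed = sum(green for _, green in cases.values())        # 157

missed_share = pct(missed, total_cases)                   # 13.8
uti_share_of_missed = pct(cases["urinary tract infection"][1], missed)  # 68.8
per_infection = {name: pct(green, total) for name, (total, green) in cases.items()}
```

The totals agree with the abstract: 157 of 1140 cases (13.8%) fell in the green zone, and 108 of those 157 (68.8%) were urinary tract infections.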

Urine analysis had been reported in 3653 (23.1%) of the 15 781 febrile episodes. In this subset, 492 febrile episodes had a final diagnosis of one or more serious bacterial infection (362 urinary tract infections, 118 cases of pneumonia, and 27 cases of bacteraemia). Threshold specific sensitivities and specificities for the combination of the traffic light system with urine analysis are detailed in table 2. For the detection of urinary tract infections, the combined test had improved performance over the NICE traffic light system on its own (fig 2), with a relative positive likelihood ratio of 1.17 (95% confidence interval 1.12 to 1.23). For the detection of any serious bacterial infection, the relative positive likelihood ratio was 1.10 (1.06 to 1.14), also indicating improved performance of the combined test over the NICE traffic light system.

Table 2

 Performance of the NICE traffic light system with and without urine analysis for detection of serious bacterial infections. Values are percentages (95% confidence intervals)

Fig 2 Receiver operating characteristics curve for performance of the NICE traffic light system, with or without the addition of urine analysis, for detection of serious bacterial infections

Discussion

We found that the NICE traffic light system has moderate sensitivity but low specificity for detection of the three most common serious bacterial infections in febrile young children, namely bacteraemia, urinary tract infections, and pneumonia. Importantly the traffic light system missed a sizeable proportion of urinary tract infections. This is a substantial deficiency in a screening tool for serious bacterial infections in febrile young children, given that, after the introduction of pneumococcal conjugate vaccine, the prevalence of occult bacteraemia in febrile children presenting to emergency departments has decreased substantially to somewhere between 0.4% and 0.7%,28 whereas the prevalence of urinary tract infection in children with fever without a clinically obvious source remains greater than 7%.29 Adding urine analysis, a simple and inexpensive near-patient test, to the traffic light system led to an appreciable increase in the proportion of urinary tract infections detected, and a concomitant increase in the overall proportion of serious bacterial infections detected.

The NICE fever guidelines advise routine testing of urine in all children with fever without apparent source (including those who are in the green zone of the traffic light system), thus helping to avoid missed cases of urinary tract infections, but they do not include this test in the traffic light system itself.21 30 Based on our findings, we strongly support this recommendation but suggest that urine analysis be added to the traffic light criteria. We recommend that urine analysis should be done routinely in children with fever and suspected bacterial infection and only children with a negative result should be classified as belonging to the green (low infection risk) zone. However, although such an addition would improve the discriminatory ability of the NICE traffic light system, clinicians who use this approach should be aware of the appreciable number of misclassified children that remain. In our subset that had urine analysis (n=3653), the traffic light system with urine analysis included had a 78.0% (2466/3161) false positive rate (over three quarters of illness episodes that were not due to a serious bacterial infection were test positive) and 8.1% false negative rate (40/492 of illness episodes due to serious bacterial infections were test negative). Given its low test specificity, clinicians must still judge each case based on its individual features to avoid over-investigation and overtreatment.
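The misclassification rates quoted above follow directly from the counts in the urine analysis subset. The Python sketch below reproduces them. The true positive and true negative counts are derived by subtraction and are not stated in the article; the variable names are illustrative.

```python
n_subset = 3653          # episodes with urine analysis reported
with_sbi = 492           # episodes due to serious bacterial infection
without_sbi = 3161       # episodes not due to serious bacterial infection
false_positives = 2466   # test positive, no serious bacterial infection
false_negatives = 40     # test negative, serious bacterial infection

# The two groups should account for the whole subset.
assert with_sbi + without_sbi == n_subset

# Derived by subtraction (not stated in the article).
true_positives = with_sbi - false_negatives      # 452
true_negatives = without_sbi - false_positives   # 695

false_positive_rate = round(100 * false_positives / without_sbi, 1)  # 78.0
false_negative_rate = round(100 * false_negatives / with_sbi, 1)     # 8.1
```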

Comparison with other studies

To our knowledge, only one other study has evaluated the test performance of the NICE traffic light system. This was done in 700 children aged between 3 months and 16 years with suspected acute infection.31 In that study the traffic light system had a reported test sensitivity of 85% and specificity of 29% for detection of illnesses (bacterial or non-bacterial in origin) that were considered to be serious or of intermediate severity (uncomplicated urinary tract infections were classified as being mild in severity). However, the number of children with an infection that was reported as serious or of intermediate severity was relatively low (313 in total, with only 67 cases of pneumonia with radiological confirmation, five cases of bacteraemia, and 30 urinary tract infections with systemic symptoms), and so test performance metrics were relatively imprecise. The study also included children up to 16 years old (median age 3 years), although the NICE traffic light system was designed for children up to 5 years old. Finally, the validity of these findings is uncertain because the reference standard test for illness severity was the unverified final diagnoses made by the treating physicians and not standard, microbiologically based definitions. The authors of that study acknowledge that they did not have data on all the “red” and “amber” features.

Strengths and limitations of study

Ours is the first study that has evaluated the test accuracy of the NICE traffic light system for the detection of serious bacterial infections in febrile children under 5 years old, for whom this test was intended, but it does have limitations. The data were not collected for this purpose, and so there was incomplete overlap between the clinical criteria in the NICE traffic light system and FEVER data fields. However, most of the clinical criteria required for the traffic light system—especially all the common criteria relevant to the evaluation of possible urinary tract infection, bacteraemia, and pneumonia—were captured by the corresponding FEVER data fields. We believe that the lack of fields in the FEVER data corresponding to NICE fields such as “a new lump,” “swelling of a limb or joint,” “not weight bearing,” and “bile stained vomit” is unlikely to have affected the detection of the serious bacterial infections of interest to any significant extent, although these could well be of relevance when assessing for other serious infections or illnesses such as septic arthritis, osteomyelitis, or bowel obstruction. A second limitation was that the FEVER study had small numbers of serious bacterial infections other than urinary tract infection, bacteraemia, and pneumonia. As such, the validity of the NICE traffic light system for detection of other serious bacterial infections (such as meningitis, osteomyelitis, and septic arthritis) could not be assessed with confidence. However, the small number of cases of these infections in our study reflects the low prevalence of these infections in a setting of a developed country where immunisation is almost universal.

Our evaluation of the test performance of the traffic light system with the addition of urine analysis was potentially limited by the fact that only about a quarter of the FEVER cohort had a urine analysis. The decision to perform a urine analysis was at the treating clinician’s discretion. As expected, the children who had urine analysis were at higher risk of urinary tract infection and differed from those who did not have a urine analysis performed. This reflects standard clinical practice, where it is not routine to subject a young child with an obvious focus of infection as the fever source and who is at low risk of urinary tract infection to a potentially invasive test. Mandating a urine culture in all eligible children was not ethically justifiable. However, this may lead to verification bias, where only those who are positive for the index test (in this case the traffic light system, with or without the urine analysis) have the reference standard test performed (urine culture), leading to overestimates of test sensitivity. Although possible in our study, this bias may have been ameliorated because the traffic light system was not designed when the FEVER study was conducted, and so could not have been used to determine which children should have had a urine culture performed. Further, the design of our study required that all children were followed until the febrile illness resolved (two step verification process), which was achieved in 93% of all illnesses. Finally, even with a potentially optimistic estimate of sensitivity for the traffic light system, our study shows that improvements in performance are required, such as the inclusion of the urine analysis result.

It could be argued that the NICE guidance was designed to detect all serious illnesses, not just infections. The definition of serious illness in the NICE guidance was “an illness with fever that could cause death or disability if there was a delay in diagnosis or treatment.” While this definition would be inclusive of serious bacterial infections, it also implies detection of serious illnesses that are not of bacterial origin such as Kawasaki disease, viral gastroenteritis with moderate or severe dehydration, metabolic conditions presenting with acid-base or electrolyte imbalance, or severe viral lower respiratory tract infections requiring supportive management. We assessed the accuracy of the traffic light system only for detecting serious bacterial infections and thus did not explore the full potential of this clinical tool. We also did not attempt to assess its clinical effectiveness or ease of application, nor clinicians’ willingness to use it.

Conclusions

The NICE traffic light system has moderate test sensitivity but low specificity for the detection of the three most common serious bacterial infections in young febrile children. The system missed a substantial proportion of urinary tract infections, but the addition of urine analysis improved performance significantly, as reflected in the relative positive likelihood ratio. With the addition of this relatively simple, non-invasive, and inexpensive near patient test, the traffic light system may be a useful triage tool for healthcare professionals for the initial evaluation of likelihood of serious bacterial infections in young febrile children. Clinical effectiveness and acceptability of the system for the detection of serious illnesses need to be assessed through randomised controlled trials.

What is already known on this topic

  • Diagnosis of serious bacterial infections in young children with febrile illnesses is challenging

  • The NICE guidelines on febrile illnesses in children provide a simple clinical tool (a traffic light system) to guide healthcare professionals’ initial assessment of febrile children, but the accuracy of the tool for detecting serious bacterial infections in young children has not been tested

What this study adds

  • The NICE traffic light system has a moderate sensitivity and low specificity for detection of serious bacterial infections, and urinary tract infections in particular tend to go undetected with the system

  • Adding urine analysis to the traffic light system enhances its sensitivity substantially and increases its positive predictive value, making it a potentially useful tool for the initial evaluation of febrile infants and young children

  • The low test specificity of the traffic light system makes individual case based clinical judgement critical for clinicians to avoid overinvestigation and unnecessary hospitalisation or treatment

Footnotes

  • Contributors SD assisted with the study design, compiled the results, and wrote the manuscript. GJW assisted with the study design, obtained ethical permission, took part in data collection, in database design, monitoring, and reporting, and in medical staff training, and reviewed the manuscript. AH undertook statistical analysis, presented the results, and reviewed the manuscript. PM contributed to the statistical analysis design, interpretation of analysis, and manuscript review. DI formulated the disease definitions, was a member of the final diagnoses committee, and reviewed the manuscript. MMC facilitated the study in the emergency department, formulated disease definitions, reviewed the febrile template, undertook training and support of the emergency staff, and reviewed the manuscript. JCC undertook the design of the study, presentation of results, and manuscript review. All authors had full access to all data and analysis. JCC, SD, GJW, and PM act as the guarantors.

  • Funding This is a sub-study of FEVER, which was funded by the National Health and Medical Research Council of Australia (grant Nos 211205 and 402764). The funding source had no influence on study design; collection, analysis, or interpretation of data; writing the report; or the decision to submit the paper for publication.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

  • Ethical approval: Approval was obtained through the University of Sydney Human Research Ethics Committee (ID 2405) and the Royal Alexandra Hospital for Children Ethics Committee (ID 99023, 2004/113).

  • Data sharing: No additional data available.

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.

References