Screening programme evaluation applied to airport security
BMJ 2007; 335 doi: https://doi.org/10.1136/bmj.39419.662998.BE (Published 20 December 2007) Cite this as: BMJ 2007;335:1290
- Eleni Linos, doctoral student1
- Elizabeth Linos, research assistant2,3
- Graham Colditz, associate director4
- 1Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA
- 2Department of Economics, Harvard University, Littauer Center, Cambridge, MA, USA
- 3J-Poverty Action Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA
- 4Prevention and Control, Siteman Cancer Center, Washington University School of Medicine, Campus Box 8109, St Louis, MO 63110, USA
- Correspondence to: E Linos
Safety is paramount to travellers. Governments agree, and the airport operator BAA has spent £20m (€28m; $41m) on airport security in the past year alone.1 Add the $15bn that the government of the United States spent between 2001 and 2005 on aviation screening,2 or the estimated $5.6bn that worldwide airport protection costs each year,3 and we reach one conclusion—airport screening is extremely costly. Yet on 30 July 2007, the head of the International Air Transport Association, Giovanni Bisignani, launched a scathing attack on airport security in the United Kingdom: he claimed that the UK’s “unique screening policies inconvenience passengers with no improvement in security.”4
Complaints about the cost of airport security have flooded the news in recent months, but the problem is not new. The UK has seen a 150% increase in airport security costs since the terrorist attacks on 11 September 2001 and even steeper rises since the London bombings on 7 July 2005.5 With such high value attached to airport security, the details of efficacy, precision, and cost effectiveness of screening methods are easy to ignore. Protection at any cost is a reassuring maxim for us jetsetters. But preventing any death, whether from haemorrhagic stroke, malignant melanoma, or diabetic ketoacidosis, is surely an equally noble cause. In most such cases, screening programmes worldwide are closely evaluated and heavily regulated before implementation. Is airport security screening an exception?
The UK National Screening Committee’s remit is to assess screening technologies on the basis of sound scientific evidence and advise on whether they should be implemented, continued, or withdrawn.6 The table outlines the criteria used to evaluate screening programmes. These criteria include an important and treatable condition, an accurate and acceptable test, and sufficient evidence of benefit of the proposed screening project from randomised trials. To be considered for a screening programme, the condition must be common and of considerable burden to society. Furthermore, a “preclinical” phase must exist, during which the condition can be detected and treated. Cervical cancer is a classic example: although morbidity and mortality are high worldwide, if detected early, premalignant lesions can be cured. The criteria also mandate that a suitable screening test should be simple, safe, and validated. For example, cholesterol monitoring, used to screen for hyperlipidaemia and prevent its complications, fits these requirements. It is acceptable to the population, it has well defined cut-off values, and the benefit of treatment is established, making it an excellent screening test. Yet things are rarely this straightforward, and for most screening programmes we rely on scientific evidence to show efficacy and effectiveness, cost-benefit balance, and acceptability.
Discussion on whether screening programmes should be implemented inevitably centres on at least one of these key criteria. For example, recent debates on cervical screening have focused on the test, namely the sensitivity and predictive value of testing for human papillomavirus7 or liquid based cytology8 compared with conventional cervical smears. For lung cancer screening the sticking point has been the quality of the evidence showing that computed tomography screening improves overall mortality.9 10 11 A similar debate for prostate specific antigen testing remains unresolved.
We examine whether airport security screening is an acceptable screening programme—is the evidence sufficient to meet the National Screening Committee’s criteria? We then identify points of future research that could encourage a more rigorous evaluation of airline security measures.
The “disease” and its treatment
Presumably, one of the negative outcomes or “diseases” we are trying to prevent is injury to passengers or crew as a result of in-flight terrorist attacks. The time between arriving at the airport and boarding the plane is the latent period during which dangerous objects can be detected and attacks prevented by confiscation, explosive disarmament, or arrest. These are analogous to the condition, preclinical phase, and treatment; so far, so good. But although any potential threat to the safety of passengers is worth fighting against, such events are extremely rare.
Since 1969, only 2000 people have died as a result of explosives on planes, yet the US department of homeland security spends more than $500m annually on research and development of programmes to detect explosives at airports.12 Even the devastating 11 September 2001 attacks caused around 3000 deaths, which is similar to the number of deaths attributed to high blood glucose each day13 or the number of children dying of the human immunodeficiency virus every three days worldwide.14 The publicity awarded to such terrorist attacks is so high that the perceived threat is far higher than the numbers suggest. Furthermore, the cost of airport security ($9 per passenger) is roughly 900 times higher than for railway security ($0.01 per passenger), even though the number of attacks on trains is similar to that on planes.15 This is analogous to committing mammography resources to screening only the left breast, and ignoring the right side, even though cancer can affect both breasts.
The tests and evidence of benefit
We systematically reviewed the literature on airport security screening tools. A systematic search of PubMed, Embase, ISI Web of Science, Lexis, Nexis, JSTOR, and Academic Search Premier (EBSCOhost) found no comprehensive studies that evaluated the effectiveness of x ray screening of passengers or hand luggage, screening with metal detectors, or screening to detect explosives. When research teams requested such information from the US Transportation Security Administration they were told that evaluating new screening programmes might be useful, but it was overshadowed by “time pressures to implement needed security measures quickly.”16 In addition, we noticed that new airport screening protocols were implemented immediately after news reports of terror threats (fig 1).
The little we do know about airport security screening comes from investigations of the factors that influence the sensitivity of visual screening of x ray images. These studies conclude that sensitivity depends on the screener’s experience, rather than the precision of the machine. Practice improves the screener’s performance, but unfamiliar or rare objects are hard to identify regardless of experience.171819 Mammography radiologists realise this and undergo years of specialised training after medical school.20
Even without clear evidence of the accuracy of testing, the Transportation Security Administration defended its measures by reporting that more than 13 million prohibited items were intercepted in one year.21 Most of these illegal items were lighters. The screening literature shows that length time and lead time bias produce misleading interpretations of screening studies because of earlier detection of more benign cases that would not necessarily become clinically apparent (overdiagnosis). A similar problem arises with the above reasoning—although more than a million knives were seized in 2006, we do not know how many would have led to serious harm.
The absence of scientific evaluations of the screening tools currently in place and the vast amount of money spent by governments worldwide on airport security have led us to muse over current airport security protocols and wonder about their optimal implementation. What is the sensitivity of the screening question, “Did you pack all your bags yourself?” and has anyone ever said no? Can you hide anything in your shoes that you cannot hide in your underwear? What are the ethical implications of preselecting high risk groups? Are new technologies that “see” through clothes acceptable? What hazards should we screen for? Guns and explosives certainly, but what about radioactive materials or infectious pathogens? Concerns about cost effectiveness—including the indirect costs of passengers’ time spent in long queues—will be central to future decisions, but first we need solid evidence of benefit.
If we were to evaluate the effectiveness of airport screening, we would start by assessing the accuracy of current tests for illegal objects in passengers’ luggage. This would yield only preliminary information on screening test performance; we would need to reapply for funding to evaluate the overall benefit of security screening on mortality and calculate the number needed to screen to prevent the death of one traveller.22 After informing the airport managers, gaining approval from research ethics committees and police, and registering our trial with one of the acceptable International Committee of Medical Journal Editors trial registries, we would select passengers at random at the check-in desks and give each traveller a small wrapped package to put in their carry-on bags. (We would do this after they have answered the question about anyone interfering with their luggage.) A total of 600 passengers would be randomised in a 1:1:1 ratio to receive a package containing a 200 ml bottle of a non-explosive liquid, a knife, or a bag of sand of similar weight (control package). Investigators and passengers would be blinded to the contents of the package. Our undercover investigators would measure how long it takes to get through security queues and record how many of the tagged passengers are stopped and how many get through. A passenger who is stopped and asked to open the wrapped box would be classed as a positive test result, and any unopened boxes would be considered a negative test result. We would use the number of true and false positives and true and false negatives to estimate the sensitivity and specificity of the current screening process and pool the waiting times to estimate an average waiting time for each passenger (fig 2).
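The allocation and accuracy calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `allocate` and `sensitivity_specificity` are hypothetical, the liquid and knife packages are both treated as "condition present" and the sand package as "condition absent", and no real screening outcomes are used.

```python
import random

# The three arms described in the text; "sand" is the control package.
ARMS = ("liquid", "knife", "sand")


def allocate(n_passengers, seed=0):
    """Randomly allocate passengers to the three arms in a 1:1:1 ratio."""
    if n_passengers % len(ARMS) != 0:
        raise ValueError("n_passengers must be divisible by the number of arms")
    allocation = list(ARMS) * (n_passengers // len(ARMS))
    random.Random(seed).shuffle(allocation)  # seeded so the sketch is reproducible
    return allocation


def sensitivity_specificity(allocation, stopped):
    """Estimate sensitivity and specificity of screening.

    allocation[i] is the arm of passenger i; stopped[i] is True when that
    passenger was asked to open the package (a positive test result).
    """
    tp = fp = tn = fn = 0
    for arm, was_stopped in zip(allocation, stopped):
        prohibited = arm != "sand"  # assumption: liquid and knife count as positives
        if prohibited and was_stopped:
            tp += 1
        elif prohibited:
            fn += 1
        elif was_stopped:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one prohibited and one control package")
    sensitivity = tp / (tp + fn)  # stopped among those carrying a prohibited item
    specificity = tn / (tn + fp)  # let through among those carrying sand
    return sensitivity, specificity
```

With 600 passengers, `allocate(600)` gives 200 packages per arm, and `sensitivity_specificity` would then be called with the stop or pass outcomes recorded by the undercover investigators. Treating the two prohibited items as a single positive class is a simplification; reporting sensitivity separately for liquids and knives would be equally reasonable.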
We have heard rumours that this sort of thing actually goes on—that agents occasionally carry illicit items through airport screening units to “test” them and identify gaps in security. Perhaps the evidence we are searching for is strong, but secret. And of course rigorous airport screening may have other benefits. It certainly deters the transport of any illicit object, such as less dangerous but equally unwanted plants, animals, or drugs. In addition, in the midst of mounting reports of thwarted terrorist attacks on airports, the process is comforting to frequent flyers and their families. Nevertheless, the absence of publicly available evidence to satisfy even the most basic criteria of a good screening programme concerns us.
Of course, we are not proposing that money spent on unconfirmed but politically comforting efforts to identify and seize water bottles and skin moisturisers should be diverted to research on cancer or malaria vaccines. But what would the National Screening Committee recommend on airport screening? As happened with mammography in the 1980s, and more recently with prostate specific antigen testing and computed tomography for detecting lung cancer, we would like to open airport security screening to public and academic debate. Rigorously evaluating the current system is just the first step to building a future airport security programme that is more user friendly and cost effective, and that ultimately protects passengers from realistic threats.
Thanks to Lorelei Mucci, Monica McGrath, Mike Stoto, and Pat Cox for useful discussions.
Contributors and sources: Eleni L and GC conceived and designed the study. All authors helped collect data and write and edit the manuscript. Eleni L is guarantor. GC has worked extensively on breast and colorectal cancer screening and advises the American Cancer Society on implementation of screening programmes.
Funding: NIH grant R25 CA098566 provided salary support for Eleni L. The funder had no role in study design; in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.
Competing interests: None declared.
Provenance and peer review: Not commissioned; externally peer reviewed.