Distribution of childhood leukaemias and non-Hodgkin's lymphomas near nuclear installations in England and Wales

BMJ 1994;309:501 (Published 20 August 1994)
J F Bithell, S J Dutton, G J Draper, N M Neary
Department of Statistics, University of Oxford, Oxford OX1 3TG
Childhood Cancer Research Group, University of Oxford, Oxford OX2 6HJ
Correspondence to: Dr Bithell.
Accepted 8 June 1994


Objective : To examine the relation between the risk of childhood leukaemia and non-Hodgkin's lymphoma and proximity of residence to nuclear installations in England and Wales.

Design : Observed and expected numbers of cases were calculated and analysed by standard methods based on ratios of observed to expected counts and by a new statistical test, the linear risk score test, based on ranks and designed to be sensitive to excess incidence in close proximity to a putative source of risk.

Setting : Electoral wards within 25 km of 23 nuclear installations and six control sites that had been investigated for suitability for generating stations but never used.

Subjects : Children below age 15 in England and Wales, 1966-87.

Main outcome measure : Registration of any leukaemia or non-Hodgkin's lymphoma.

Results : In none of the 25 km circles around the installations was the incidence ratio significantly greater than 1.0. The only significant results for the linear risk score test were for Sellafield (P=0.00002) and Burghfield (P=0.031). The circles for Aldermaston and Burghfield overlap; the incidence ratio was 1.10 in each. One of the control sites gave a significant linear risk score test result (P=0.020). All the tests carried out were one sided with P values estimated by simulation.

Conclusion : There is no evidence of a general increase of childhood leukaemia or non-Hodgkin's lymphoma around nuclear installations. Apart from Sellafield, the evidence for distance related risk is very weak.


The apparent excess of cases of childhood leukaemia in the village of Seascale near the nuclear reprocessing plant at Sellafield has been extensively investigated after the comprehensive report by the advisory group chaired by Sir Douglas Black.1 Although Sellafield is unique in the United Kingdom in respect of its reprocessing function and the level of discharges, public concern has prompted the investigation of the areas near other nuclear installations (including Dounreay,2 Aldermaston and Burghfield,3-5 Hinkley Point,6 and Winfrith7).

In an attempt to avoid the problems of analyses determined after the event various workers have systematically reviewed rates of leukaemia around all nuclear installations. In particular, Baron studied temporal changes in mortality in regions around such installations,8 while Cook-Mozaffari et al9, 10 and Forman et al11 have used various methods to study mortality and incidence rates in areas of England and Wales in relation to their proximity to nuclear sites. As small area data have only recently become available for childhood leukaemia, these studies used relatively large geographical areas, possibly masking any effect which is small or localised. On the other hand, small area analyses result in low counts of cases, which are hard to interpret by conventional analyses. We dealt with these problems by applying the results of recent methodological developments in the analysis of small area statistics to data from the National Registry of Childhood Tumours maintained by the Childhood Cancer Research Group.

Previous findings have provided no consistent evidence that any increased risk of childhood leukaemia might be concentrated in a particular age group. We therefore decided that our analyses should be based on registrations of all cases of leukaemia and non-Hodgkin's lymphoma in children aged under 15 years. This grouping is justified on the grounds that acute lymphoblastic leukaemia and non-Hodgkin's lymphoma may represent different manifestations of the same disease.


Sites considered

We considered 29 sites in England and Wales. Firstly, we examined eight of the 10 generating stations currently operated by Nuclear Electric plc (see table I). These were all started up before the beginning of our study period except for Wylfa, which started to operate in 1971; one of the eight study children in the vicinity of this installation was registered before this date. Secondly, we considered the seven installations operated by British Nuclear Fuels, the United Kingdom Atomic Energy Authority, or the Ministry of Defence which were operating during the study period (see table II). The third group comprised eight miscellaneous sites excluded from the two groups above on the grounds that emissions are believed to be small or that the operations started too late to affect most of the children in our study. This group includes the two remaining Nuclear Electric sites, Hartlepool and Heysham, which were commissioned in 1983 and 1984, respectively (see table III). Finally, we considered six sites which were investigated for suitability for a nuclear power station but where construction has never taken place (see table IV).

The final group served as a kind of control in view of the possibility that it might be some characteristic of sites for nuclear facilities that is responsible for the risk. Further details of the nature of some of these installations are given in Cook-Mozaffari et al.9, 10

We used regions of 25 km radius, principally because this accorded with previous studies; the size is partly influenced by the consideration of what constitutes a reasonable commuting distance for workers at the plants. The present paper is concerned solely with sites in England and Wales as a parallel study is being undertaken by the Information and Statistics Division of the Scottish Health Service; results relating to Scottish sites will appear in a separate paper.

Registration data

The data set analysed consisted of 11 283 cases of leukaemia and non-Hodgkin's lymphoma registered in children under the age of 15 in England, Wales, or Scotland between 1966 and 1987. These data have been extensively checked for completeness and accuracy, and ascertainment is estimated to have reached 95%. Observed numbers of cases were compared with expectations calculated by using population data for small areas. The details of these calculations are described in a technical report available from the authors, and we give here only a summary of the steps entailed.

Cases were assigned on the basis of their address at registration to areal units consisting of aggregations of 1971/1981 census tracts corresponding as closely as possible to 1981 electoral wards.

The annual numbers of children at risk at ages 0-4, 5-9, and 10-14 for each sex were estimated from population statistics provided by the Office of Population Censuses and Surveys and the General Registry Office (Scotland). From the same census data certain indicators of socioeconomic status for the populations in the wards were extracted.

A Poisson regression model12 was used to relate the observed counts of leukaemias and non-Hodgkin's lymphomas in each five year age group in each of the wards to the numbers of child years of risk, the registrar general's standard regions, and also various combinations of the socioeconomic indicators. As a result of this modelling we decided to use the combination of socioeconomic variables proposed by Townsend et al,13 which satisfactorily represented the dependence of risk on these factors, together with the standard region to calculate expected numbers for each of our 9836 (modified) wards.

The wards whose population centroids were within 25 km of a potential source of risk were then identified, and the distances of these centroids from the source were calculated.
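
The selection step described above can be sketched as follows. This is a minimal illustration only: it assumes that ward centroids and the source are held as planar grid coordinates in metres, and the function name, the coordinates, and the choice to treat exactly 25 km as inside the circle are all hypothetical rather than taken from the study.

```python
import math

def wards_within_radius(centroids, source, radius_km=25.0):
    """Return (ward_id, distance_km) pairs for centroids within radius_km, nearest first."""
    selected = []
    for ward_id, (easting_m, northing_m) in centroids.items():
        # Straight-line distance on a planar grid (assumed coordinate system).
        distance_km = math.hypot(easting_m - source[0], northing_m - source[1]) / 1000.0
        if distance_km <= radius_km:  # inclusive boundary is an assumption
            selected.append((ward_id, distance_km))
    return sorted(selected, key=lambda pair: pair[1])

# Hypothetical source location and ward centroids (metres).
source = (300000.0, 500000.0)
centroids = {
    "ward_a": (303000.0, 504000.0),
    "ward_b": (320000.0, 500000.0),
    "ward_c": (300000.0, 530000.0),
}
nearby = wards_within_radius(centroids, source)
```

Sorting by distance at this stage is convenient because the rank-based score and the tests that aggregate wards outward from the source both need the wards in order of proximity.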

The dates chosen were determined by the availability of the cancer registration data and their relevance to the periods of operation of the installations. As discussed above, all the registrations in each 25 km circle occurred after the start of operations at the installations in the first two groups, except for the case near Wylfa already referred to. The effect of ignoring this is only to reduce the power of the test marginally; the result for Wylfa will be seen to be far from significant anyway, so we are confident that we have committed no inferential error in this case.

Test statistics considered

When testing for the possibility of an excess risk near a point source S the number of possible tests is large. The most obvious choice is to look first at the incidence ratio in a 25 km circle around S. Malignant disease in children is rare, however, so that the numbers are generally fairly small. We could therefore not necessarily expect any effect of proximity to a nuclear installation to be reflected in a statistically detectable increase in incidence within such a region. For instance, the incidence in the 25 km vicinity of Sellafield is only marginally above that expected. We therefore sought a more sensitive test of spatial relation of cases to S, specifically one based on their distance or its rank. Although this clearly would not be ideal for all hypothetical mechanisms of excess risk, it should have a better chance of detecting a relation between distance and risk than a simple incidence analysis. It was, after all, the extreme proximity of the Seascale cases to Sellafield that attracted public attention.

A considerable amount of theoretical work has been carried out in connection with the current project to determine which test would be most appropriate. In particular, it is known how to obtain the best (most powerful) test for detecting a given pattern of risk. This does not, however, solve the problem as the type of pattern to be detected is in practice unknown.

Our simulation and theoretical work has led us to propose a linear risk score test in which we give to each case a score related to some putative measure of risk; such scores are then summed for all the cases in the region to give a total score. Risk scores considered were the reciprocal of the distance and the reciprocal of the distance rank of the ward under consideration. Thus, with the latter measure, a case in the nearest ward would be scored 1, a case in the second ward would be scored as 0.5, etc. Linear risk score tests are known to be most powerful against certain corresponding patterns of risk.14
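
As an illustration of the scoring just described, the following minimal sketch computes a linear risk score from ward-level case counts. The function names, the ward distances, and the case counts are hypothetical and are not taken from the study data; ties in distance are not handled.

```python
def reciprocal_rank_scores(distances):
    """Score 1/rank for each ward, where rank 1 is the ward nearest the source."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    scores = [0.0] * len(distances)
    for rank, ward in enumerate(order, start=1):
        scores[ward] = 1.0 / rank
    return scores

def reciprocal_distance_scores(distances):
    """Score 1/distance for each ward (distances must be positive)."""
    return [1.0 / d for d in distances]

def linear_risk_score(observed, scores):
    """Total score: every case contributes the score of the ward it falls in."""
    return sum(cases * score for cases, score in zip(observed, scores))

# Hypothetical wards: distance from the source (km) and observed cases.
distances_km = [2.0, 5.0, 9.0, 14.0]
observed = [2, 0, 1, 1]

rank_scores = reciprocal_rank_scores(distances_km)
t_rank = linear_risk_score(observed, rank_scores)
t_distance = linear_risk_score(observed, reciprocal_distance_scores(distances_km))
```

With these made-up numbers the two cases in the nearest ward each score 1, and the cases in the third and fourth wards score 1/3 and 1/4, so cases close to the source dominate the total.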

We also considered using two other tests, the maximum likelihood ratio and the Poisson maximum tests, proposed by Stone15 and Bithell and Stone,16 which have recently come to be generally advocated. The maximum likelihood ratio test entails obtaining the best estimates of the (relative) risk in each ward subject to the restriction that these estimates do not increase as we move away from S; the test then assesses how much better the data accord with these estimates than with the null hypothesis of constant risk. The Poisson maximum test is effectively based on the maximum value of the relative risk as we aggregate wards ordered by distance from S into a region of increasing size. These two tests are intended to be reasonably powerful against a wide class of alternative patterns of risk, provided at least that risk is supposed to decrease with distance. Their very generality, however, can be expected to militate against their optimality in any specific case.
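
The description of the Poisson maximum test can be made concrete with a short sketch. It follows only the verbal description given here (the largest ratio of cumulative observed to cumulative expected cases as wards are added in order of distance); the exact form used in the cited work may differ, and the function name and data are hypothetical.

```python
def max_cumulative_ratio(observed, expected, distances):
    """Largest observed/expected ratio over regions grown outward from the source."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    cumulative_observed = 0.0
    cumulative_expected = 0.0
    best = 0.0
    for ward in order:
        cumulative_observed += observed[ward]
        cumulative_expected += expected[ward]  # expectations assumed positive
        best = max(best, cumulative_observed / cumulative_expected)
    return best

# Hypothetical wards: observed cases, expected cases, distance from source (km).
observed = [2, 0, 1, 1]
expected = [0.5, 1.0, 1.5, 2.0]
distances_km = [2.0, 5.0, 9.0, 14.0]

t_max = max_cumulative_ratio(observed, expected, distances_km)
```

In this made-up example the cumulative ratios are 4.0, 1.33, 1.0, and 0.8, so the statistic is driven entirely by the nearest ward even though the whole region shows a deficit.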

Each of the four test statistics considered can be assessed for significance by simulation: a large number (n, usually 1000) of values of the statistic are generated as they would arise if the null hypothesis of no association were true. If k of these equal or exceed the value obtained from the real data then the ratio k/n is an unbiased estimate of the true significance level or P value that would be obtained if we knew the true null distribution of the statistic.
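
The k/n estimate can be sketched generically as below. The function name and the `simulate_null` callable are hypothetical; any procedure that draws one value of the chosen statistic under the null hypothesis could be supplied in its place.

```python
import random

def monte_carlo_p_value(t_observed, simulate_null, n=1000, seed=0):
    """Estimate a one sided P value as k/n, where k counts simulated values >= t_observed."""
    rng = random.Random(seed)  # fixed seed only so that the sketch is reproducible
    k = sum(1 for _ in range(n) if simulate_null(rng) >= t_observed)
    return k / n

# Toy null distribution (uniform on [0, 1)), used purely to exercise the function.
p_estimate = monte_carlo_p_value(0.95, lambda rng: rng.random(), n=1000)
```

Because the estimate is a proportion out of n, very small P values can be resolved only by increasing n, which is one reason to use many more simulations at a site where an extreme result is expected.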

The full arguments for and against these four test statistics (linear risk score tests based on rank and distance, Stone's maximum likelihood ratio, and Poisson maximum) will be discussed in a forthcoming paper. In summary, the linear risk score tests seem to be generally more powerful than Stone's tests for alternatives with only a moderate degree of non-uniformity of risk. Use of the reciprocal of distance in the linear risk score test would be particularly appropriate for detecting an environmental hazard declining with distance, and it also has the advantage that it is relatively insensitive to the precise location assumed for the risk source S. Use of rank, on the other hand, accords better with the notion that it may be relative proximity of residence that is important rather than actual measured distance, as for example where occupational exposure is important. Rank would, for instance, be a reasonable indicator of risk under any model which implies that a randomly selected worker is somewhat more likely to live in a nearer ward than a more distant one. It also has the advantage over the reciprocal of distance that it is less sensitive to variations in population distribution, though it is more sensitive to the precise location of S.

Choice of test

As a result of these studies we were faced with a choice among these test statistics, which might well have been influenced by our understanding of what they would actually achieve on the definitive data. We therefore prepared a theoretical paper discussing their relative merits but not including any results, apart from their performance at Sellafield and an indication of their sensitivity to the radius of the circular region used. We submitted this theoretical paper to a colleague with no familiarity with the data. We agreed to use his choice as the definitive test but at the same time to report also the results of a second test to the extent that they were different. This second test would be chosen to ensure that, for each of the four possible selections, analyses would be available that we felt would be regarded as appropriate by epidemiologists familiar with these types of data. The definitive test statistic chosen was the linear risk score test with the reciprocal of the distance rank; we also report the results of applying Stone's maximum likelihood ratio test.

Any of the tests considered could be carried out on either a conditional or an unconditional basis; the former assumes that the number of cases within a circular region is fixed and effectively ignores the extent to which the total number of cases observed exceeds or falls short of that expected. Such tests can certainly detect an interesting spatial pattern but may return a significant value resulting from a deficit below the calculated expectation in wards remote from the source. The unconditional tests, on the other hand, largely avoid this pitfall by taking into account the actual expectations. By this argument the unconditional tests are more appropriate whenever the registration data are thought to be reliable; we accordingly used unconditional tests and sampled from the relevant Poisson distributions in the calculation of the P values. Finally, we regard the tests as being one tailed as the alternative hypothesis of interest is that rates are higher closer to the source.
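
The difference between the two bases shows up in how null data sets are simulated. In the sketch below, the unconditional version draws each ward's count independently from a Poisson distribution with the ward's expectation as its mean, while the conditional version fixes the total and allocates cases to wards with probabilities proportional to the expectations. The function names and the expectations are hypothetical, and the simple Poisson sampler is an assumption suitable only for small means.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small ward-level expectations."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_unconditional(expected, rng):
    """Independent Poisson counts: the regional total is free to vary."""
    return [sample_poisson(e, rng) for e in expected]

def simulate_conditional(expected, total_cases, rng):
    """Fixed regional total, allocated in proportion to the expectations."""
    counts = [0] * len(expected)
    for ward in rng.choices(range(len(expected)), weights=expected, k=total_cases):
        counts[ward] += 1
    return counts

rng = random.Random(1)
expected = [0.5, 1.0, 1.5, 2.0]  # hypothetical ward expectations
unconditional_counts = simulate_unconditional(expected, rng)
conditional_counts = simulate_conditional(expected, total_cases=7, rng=rng)
```

Under the conditional scheme every simulated data set has the same total as the observed one, so an overall excess or deficit relative to expectation cannot contribute to the result; under the unconditional scheme it can.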


Tables I to IV show the results of applying the linear risk score test to the data on leukaemia and non-Hodgkin's lymphoma as described. For each of the entries in the tables a comparison of observed and expected counts shows that there is no evidence of an increase in incidence within 25 km of the sites considered; indeed none of the sites showed an excess incidence significant at the 5% level. Nor is there any evidence for a general effect of spatial proximity to the sites as judged by the unconditional linear risk score test; the only significant results were for Sellafield, Burghfield, and one of the control sites.


Table I: Details of observed and expected numbers of cases of childhood leukaemia and non-Hodgkin's lymphoma in 25 km regions around Nuclear Electric generating stations

Table II: Details of observed and expected numbers of cases of childhood leukaemia and non-Hodgkin's lymphoma in 25 km regions around other nuclear installations emitting non-negligible quantities of radioactivity in study period

Table III: Details of observed and expected numbers of cases of childhood leukaemia and non-Hodgkin's lymphoma in 25 km regions around other nuclear establishments believed to have emitted insignificant radioactivity in study period

Table IV: Details of observed and expected numbers of cases of childhood leukaemia and non-Hodgkin's lymphoma in 25 km regions around sites investigated regarding suitability for nuclear electricity generating stations


Sellafield

Sellafield (table II) returned a highly significant P value of 0.00002 with both the linear risk score and the maximum likelihood ratio tests; 100 000 simulations were used. The excess of cases in Seascale was, of course, well known beforehand and has been the subject of detailed investigation.17, 21 The result is entirely due to the six cases in Seascale; if this ward is omitted the linear risk score test returns a P value of 0.517. The reason for the excess remains unclear; we did not expect our tests to throw any further light on the question.

Hinkley Point

The linear risk score test result for Hinkley Point (table I) was not significant at the conventional 5% level (P=0.100); the P value with Stone's maximum likelihood ratio test was 0.139. This Nuclear Electric generating station has already been the subject of scientific investigation; Ewings et al reported 19 cases of leukaemia and non-Hodgkin's lymphoma in young people (up to age 25 years) within 12.5 km of the power station in the years 1964-86, compared with 10.4 expected.6 As pointed out by Taylor,18 however, their report to the Somerset Health Authority19 makes it clear that the excess is concentrated in the 15-24 year age group, which does not contribute to our study. Moreover, Ewings et al used national registration rates to calculate the expectation; Alexander et al have found higher rates in Somerset generally, and use of these must clearly reduce the rate ratio observed.20

Aldermaston and Burghfield

An excess incidence of childhood leukaemia in the vicinity of Aldermaston (table II) and Burghfield (table III) was reported by Roman et al3 and has also been fully investigated by the Committee on Medical Aspects of Radiation in the Environment.5 The excess originally reported is concentrated in the 0-4 age group and was subsequently found to extend also to other childhood cancers in this age group. Possible causes for this excess have been investigated by means of a case control study, the results of which do not account for the observed excess.4 Our data overlap with those analysed by Roman et al3 but cover a wider span of years. Both sites show an incidence ratio of 1.10 in our study, but this is not significant; moreover, the overlap of the circles means that the results for the two sites are not independent. The linear risk score test is significant beyond the 5% level for Burghfield (P=0.031 with 10 000 simulations) but not for Aldermaston (P=0.499). This suggests that if proximity to one or other source is the explanation for the excess incidence the source must surely be Burghfield. The fact that Burghfield has much lower levels of emission than Aldermaston makes such emissions a less plausible explanation of the raised incidence. Stone's maximum likelihood ratio test gave significance levels of 0.12 for Burghfield and 0.51 for Aldermaston.


Winfrith

The United Kingdom Atomic Energy Authority research station at Winfrith has also been the subject of previous scrutiny. After a report by the South Coast Radiation Elimination Action Movement (SCREAM) in 1984, the district medical officer for the East Dorset Health Authority investigated local incidence rates and concluded that it was “unlikely that the explanation [of excess incidence] lies with the presence of the Atomic Energy Establishment at Winfrith.”7 The contention of the action movement was that radioactive particles were blowing off the mud in Poole harbour; it was therefore concerned more with clusters of cases found to the north of Bournemouth. Our analyses did not, of course, address these aggregations of cases. Our results were not significant (table II; P=0.132 for the linear risk score test, 0.438 for the maximum likelihood ratio).

Potential site C

Of the six potential sites considered and shown in table IV, one was found to be significant beyond the 5% level (site C, P=0.020 for the linear risk score test with 10 000 simulations). This site was in a rural district near the coast, and the excess is due to a “cluster” of three cases in a neighbouring village. We regard this as a chance finding. None of the potential sites was significant beyond the 5% level when Stone's maximum likelihood ratio test was used; the value for site C was P=0.055. Furthermore, the circular region around the site showed no excess incidence overall.

In addition to testing the individual sites examined we carried out combined tests on each of the groups defined by tables I to IV. We show at the bottom of each table a P value based on the distribution of the minimum of the significance levels appearing in the body of the table; this is a refinement of the method usually known as Bonferroni's method. It will be seen that the only group of sites that shows overall significance by this criterion is that in table II, due to the excess near Sellafield. A similar adjustment to the group of all nuclear installations (tables I to III) gives a value of P=0.0005, a result which is still highly significant.
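
One common way of basing a combined P value on the minimum of m individual P values is sketched below. It assumes the m tests are independent with uniformly distributed P values under the null hypothesis, which is only approximately true here (the Aldermaston and Burghfield circles overlap), and it is not necessarily the exact calculation behind the tables. The function names are hypothetical.

```python
def min_p_adjusted(p_min, m):
    """P(min of m independent uniform P values <= p_min) = 1 - (1 - p_min)^m."""
    return 1.0 - (1.0 - p_min) ** m

def bonferroni_bound(p_min, m):
    """Bonferroni upper bound on the same probability, capped at 1."""
    return min(1.0, m * p_min)

# Smallest P value reported for the 23 nuclear installations (Sellafield).
combined = min_p_adjusted(0.00002, 23)
bound = bonferroni_bound(0.00002, 23)
```

For 23 installations and a smallest P value of 0.00002 this gives approximately 0.00046, which is consistent with the value of 0.0005 quoted above; the Bonferroni bound is marginally larger.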


This study was designed to determine whether there is any evidence in the vicinity of other nuclear installations in England and Wales of the type of effect observed at Sellafield. Because this effect is highly localised - there is in fact no significant increase in observed risk in the 25 km circle around the plant - our emphasis has been on using a statistical test capable of detecting this kind of concentration. The health impact of any such effect, however, would be better assessed in terms of the local incidence rate rather than by a test designed to detect spatial relation.

In fact, none of the sites we examined showed a significantly raised incidence ratio and, apart from Sellafield, the evidence of any spatial relation is also extremely weak. The only other installation to show a significant result over a 25 km circle for our chosen test - the linear risk score (1/rank) test - was Burghfield, a site which has been previously reported and at which radioactive emissions seem unlikely to be responsible for increased incidence. Given that 22 nuclear related sites apart from Sellafield are represented in tables I to IV, we should expect each test to give one result significant at the 5% level just by chance, even if there is no excess risk. The finding of a significant result at a potential site has no obvious explanation. The possibility that it might be a consequence of some special characteristic of areas typically chosen for nuclear installations is implausible as this would predict a general increase in incidence rather than a spatial relation to a virtually arbitrary point. It seems inescapable that this result is due to chance.

There has been much speculation and considerable research concerning the increased risk near Sellafield, which seems to be confined to the village of Seascale. A comprehensive analysis of incidence rates in this area has been given by Draper et al.17 This confirmed the excess of leukaemias and non-Hodgkin's lymphomas in Seascale but found no evidence of an increase in the two nearest county districts - Allerdale and Copeland (which contains Sellafield). The most detailed investigation into the causes of the increase in Seascale is the case control study carried out by Gardner et al.21 The authors concluded that their finding of an association between childhood leukaemia and paternal exposure before conception to relatively high doses of radiation could explain the geographical excess observed in Seascale. More recent studies have suggested that this association is not causal; in particular, Kinlen has argued that this factor would not explain the excess for children diagnosed in but born outside Seascale and thus that if there is a single cause it cannot be paternal preconceptional irradiation.22

The only other sites that have been studied in detail are Aldermaston/Burghfield4 and Dounreay23, 24 in Scotland; Dounreay is outside the scope of the present study. No coherent pattern or explanation has emerged from these studies. In particular, they have failed to confirm the finding of Gardner et al21 and, although the Aldermaston/Burghfield study produced some slight evidence of an increased incidence among the offspring of workers who wore radiation badges, there was no suggestion that the corresponding geographical excess could be so explained; moreover, the doses registered by the badges were very small.

Alternatives to radiation as an aetiological explanation of excess leukaemia near nuclear installations have been considered. Thus Cook-Mozaffari et al analysed data on mortality around sites where the construction of nuclear installations had been considered or had occurred at a later date (“potential sites”).25 They found that “excess mortality due to leukaemia and Hodgkin's disease in young people [0-24 years] who lived near potential sites was similar to that in young people who lived near existing sites” and suggested that “existing and potential sites might share unrecognised risk factors.” This suggestion provided the motivation for our inclusion of the potential sites in table IV; as argued above, however, the hypothesis would predict an increase in general incidence rather than a spatial relation such as we found in one case. More recently Kinlen in a series of papers has tested the hypothesis that the incidence of childhood leukaemia can be increased by (mainly rural) population mixing.24, 26

Whatever the reason for the excess incidence at Seascale it seems clear from our study that there is virtually no convincing evidence for a geographical association of childhood leukaemia and non-Hodgkin's lymphoma with nuclear installations in general. It is generally more difficult to draw negative conclusions from epidemiological data, if only because negative results may be due to inappropriate methods or to the use of tests with low power. In the present study it might, for instance, be argued that a more appropriate analysis would be one based on place of birth rather than diagnosis, as an analysis based on place of diagnosis may fail to detect an effect of prenatal or preconception factors. Unfortunately, a small area analysis for birth data is at present not possible.

The scope of geographical studies which are restricted to analysing the locations of cases of a disease is obviously limited, and there is always a case for following up positive results by more detailed study of the individuals concerned where this is feasible. As a general approach to a large number of possible sources of risk, however, geographical studies can provide useful general pointers. They are in line with the popular perception that clusters of cases near particular putative risk sources are important; as long as there is public concern about such clusters on the basis of mere proximity there is a case for careful statistical analyses of such proximity.

We thank Nuclear Electric plc for financial assistance in carrying out this work and our colleagues there for numerous useful discussions. We also express our appreciation of the help in data acquisition and preparation that we received from the staff of the Childhood Cancer Research Group, particularly Mr Tim Vincent, from the United Kingdom Children's Cancer Study Group, and from cancer registration staff. Finally we thank our statistical colleague for helping to make objective the choice of test used and Dr Leo Kinlen for his helpful comments on earlier drafts of this paper. The Childhood Cancer Research Group is supported by the Department of Health and the Scottish Home and Health Department; part of the work of the present study was supported by the National Radiological Protection Board.

