BMJ 1995;310:1511-1514 (10 June)

Education and debate

Using data from the 1991 census

F Azeem Majeed, lecturer in public health medicine (a); Derek G Cook, senior lecturer in epidemiology (a); Jan Poloniecki, lecturer in medical statistics (a); David Martin, lecturer (b)

(a) Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE; (b) Department of Geography, University of Southampton, Southampton SO17 1BJ

Correspondence to: Dr Majeed.


TABLE II--Approximate 95% confidence intervals for the difference between a sum
from a table containing modified cells and the corresponding actual value
--------------------------------------------------
No of modified   Local base    Small area
cells            statistics    statistics
--------------------------------------------------
5                +/-3          +/-2
10               +/-4          +/-3
20               +/-6          +/-4
50               +/-10         +/-7
100              +/-15         +/-10
200              +/-20         +/-15
500              +/-35         +/-25
--------------------------------------------------
Source: 1991 census user guide 48.



Box 3--Effect of data modification on small area statistics
SAS table 12--Long term illness in households: residents in households with limiting long term illness. Data taken from an enumeration district in Wandsworth Health Authority
---------------------------------------------------------------
Age      Total persons      Males     Females
---------------------------------------------------------------
All ages      35            17         18
0-15           2             1          1
16-29          3             2          1
30-44          3             2          1
45-59          6             3          3
60-64          4             2          2
65-74          8             4          4
≥75            9             3          6
---------------------------------------------------------------
Cells in SAS tables are modified by randomly adding
+1, -1, or 0. The number of people aged 65-74 with
long term illness is the sum of two cells (males and
females), each of which has been modified in this way.
The published figure (8) may therefore differ from the
true total by up to +/-2. The total number of people with
long term illness in this enumeration district (35) is the
sum of 14 cells and may differ from the true total by up
to +/-14; the true number of people with long term illness
living in this enumeration district therefore lies between
21 and 49. However, extreme results are unlikely, and there
is a 95% probability that the true total lies between 29
and 41.
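
The arithmetic in box 3 can be reproduced with a short calculation. The sketch below is illustrative only. It assumes that +1, 0, and -1 are added independently and with equal probability (the box does not state the probabilities that OPCS used), and the function name `modification_bounds` is ours. Under that assumption each adjustment has variance 2/3, and a normal approximation gives the 95% range.

```python
import math

def modification_bounds(published_total, n_cells, z=1.96):
    """Return (worst_case_range, approx_95_range) for a sum of modified cells.

    Assumption (not stated in the source): each cell was adjusted
    independently by +1, 0, or -1 with equal probability, so each
    adjustment has mean 0 and variance 2/3.
    """
    # Worst case: every cell was shifted by 1 in the same direction.
    worst = (published_total - n_cells, published_total + n_cells)
    # Normal approximation to the sum of n_cells independent adjustments.
    sd = math.sqrt(n_cells * 2.0 / 3.0)
    half_width = round(z * sd)
    approx_95 = (published_total - half_width, published_total + half_width)
    return worst, approx_95

# Box 3: 35 people with long term illness, summed over 14 modified cells.
worst, approx_95 = modification_bounds(35, 14)
print(worst)      # (21, 49)
print(approx_95)  # (29, 41)
```

The intervals in table II are taken from the census user guide and need not agree with this simplified model.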


DATA SUPPRESSION FOR AREAS WITH SMALL POPULATIONS
Small area statistics are released only for enumeration districts with 50 or more residents and 16 or more households. For enumeration districts below either of these thresholds, the only data available are the total number of people present, the total number of residents, and the total number of resident households. All other small area statistics for these enumeration districts are combined with those for a neighbouring enumeration district, provided that the combined numbers of residents and households meet the minimum thresholds. Local base statistics are released only for wards with 1000 or more residents and 320 or more households. As with enumeration districts, data from wards that fall below either threshold are combined with data from a neighbouring ward. Enumeration districts or electoral wards that contain data from neighbouring areas are categorised as importing zones.
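
The release rules above amount to a pair of threshold checks. The following minimal sketch encodes them; the names `THRESHOLDS`, `is_released`, and `can_import` are hypothetical, and the way OPCS chose the neighbouring area is not described here, so that step is left out.

```python
# Thresholds quoted in the text: (minimum residents, minimum households).
THRESHOLDS = {
    "enumeration_district": (50, 16),   # small area statistics
    "ward": (1000, 320),                # local base statistics
}

def is_released(area_type, residents, households):
    """True if full statistics are released for the area on its own."""
    min_residents, min_households = THRESHOLDS[area_type]
    # An area that falls below either threshold is suppressed.
    return residents >= min_residents and households >= min_households

def can_import(area_type, residents, households, nbr_residents, nbr_households):
    """True if a suppressed area combined with a neighbour meets both thresholds."""
    return is_released(area_type,
                       residents + nbr_residents,
                       households + nbr_households)

# Hypothetical enumeration district with 48 residents in 20 households.
print(is_released("enumeration_district", 48, 20))          # False
print(can_import("enumeration_district", 48, 20, 120, 45))  # True
```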
SPECIAL ENUMERATION DISTRICTS
Enumeration districts that were expected to contain 100 or more people living in one or more communal establishments (establishments in which some kind of communal catering is provided, such as nursing homes, hospitals, and prisons) on the night of the census were defined as special enumeration districts. The residents of such establishments often have different social and demographic characteristics from people living in the surrounding area. Treating these areas as special enumeration districts prevents the residents of the establishments from distorting the small area statistics for the enumeration district in which the establishments are located. The total population and total number of households for special enumeration districts are published, and if the enumeration district has 50 or more residents and 16 or more households, the other small area statistics are also published. If the number of residents or households falls below either of these thresholds, these other small area statistics are suppressed. The small area statistics for special enumeration districts are not combined with those for a neighbouring enumeration district but are included in electoral ward totals.
SHIPPING ENUMERATION DISTRICTS
Every electoral ward contains one shipping enumeration district (a separate category of special enumeration district). Apart from houseboats on inland waterways (which are classified as households), vessels are treated in a similar way to communal establishments, with data on people enumerated while living on vessels published in a separate shipping enumeration district. For most areas, there are no residents of shipping enumeration districts, and no data will be present for shipping enumeration districts.
DATA AGGREGATION
Because of data modification and data suppression, statistics derived by aggregating data for enumeration districts will differ from counts obtained directly from the small area statistics or local base statistics for the larger area. For example, if enumeration district counts of people with limiting long term illness are summed to calculate the total number of such people living in a health authority, this total will differ from the total obtained directly from the local base statistics or small area statistics for the health authority. The difference arises because many people with limiting long term illness may live in special enumeration districts for which census data are suppressed, and also because of OPCS's policy of adding +1, -1, or 0 to counts in enumeration district tables. To minimise the effect of this suppression and modification of data, census data should be taken from tables for the largest geographical area of interest. For example, if census data for a health authority are required, they should be obtained from a table of census data for the health authority and not by aggregating data for enumeration districts or electoral wards within the health authority.
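
The effect of modification on aggregated totals can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of OPCS's procedure: the true counts are invented, +1, 0, and -1 are taken to be equally likely, the directly tabulated total is assumed to be modified once, and suppression is ignored. The names `modify` and `compare_totals` are ours.

```python
import random

def modify(count, rng):
    """Add +1, 0, or -1 to a count (equal probabilities are an assumption)."""
    return count + rng.choice((-1, 0, 1))

def compare_totals(true_counts, seed=0):
    """Compare a sum of modified district counts with a directly modified total."""
    rng = random.Random(seed)
    true_total = sum(true_counts)
    # Route 1: each district count is modified, then the counts are summed.
    aggregated = sum(modify(c, rng) for c in true_counts)
    # Route 2: the area total is tabulated once and modified once.
    direct = modify(true_total, rng)
    return true_total, aggregated, direct

# Hypothetical true counts for 200 enumeration districts.
data_rng = random.Random(1)
counts = [data_rng.randint(0, 40) for _ in range(200)]

true_total, aggregated, direct = compare_totals(counts)
print("error when aggregating districts:", aggregated - true_total)
print("error when using the area table: ", direct - true_total)
```

In this model the direct total can be wrong by at most 1, whereas the aggregated total can in principle be wrong by as much as the number of districts summed.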
UNDERENUMERATION
There was a greater problem with underenumeration in the 1991 census than in previous censuses: no census data were obtained for 2.2% of the population. Although this is a relatively small percentage of the population, underenumeration was not random and was highest in inner city areas and among men aged 20-29 years; about 9% of men in this age group nationally, and nearly 20% in inner London, were not enumerated. Underenumeration will therefore lead to errors in census data, especially for inner London, and to a lesser extent for other inner city areas.
MIGRATION
The census is carried out every 10 years, and because of inward and outward migration, 1991 census data for small areas will become progressively more inaccurate with time. OPCS does attempt to estimate the effects of internal migration within England and Wales by using data from family health services authority age-sex registers, and of international migration by using the international passenger survey. However, these estimates are unreliable at small area level.
Using postcodes to assign census data to individuals
One of the most common ways in which census data are used is to estimate the social and ethnic characteristics of people on the basis of their postcode. This is usually done by assigning a postcode to an enumeration district and using the census data for this enumeration district to characterise people living at the postcode. This process requires a method to assign postcodes to enumeration districts. After the 1981 census, the most common method was to use the grid references of postcodes and enumeration districts. However, Gatrell and Reading and Openshaw suggested that this method is often inaccurate, with about 40% of people being assigned to an incorrect enumeration district. This applied in England and Wales; in Scotland, postcodes are mapped directly to enumeration districts, with no boundary problems.
To overcome this problem and improve the geographical referencing of data from the 1991 census, the postcode of each household and communal establishment was collected on the census form and entered on the computer record. This allowed OPCS to produce a new look up table (the postcode-enumeration district directory) by directly linking the postcode on the census form to the enumeration district in which that postcode fell. However, some postcodes lie in more than one enumeration district, which can make it difficult to assign a postcode to an enumeration district. To help overcome this problem, the look up table contains some additional information: the "pseudo-enumeration district" of the postcode (the enumeration district in which most people at that postcode live); and the number of households in each postcode-enumeration district intersection (known as a part postcode unit). The table also contains the Ordnance Survey grid reference of the postcode, but only to a precision of 100 metres. The mapping of postcodes should improve in the future as a result of the Ordnance Survey's Address-Point initiative, which aims to allocate a grid reference to every address in England and Wales to an accuracy of one metre. The main limitation of the look up table is that it contains only the postcodes that existed at the time of the 1991 census and will become progressively more out of date as new postcodes are created (for example, because of the construction of new residential estates).

Key messages

  • Census data have an important role in helping to plan health services and in monitoring how health services are being used, and they are freely available to both NHS employees and employees of academic institutions

  • Census data for small areas contain inaccuracies, but the effect of these errors is substantially reduced by aggregating data or by using data for larger areas

  • Data from the 1991 census were used to produce a new postcode to enumeration district look up table, which overcomes many of the problems encountered with older methods of assigning postcodes to enumeration districts

ASSIGNING ENUMERATION DISTRICTS TO INDIVIDUALS
Assigning an enumeration district to an individual living at a postcode which relates to only one enumeration district is straightforward. However, if the postcode relates to more than one enumeration district then assigning an enumeration district can be done in two ways: by using the number of households in a part postcode unit to determine the probability of individuals living in each part postcode unit (this is the preferred option); or by assigning all individuals living at a postcode to its pseudo-enumeration district (box 4).

Box 4--Obtaining weighted average of census data for individuals by using their postcode
[Diagram: postcode SW8 4JD lies entirely within enumeration district ANFM12; postcode SW8 4HX lies partly in enumeration district ANFM12 and partly in enumeration district ANFM16]
------------------------------------------------------------------------------------------------
                                                                    Proportion of people living
                                                                    in households without a car
           Enumeration   Pseudo-enumeration   No of                 In enumeration   In postcode
Postcode   district      district             households   Weight   district         (estimated)
------------------------------------------------------------------------------------------------
SW8 4JD    ANFM12        ANFM12               5            1.00     0.76             0.76
SW8 4HX    ANFM12        ANFM12               17           0.85     0.76             0.75
SW8 4HX    ANFM16        ANFM12               3            0.15     0.68             0.75
------------------------------------------------------------------------------------------------
Consider two postcodes, SW8 4JD and SW8 4HX, for which we wish to know the
likelihood that people living at these postcodes are living in households without a car.
The intersection between postcode SW8 4JD and the enumeration district ANFM12 is
known as a part postcode unit. Because SW8 4JD lies entirely within enumeration
district ANFM12, it produces only one part postcode unit, and the people living at this
postcode have a 0.76 probability of living in a household without a car.
 Postcode SW8 4HX lies in two enumeration districts, ANFM12 and ANFM16, and
hence produces two part postcode units. There are 20 households in this postcode, of
which 17 (85%) lie in enumeration district ANFM12 and three (15%) in enumeration
district ANFM16. The probability of living in a household without a car at this
postcode is therefore estimated as 0.75 (= (0.76 × 0.85) + (0.68 × 0.15)). An alternative
approach is to use data from the pseudo-enumeration district for the postcode
(ANFM12), which gives an estimate of 0.76.
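
The two approaches described in box 4 can be written as a short calculation. This is a minimal sketch using the three rows of the box; the names `PART_POSTCODE_UNITS`, `weighted_estimates`, and `pseudo_district_estimates` are ours. As a simplification, the pseudo-enumeration district is taken to be the district holding most of the postcode's households, whereas the look up table supplies it directly.

```python
from collections import defaultdict

# Rows from box 4: (postcode, enumeration district, households in the
# part postcode unit, proportion without a car in that enumeration district).
PART_POSTCODE_UNITS = [
    ("SW8 4JD", "ANFM12", 5, 0.76),
    ("SW8 4HX", "ANFM12", 17, 0.76),
    ("SW8 4HX", "ANFM16", 3, 0.68),
]

def weighted_estimates(units):
    """Household-weighted average of the district value for each postcode."""
    households = defaultdict(int)
    weighted_sum = defaultdict(float)
    for postcode, _district, n_households, value in units:
        households[postcode] += n_households
        weighted_sum[postcode] += n_households * value
    return {pc: weighted_sum[pc] / households[pc] for pc in households}

def pseudo_district_estimates(units):
    """Value for the district holding most of each postcode's households."""
    best = {}
    for postcode, _district, n_households, value in units:
        if postcode not in best or n_households > best[postcode][0]:
            best[postcode] = (n_households, value)
    return {pc: value for pc, (_n, value) in best.items()}

for pc, estimate in weighted_estimates(PART_POSTCODE_UNITS).items():
    print(pc, round(estimate, 2))
# SW8 4JD 0.76
# SW8 4HX 0.75
```

For SW8 4HX the weighted estimate (0.748 before rounding) and the pseudo-enumeration district estimate (0.76) are close because 85% of the households lie in one district; the two approaches diverge more when a postcode is split more evenly between districts with different characteristics.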


