Information In Practice

The diabetes audit and research in Tayside Scotland (DARTS) study: electronic record linkage to create a diabetes register

BMJ 1997; 315 doi: (Published 30 August 1997) Cite this as: BMJ 1997;315:524
  1. Andrew D Morris, senior lecturer (a),
  2. Douglas IR Boyle, computer programmer (b),
  3. Ritchie MacAlpine, research nurse (b),
  4. Alistair Emslie-Smith, general practitioner (c),
  5. Roland T Jung, consultant physician (d),
  6. Ray W Newton, consultant physician (d),
  7. Thomas M MacDonald, clinical reader

    for the DARTS/MEMO Collaboration

  (a) University Department of Medicine, Ninewells Hospital and Medical School, Dundee DD1 9SY
  (b) Medicines Monitoring Unit, Ninewells Hospital and Medical School, Dundee
  (c) Wallacetown Health Centre, Dundee DD4 6RD
  (d) Diabetes Centre, Ninewells Hospital and Medical School, Dundee
  Correspondence to: Andrew D Morris
  • Accepted 19 June 1997


Objectives: To identify all patients with diabetes in a community using electronic record linkage of multiple data sources and to compare this method of case ascertainment with registers of diabetic patients derived from primary care.

Design: Electronic capture-recapture linkage of records included data on all patients attending hospital diabetes clinics, all encashed prescriptions for diabetes related drugs and monitoring equipment, all patients discharged from hospital, patients attending a mobile unit for eye screening, and results for glycated haemoglobin and plasma glucose concentrations from the regional biochemistry database. Diabetes registers from primary care were from a random sample of eight Tayside general practices. A detailed manual study of relevant records for the 35 144 patients registered with these eight general practices allowed for validation of the case ascertainment.

Setting: Tayside region of Scotland, population 391 274 on 1 January 1996.

Main outcome measures: Prevalence of diabetes; population of patients identified by different data sources; sensitivity and positive predictive value of ascertainment methods.

Results: Electronic record linkage identified 7596 diabetic patients, giving a prevalence of known diabetes of 1.94% (0.21% insulin dependent diabetes, 1.73% non-insulin dependent): 63% of patients had attended hospital diabetes clinics, 68% had encashed diabetes related prescriptions, 72% had attended the mobile eye screening unit, and 48% had biochemical results diagnostic of diabetes. A further 701 patients had isolated hyperglycaemia (plasma glucose >11.1 mmol/l) but were not considered diabetic by general practitioners. Validation against the eight general practices (636 diabetic patients) showed electronic linkage to have a sensitivity of 0.96 and a positive predictive value of 0.95 for ascertainment of known diabetes. General practice lists had a sensitivity of 0.91 and a positive predictive value of 0.98.

Conclusions: Electronic record linkage was more sensitive than general practice registers in identifying diabetic subjects and identified an additional 0.18% of the population with a history of hyperglycaemia who might warrant screening for undiagnosed diabetes.

Key messages

  • It has been recommended that regional registers of patients with diabetes are established in order to facilitate effective monitoring and treatment of diabetes

  • In Tayside we created a diabetes register by record linkage of multiple data sources: all patients attending hospital diabetes clinics, all encashed prescriptions for diabetes related drugs and monitoring equipment, all patients discharged from hospital, patients attending a mobile unit for eye screening, and results for glycated haemoglobin and plasma glucose concentrations from the regional biochemistry database

  • This register identified 7596 patients with diabetes in Tayside, giving a prevalence of diabetes of 1.94%

  • Record linkage was more sensitive than general practice registers in ascertaining cases of known diabetes

  • A unique patient identifier, the community health number, was fundamental for successful record linkage


Identification of all diabetic patients in the population is essential if diabetes care is to be effective in achieving the targets of the St Vincent declaration.1 Registers of patients with insulin dependent diabetes are relatively common,2 but there are few comprehensive registers of non-insulin dependent diabetes in the United Kingdom. The impact of non-insulin dependent diabetes has been grossly underestimated in the past, a fact highlighted by the recent report of the King's Fund Policy Institute commissioned by the British Diabetic Association.3 The challenge is therefore to establish population based monitoring and control systems by means of state of the art information technology in order to achieve quality assurance of the provision of health care for diabetic patients.4

The conventional approach to creating a diabetes register is by aggregating records held by general practices of patients with diabetes5 6 or by integrating general practice registers with lists of patients who attend hospital diabetes clinics.7 An alternative approach is central linkage of records specific for diabetes. The relative merits of registers derived from community sources (“grass roots”) and those abstracted and held centrally are open to debate. One aim of the diabetes audit and research in Tayside Scotland (DARTS) study was to test the hypothesis that linkage of electronic records from multiple independent sources is an efficient and more effective method than general practice lists for identifying all diabetic patients.

Subjects and methods

The DARTS study is a joint initiative of the Department of Medicine and the Medicines Monitoring Unit (MEMO) at the University of Dundee, the diabetes units at three Tayside healthcare trusts (Ninewells Hospital and Medical School, Dundee; Perth Royal Infirmary; and Stracathro Hospital, Brechin), and a large group of Tayside general practitioners with an interest in diabetes care. Tayside is a geographically compact region in which health care is administered by 278 general practitioners in 78 practices and three healthcare trusts. The 391 274 residents of Tayside alive on 1 January 1996 were used as the basis for the study.

Patient identification

Every patient who is registered with a general practitioner in Scotland is allocated a unique identifying number called the community health number. This consists of 10 digits, with the first six digits being the date of birth. Every resident of Tayside who is thus registered appears in the centrally held, continuously updated, computerised record, the Community Health Master Patient Index. This file contains data on patients' address, postcode, general practitioner, death, and date of death. Thus, the demographic breakdown of the Tayside population, deaths, and patient migration can be easily analysed using these data. The community health number is used as the patient identifier in all healthcare activities in Tayside, both in primary and secondary care.
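As a minimal sketch of how the structure of the community health number could be checked before linkage, the Python function below confirms that an identifier has 10 digits and splits off the six date of birth digits. The function name `parse_chn`, the example identifier, and the DDMMYY ordering of the date digits are assumptions made for illustration; the text above states only that the first six digits are the date of birth.

```python
import re


def parse_chn(chn: str) -> dict:
    """Split a 10-digit community health number into its parts.

    Assumption (not stated in the article): the six date-of-birth
    digits are ordered DDMMYY.
    """
    digits = chn.replace(" ", "")
    if not re.fullmatch(r"\d{10}", digits):
        raise ValueError("community health number must have exactly 10 digits")
    return {
        "day": int(digits[0:2]),
        "month": int(digits[2:4]),
        "year_2digit": int(digits[4:6]),
        "suffix": digits[6:],  # remaining four digits, not interpreted here
    }


# Made-up identifier, used only to show the call.
example = parse_chn("0101501234")
```

Rejecting malformed identifiers before linkage keeps records with mistyped numbers from being silently matched to the wrong person.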

Data sources for electronic record linkage

We used eight independent data sources to maximise complete ascertainment of cases of diabetes.

Diabetes prescriptions database generated by the Medicines Monitoring Unit—This unit, which is a university based organisation supported by the Medicines Control Agency, has been described in detail elsewhere.8 9 Briefly, it has devised a method of capturing person specific dispensing for the whole of Tayside and, since January 1993, has recorded over 10 million prescription items specified by community health number. Of these items, we identified all prescriptions for antidiabetic drugs (insulin, sulphonylureas, biguanides, and α-glucosidase inhibitors) and for diagnostic and monitoring devices for diabetes (such as test strips and meters).

Hospital diabetes clinics—We integrated four databases: those of diabetes clinics from Ninewells Hospital, Dundee; Stracathro Hospital, Brechin; and Perth Royal Infirmary as well as that of a young adult and paediatric clinic at Ninewells Hospital. These sources contain data on the monitoring and outcome of diabetes that conform to the United Kingdom diabetes dataset.7

Data from a mobile diabetes eye unit that has operated within Tayside since 1990 to perform community screening for diabetic retinopathy.10 Information on type and duration of diabetes is collected routinely. Every general practice in Tayside is invited to refer all patients with diabetes for pre-arranged retinal photography.

The regional biochemistry database—We analysed the records of all concentrations of glycated haemoglobin, plasma glucose, urinary microalbumin, and serum creatinine dating back to 1989. Identifying those measurements that were requested by a maternity unit allowed us to ascertain gestational diabetes. We accepted a diagnosis of diabetes if a patient had (a) an oral glucose tolerance test confirming the diagnosis of diabetes or (b) two or more random outpatient plasma glucose concentrations of greater than 11.1 mmol/l.
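The two biochemical criteria above can be expressed as a short rule. The following Python sketch is illustrative only: the function name and the way the inputs are represented (a flag for the glucose tolerance test and a list of random outpatient glucose values in mmol/l) are assumptions, not a description of the study software.

```python
from typing import Sequence

GLUCOSE_THRESHOLD_MMOL_L = 11.1  # random plasma glucose cut-off given in the text


def meets_biochemical_criteria(
    ogtt_confirms_diabetes: bool,
    random_outpatient_glucose: Sequence[float],
) -> bool:
    """Return True if either stated criterion is met.

    (a) an oral glucose tolerance test confirming diabetes, or
    (b) two or more random outpatient plasma glucose values above 11.1 mmol/l.
    """
    raised = [g for g in random_outpatient_glucose if g > GLUCOSE_THRESHOLD_MMOL_L]
    return ogtt_confirms_diabetes or len(raised) >= 2
```

Note that the comparison is strict, so a value of exactly 11.1 mmol/l does not count towards criterion (b).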

The Scottish morbidity record (SMR1), which is the list of hospital discharges for all hospitals in Tayside. In Tayside there are 63 000 hospital discharges each year, and the Medicines Monitoring Unit holds the Tayside portion of the Scottish morbidity record dating back to 1980. We created a computerised list of all patients resident in Tayside discharged with a primary or secondary diagnosis of diabetes (ICD code 250.0 (international classification of diseases)). For the purpose of this study, we classified diabetic patients as having insulin dependent diabetes if they were aged up to 35 years at diagnosis and were treated with insulin and having non-insulin dependent diabetes if they were treated by diet alone or oral hypoglycaemic agents or if they were aged over 35 at diagnosis, irrespective of treatment.
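The classification rule for type of diabetes described above can be written as a small function. This is a sketch under stated assumptions: the function name and the three treatment labels (`"insulin"`, `"oral"`, `"diet"`) are hypothetical, chosen to mirror the treatment categories named in the text.

```python
def classify_diabetes_type(age_at_diagnosis: int, treatment: str) -> str:
    """Apply the study's stated rule for type of diabetes.

    Insulin dependent: aged up to 35 years at diagnosis and treated with insulin.
    Non-insulin dependent: treated by diet alone or oral agents, or aged over 35
    at diagnosis irrespective of treatment.
    """
    if treatment not in {"insulin", "oral", "diet"}:
        raise ValueError(f"unknown treatment: {treatment!r}")
    if age_at_diagnosis <= 35 and treatment == "insulin":
        return "insulin dependent"
    return "non-insulin dependent"
```

Because the rule keys on age at diagnosis, a patient diagnosed at 40 and treated with insulin is classified as non-insulin dependent.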

Diagnostic algorithm and database categorisation

The cut off date for the study was 1 January 1996. We used a hierarchical capture-recapture11 diagnostic algorithm to integrate each data source, with the community health number as the common patient identifier. Dedicated software (Microsoft Access) was written to define date of birth, sex, year of diagnosis, type of diabetes, duration of diabetes, method of treatment (insulin, oral therapy, or diet alone), and attendance at a hospital outpatient clinic for diabetes. Patients were defined as having definite diabetes if they met the criteria of the diagnostic algorithm. Other defined categories of dysglycaemia were gestational diabetes (fasting plasma glucose >5.5 mmol/l or plasma glucose >9 mmol/l at 2 hours after a 75 g oral glucose tolerance test12) and stress hyperglycaemia (random plasma glucose >11.1 mmol/l during an emergency inpatient admission).
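The core idea of linking sources on a shared identifier can be sketched in a few lines of Python. This is a simplified union by community health number, not the hierarchical algorithm used in the study, whose individual rules are not given in the text; the function names, source names, and identifiers below are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def link_sources(sources: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each identifier to the set of data sources that captured it."""
    captured_by = defaultdict(set)
    for source_name, identifiers in sources.items():
        for chn in identifiers:
            captured_by[chn].add(source_name)
    return dict(captured_by)


def share_in_two_or_more(captured_by: Dict[str, Set[str]]) -> float:
    """Proportion of linked patients who appear in at least two sources."""
    if not captured_by:
        return 0.0
    multi = sum(1 for names in captured_by.values() if len(names) >= 2)
    return multi / len(captured_by)


# Toy data: three sources and three made-up identifiers.
toy_sources = {
    "prescriptions": {"A", "B"},
    "clinics": {"B", "C"},
    "eye_unit": {"B"},
}
linked = link_sources(toy_sources)
```

Keeping the list of capturing sources for each patient, rather than a simple yes or no flag, is what makes it possible to report how many patients were found by two or more sources.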

Validation of electronic record linkage

To investigate the performance of the electronic record linkage, we performed a detailed manual validation of hospital records, biochemistry data, and general practice case records for all patients registered with eight randomly selected Tayside general practices (three rural and five inner city practices, 28 partners, 35 144 patients). We entered the validation data into a purpose written clinical information system. In cases where the diagnosis of diabetes remained uncertain, the decision was made by an independent investigator, who applied the World Health Organisation's criteria for the diagnosis of diabetes.13

We also recorded separately the number of patients who were registered by their general practitioner as being diabetic.

Statistical analysis

We obtained denominators for the prevalence of diabetes from the community health master patient index. We calculated the sensitivity and positive predictive value of the electronic record linkage and general practice records and expressed the measure of agreement between the two approaches in ascertaining diabetes as κ.14
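For readers unfamiliar with these measures, the Python sketch below shows how sensitivity, positive predictive value, and Cohen's κ are computed from counts. The function names are hypothetical and the counts in the example are illustrative only; they are not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of true cases that the method identified."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Proportion of identified cases that were true cases."""
    return true_pos / (true_pos + false_pos)


def cohen_kappa(both_yes: int, first_only: int, second_only: int, both_no: int) -> float:
    """Cohen's kappa for agreement between two yes/no classifications."""
    n = both_yes + first_only + second_only + both_no
    observed = (both_yes + both_no) / n
    # Agreement expected by chance, from the marginal totals.
    expected = (
        (both_yes + first_only) * (both_yes + second_only)
        + (second_only + both_no) * (first_only + both_no)
    ) / n**2
    return (observed - expected) / (1 - expected)


# Illustrative counts only, not study data.
example_kappa = cohen_kappa(both_yes=40, first_only=10, second_only=5, both_no=45)
```

With the illustrative counts, observed agreement is 0.85 and chance agreement is 0.50, so κ is 0.70.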

Ethical approval

The study was approved by the Tayside research and ethics committee. The databases of the DARTS study are registered under the Data Protection Act for purposes of research and audit only. At all times confidentiality of individual patients' and individual general practices' data was maintained. All electronic media containing patient specific data meet the requirements of health service data protection and encryption standards.15 16 17


Results

Among the 391 274 residents of Tayside on 1 January 1996, electronic record linkage identified 7596 people with known diabetes. Of these, 4013 were male and 3583 were female, with mean (SD) ages of 60 (17) years and 64 (18) years respectively. The mean (SD) duration of diabetes was 6.0 (6.7) years. Figure 1 shows the age breakdown for insulin dependent and non-insulin dependent diabetes; the crude prevalence of diabetes in Tayside was 1.94%, and 5.3% of the population aged over 55 had diabetes. Table 1 shows the number of patients with known diabetes identified by each data source: 5141 were identified from encashed prescriptions, 4816 from hospital clinics, 5484 from the mobile eye unit, 3648 from the biochemistry database, and 2563 from hospital discharges. Eighty four per cent of these patients were identified by two or more data sources.
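The headline figures in this paragraph follow directly from the counts given. As a check on the arithmetic only, the short Python sketch below recomputes the crude prevalence and the share of known cases captured by each source; the variable names are illustrative, and every number is taken from the paragraph above.

```python
POPULATION = 391_274        # Tayside residents on 1 January 1996
KNOWN_DIABETES = 7_596      # patients identified by record linkage

SOURCE_COUNTS = {
    "encashed prescriptions": 5141,
    "hospital clinics": 4816,
    "mobile eye unit": 5484,
    "biochemistry database": 3648,
    "hospital discharges": 2563,
}

# Crude prevalence as a percentage, to two decimal places.
prevalence_percent = round(100 * KNOWN_DIABETES / POPULATION, 2)  # 1.94

# Percentage of known cases captured by each source, to the nearest whole number.
source_percent = {
    name: round(100 * count / KNOWN_DIABETES)
    for name, count in SOURCE_COUNTS.items()
}
```

The rounded shares for prescriptions, clinics, the eye unit, and biochemistry (68%, 63%, 72%, and 48%) agree with those quoted in the abstract.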

Table 1

Number of Tayside residents with known diabetes who were identified independently by each data source used in electronic record linkage

Fig 1

Number of diabetic patients among residents of Tayside on 1 January 1996 by age and type of diabetes

In the validation study of eight randomly selected general practices, we identified 636 patients as having diabetes. Table 2 shows the number of diabetic patients identified by electronic record linkage and the number identified in the general practice registers. The sensitivity and positive predictive value of the electronic record linkage were 0.96 and 0.95 respectively, while the corresponding values for the general practice registers were 0.91 and 0.98. There was excellent agreement between the two approaches in ascertaining diabetes (κ=0.96).

Table 2

Ascertainment of cases of diabetes by electronic record linkage and by general practice registers in a random sample of eight general practices in Tayside*

Because not all the data sources used by the electronic record linkage are available in other districts, we calculated the sensitivity and positive predictive value of each data source, both independently and in combination. For example, 68% of all patients with diabetes had encashed prescriptions for diabetes related drugs or monitoring equipment. Table 3 shows the proportions of patients identified by individual data sources and by combining data sources.
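A district with access to only some of these sources could evaluate each source, and each combination, against a validated list of cases in the way sketched below. This Python example is an assumption about how such a calculation might be organised, not the study's own code; the function names, source names, and identifiers are made up.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def sensitivity_and_ppv(flagged: Set[str], confirmed: Set[str]) -> Tuple[float, float]:
    """Sensitivity and positive predictive value of a set of flagged patients."""
    true_positives = len(flagged & confirmed)
    sens = true_positives / len(confirmed) if confirmed else 0.0
    ppv = true_positives / len(flagged) if flagged else 0.0
    return sens, ppv


def evaluate_combinations(
    sources: Dict[str, Set[str]], confirmed: Set[str], size: int
) -> Dict[Tuple[str, ...], Tuple[float, float]]:
    """Evaluate every combination of `size` sources, pooling their patients."""
    results = {}
    for names in combinations(sorted(sources), size):
        flagged = set().union(*(sources[name] for name in names))
        results[names] = sensitivity_and_ppv(flagged, confirmed)
    return results


# Toy data: "X" is a patient flagged by prescriptions but not confirmed diabetic.
confirmed_cases = {"A", "B", "C", "D"}
toy_sources = {"clinics": {"A", "B"}, "prescriptions": {"B", "C", "X"}}
singles = evaluate_combinations(toy_sources, confirmed_cases, size=1)
pairs = evaluate_combinations(toy_sources, confirmed_cases, size=2)
```

In the toy data each source alone finds half of the confirmed cases, while the two combined find three quarters, which illustrates why pooling sources raises sensitivity even when it admits some false positives.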

Table 3

Sensitivity and positive predictive value of each data source used in electronic record linkage for ascertaining cases of diabetes

In addition to patients with definite diabetes, the electronic record linkage identified 47 patients with gestational diabetes on the cut off date and 701 patients with stress hyperglycaemia since the start of the study.


Discussion

It has been recommended that regional diabetes registers are established in the United Kingdom to facilitate systematic, population based monitoring of outcomes of diabetes and to ensure that diabetes care is effective, efficient, and equitable.18 The task of developing such registers, especially in inner city areas, has proved difficult,19 with problems with case ascertainment, patient migration, and difficulty in identifying patients treated by diet alone.20 If the objectives of the St Vincent declaration are to be achieved,1 the challenge is to devise robust strategies for the identification of all known diabetic patients in the community. The aim of this study was to evaluate a unique and more sensitive method for detecting cases of diabetes.

Cross sectional data of hospitalisation for diabetes substantially underestimate the incidence and impact of diabetes,21 and, as diabetes care is often performed exclusively in general practice, it has been assumed that comprehensive data are more likely to be obtained from primary care rather than the secondary sector.4 The most popular method of case identification has therefore been to aggregate general practice records of diabetic patients in the community. This strategy has been adopted in several localities; for example, 2236 patients in North Tyneside,5 4313 patients in Middlesbrough,6 5200 patients in Sheffield,21 and 2574 patients in Tunbridge Wells,22 yielding prevalences ranging from 1.18% to 1.5%. The combination of general practice records with hospital records is an alternative approach that has been adopted in some districts.7 An entirely different strategy is to use information technology to electronically link centrally held data. The relative merits of these approaches are open to debate.

Electronic record linkage

Our results show that record linkage of eight independent data sources is a robust method for identifying all known diabetic patients in a community, with a sensitivity and positive predictive value of 0.96 and 0.95 respectively and yielding a point prevalence of diabetes of 1.94%. This prevalence is significantly higher than those previously reported5 7 19 22 23 and may reflect the fact that general practice lists of patients with diabetes are not comprehensive. This was confirmed when we validated the electronic record linkage against a random sample of eight general practices and found the sensitivity of the general practice registers to be 0.91.

The ascertainment of diabetes by electronic record linkage was maximised because of the unique integration of multiple sources of data to create a patient specific information system. In this study such record linkage has a number of strengths. Firstly, the absolute specificity of hypoglycaemic drugs increases the completeness of ascertainment of diabetes. Secondly, by using the diabetes prescriptions database of the Medicines Monitoring Unit to identify those diabetic patients treated by diet alone who are prescribed glucose monitoring equipment but no drugs, we enhanced the capture of diabetes controlled by diet. Thirdly, biochemistry data further enhanced the capture of diabetes controlled by diet. Fourthly, the inhabitants of Tayside live in a well defined geographical area of both rural and inner city communities with a low rate of migration: for example, only 5% of nearly 4000 patients taking cimetidine were lost to follow up over five years in a previous study.24 Finally, all components of the database in the DARTS study are regularly updated, allowing a continuously updated and dynamic database.

To the best of our knowledge, this study is the first validated dynamic register of diabetes in the United Kingdom. It shows how clinical information can be harnessed electronically and exploited for the benefit of patients.

Requirements for creating a diabetes register

We believe our approach to the creation of a diabetes register to be efficient, requiring 12 months of a full time computer programmer and research nurse. Fundamental to the success of the study is the unique patient identifier, the community health number, for record linkage. Although a community health number is assigned to all residents of Scotland, Tayside is the only Scottish health board that uses it comprehensively to allow easy integration of all the data sources that the DARTS study uses. This study thus highlights the value of a unique patient identifier, which is soon to be introduced in England and Wales. Importantly, the DARTS study database conforms to the Scottish Intercollegiate Guidelines Network minimum recommended data set for data collection in diabetic patients.25

Comparison with other studies

The only other study to evaluate the performance of general practice records, hospital records, and data on consumption of antidiabetic drugs was performed in 4674 diabetic patients in Islington, London.18 Despite being population based, data collection was incomplete as only 52% of local general practices provided registers of diabetic patients and data on drug prescriptions were available for only 28% of practices. A major concern highlighted by this earlier study was that reliance on general practice registers and hospital records alone may result in 18-40% of cases of diabetes being missed.18 The sensitivities of practice registers, prescription returns, and hospital clinics in detecting diabetes were reported to be 0.62, 0.68, and 0.40 respectively. Our study has the advantage of having complete population data for encashment of prescriptions. Our corresponding sensitivities for prescription encashment and hospital records were 0.69 and 0.63 respectively.

Identifying undiagnosed diabetes

In addition to the 7596 patients with known diabetes, electronic record linkage identified 701 patients with a recent history of hyperglycaemia who were not recognised to be diabetic by their general practitioners. It is well established that a substantial number of cases remain undiagnosed, and surveys in the United Kingdom and United States suggest that the ratio of undiagnosed to diagnosed cases may be as high as 1:1.26 27 As the United Kingdom prospective diabetes study has shown that up to 30% of newly diagnosed patients with non-insulin dependent diabetes have evidence of microvascular or macrovascular complications at presentation,28 selective screening has been advocated in high risk groups.3 We propose to evaluate the effectiveness of screening for non-insulin dependent diabetes in the group of patients with a history of isolated hyperglycaemia.

Implications for creating a regional diabetes register

Since the data supplied by the Medicines Monitoring Unit are registered under the Data Protection Act for research use only, the DARTS study database in its current form is used solely for anonymised research and audit. The 14 general practitioner and physician members of the DARTS study steering group are custodians of the data, and no external agency has access to the data. This is perhaps a major strength of the system, as it allows non-threatening participation in audit and research by Tayside doctors. A key aspect for the acceptance of a regional diabetes register is for patients and healthcare professionals to have total confidence in the confidentiality and restricted use of the information. The database of the DARTS study cannot be considered for use as a regional register until the issue of patient confidentiality is addressed, which is a major hurdle to be overcome. One possible solution is the sharing of all diabetes specific information about each diabetic patient with that person on an annual basis to allow informed consent to be obtained. To foster collaborative links and reduce perceived threats to independence and change, every general practitioner and practice nurse in Tayside was invited to a series of meetings to canvass opinion and support for the DARTS study.

If the targets of the St Vincent declaration are to be achieved, accurate, population based monitoring of the status of diabetic patients is required. The DARTS study shows how electronic sources of data can be used to create a district diabetes register. The principles underlying this study are applicable elsewhere, and we are currently creating software that could be used elsewhere to link electronic data sources that are specific for diabetes.


Details of the DARTS study can be found on the Medicines Monitoring Unit (MEMO) web page at

We thank the general practitioners of Tayside and especially all members of the DARTS Steering Group: Dr A Connacher, Perth Royal Infirmary; Dr D Dunbar, general practitioner, Perth; Dr A Dutton, general practitioner, Perth; Dr A Emslie-Smith, general practitioner, Dundee; Dr S Greene, Ninewells Hospital, Dundee; Dr P Slane, general practitioner, Dundee; Dr B Kilgallon, general practitioner, Muirhead of Liff; Dr G Leese, Ninewells Hospital, Dundee; Dr A McKendrick, general practitioner, Carnoustie; Dr S Sawers, Stracathro Hospital, Brechin; and Dr A Young, general practitioner, Alyth.

Funding: The DARTS study is supported by the Scottish Home and Health Department, the Wellcome Trust, and the Robertson Trust. The Medicines Monitoring Unit (MEMO) is supported by the United Kingdom Medicines Control Agency.

Conflict of interest: None.

