CC BY-NC Open access
Research

Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study

BMJ 2013; 346 doi: http://dx.doi.org/10.1136/bmj.f2350 (Published 21 May 2013) Cite this as: BMJ 2013;346:f2350
  1. Emily Herrett, research fellow (1),
  2. Anoop Dinesh Shah, clinical research fellow (2),
  3. Rachael Boggon, research statistician (3, 4),
  4. Spiros Denaxas, senior research associate (2),
  5. Liam Smeeth, professor of clinical epidemiology and general practitioner (1),
  6. Tjeerd van Staa, professor of pharmacoepidemiology (1, 3, 4),
  7. Adam Timmis, professor of clinical cardiology (5),
  8. Harry Hemingway, professor of clinical epidemiology (2)
  1. London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
  2. Department of Epidemiology and Public Health, Clinical Epidemiology Group, University College London, UK
  3. Clinical Practice Research Datalink Group, Medicines and Healthcare products Regulatory Agency, London, UK
  4. Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands
  5. Barts and the London School of Medicine and Dentistry, London, UK
  Correspondence to: E Herrett emily.herrett{at}lshtm.ac.uk
  • Accepted 26 March 2013

Abstract

Objective To determine the completeness and diagnostic validity of myocardial infarction recording across four national health record sources in primary care, hospital care, a disease registry, and mortality register.

Design Cohort study.

Participants 21 482 patients with acute myocardial infarction in England between January 2003 and March 2009, identified in four prospectively collected, linked electronic health record sources: Clinical Practice Research Datalink (primary care data), Hospital Episode Statistics (hospital admissions), the disease registry MINAP (Myocardial Ischaemia National Audit Project), and the Office for National Statistics mortality register (cause specific mortality data).

Setting One country (England) with one health system (the National Health Service).

Main outcome measures Recording of acute myocardial infarction, incidence, all cause mortality within one year of acute myocardial infarction, and diagnostic validity of acute myocardial infarction compared with electrocardiographic and troponin findings in the disease registry (gold standard).

Results Risk factors and non-cardiovascular coexisting conditions were similar across patients identified in primary care, hospital admission, and registry sources. Immediate all cause mortality was highest among patients with acute myocardial infarction recorded in primary care, which (unlike hospital admission and disease registry sources) included patients who did not reach hospital, but at one year mortality rates in cohorts from each source were similar. 5561 (31.0%) patients with non-fatal acute myocardial infarction were recorded in all three sources and 11 482 (63.9%) in at least two sources. The crude incidence of acute myocardial infarction was underestimated by 25-50% using one source compared with using all three sources. Compared with acute myocardial infarction defined in the disease registry, the positive predictive value of acute myocardial infarction recorded in primary care was 92.2% (95% confidence interval 91.6% to 92.8%) and in hospital admissions was 91.5% (90.8% to 92.1%).

Conclusion Each data source missed a substantial proportion (25-50%) of myocardial infarction events. Failure to use linked electronic health records from primary care, hospital care, disease registry, and death certificates may lead to biased estimates of the incidence and outcome of myocardial infarction.

Trial registration NCT01569139 clinicaltrials.gov.

Introduction

Electronic health records inform patient decision making and policy and are increasingly used to define disease and the outcomes of care in observational cohorts of genetic and environmental factors1 2 3 4 and randomised trials.5 6 7 Recent initiatives to expand the use of health records for research have been announced in many countries,8 9 10 11 and in the United Kingdom the National Health Service is now legally required to evaluate patient outcomes.12 The UK government has also recently announced plans to drive improvement in cardiovascular disease care through use of information in linked health records.13 In various settings across the world these initiatives are being met by the linkage of electronic health records from disparate sources. Underpinning these uses of electronic health records is the need for a better understanding of the quality of data within a single source as well as between multiple sources. Indeed, it is a concern that electronic records from one part of the health system, such as primary care, may not capture health events occurring in other parts of the health system, such as hospital care.

As part of the CArdiovascular disease research using LInked Bespoke studies and Electronic health Records (CALIBER) programme14 we carried out new linkage between records from primary care,15 hospitals, an acute coronary syndrome registry,16 and death certificates. Although data from these types of source are increasingly available in different countries,17 18 for acute myocardial infarction the overlap between these four electronic health record sources, patient risk factors, and subsequent mortality have not been compared. Previous cross referencing studies have typically compared one or two electronic sources, such as coded hospital discharge diagnoses and cause of death, with case note review,19 20 21 questionnaires to general practitioners,22 or active case finding in a prospective consented study1 23 24 (see supplementary table 1). Linkages with the national, ongoing acute coronary syndrome registry allowed detailed diagnoses of myocardial infarction (with coded electrocardiographic findings and markers of myocardial necrosis, not available in other sources) to be compared with diagnoses in primary care and hospital admissions. Linkages with primary care allowed evaluation of risk factors in patients with a record of acute myocardial infarction in any source. Linkages with the death record allowed evaluation of cause specific mortality of myocardial infarction recorded in any source, including among cases not admitted to hospital.

We compared the incidence, recording, agreement of dates and codes, risk factors, and all cause mortality of acute myocardial infarction recorded in primary care, hospital care, the national acute coronary syndrome registry, and the national death registry.

Methods

We used a cohort study design, identifying patients with acute myocardial infarction in four prospectively collected, linked electronic health record sources in England (the CALIBER programme14). Briefly, the CALIBER linkage included anonymised primary care electronic patient records from the Clinical Practice Research Datalink15 (www.cprd.com, formerly known as the General Practice Research Database), data on hospital admissions from Hospital Episode Statistics, the national registry of acute coronary syndromes (Myocardial Ischaemia National Audit Project, MINAP),16 and the death registry, curated by the Office for National Statistics (see supplementary table 2).

Of the 630 primary care practices in Clinical Practice Research Datalink, 244 consented to data linkage with Hospital Episode Statistics, MINAP, and the Office for National Statistics. These practices contained 3.9% of the population of England in 2006. The linkage was carried out in October 2010 by a trusted third party, using a deterministic match between NHS number, date of birth, and sex. Overall, 96% of patients with a valid NHS number were successfully matched.

Study population: patients with acute myocardial infarction

We identified records of acute myocardial infarction with reference to previously described definitions for each source. In primary care, diagnoses are recorded using Read codes25 and previous studies have published lists of Read codes used to identify acute myocardial infarction.26 27 We identified myocardial infarction using the 62 Read codes listed in supplementary table 3. In Hospital Episode Statistics and the Office for National Statistics death registry, diagnoses are coded using the International Classification of Diseases, 10th revision28; in common with previous studies,29 we defined acute myocardial infarction by ICD-10 codes I21 (acute myocardial infarction), I22 (subsequent myocardial infarction), or I23 (current complications following acute myocardial infarction). In Hospital Episode Statistics, to be included in our study myocardial infarction had to be recorded as the primary diagnosis in the first episode of an admission to hospital (where the first episode refers to the first period of care for an admitted patient overseen by a healthcare professional30). We performed a sensitivity analysis to assess the influence of inclusion of secondary diagnoses. In MINAP, ST elevation and non-ST elevation myocardial infarction were identified using hospital discharge diagnosis, markers of myocardial necrosis, and coded electrocardiographic findings, in accordance with the internationally agreed definition of myocardial infarction.31 For MINAP and Hospital Episode Statistics, we took the hospital admission date to represent the date of acute myocardial infarction.
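
The ICD-10 case definition described above can be sketched in a few lines. This is only an illustration, not the study's own code: the function name and the assumption that codes may arrive with or without a dot and in mixed case are hypothetical.

```python
# Illustrative sketch only; not the study's actual code.
AMI_CATEGORIES = {"I21", "I22", "I23"}  # ICD-10 categories named in the text


def is_acute_mi(icd10_code: str) -> bool:
    """Return True if the code falls in ICD-10 category I21, I22, or I23."""
    # Assumption: codes may arrive as "I21.0", "I210", or "i21"; normalise first.
    normalised = icd10_code.strip().upper().replace(".", "")
    return normalised[:3] in AMI_CATEGORIES
```

Under this sketch a code such as I21.0 would be counted, whereas I25.2 (old myocardial infarction) would not, because only the three-character category is compared.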

The study period was 1 January 2003 to 31 March 2009 (when all record sources were concurrent) and was confined to patients who had been registered with their general practice for at least one year and whose practice had been submitting data for at least one year that met Clinical Practice Research Datalink data quality standards for continuity and plausibility of data recording. In the main analysis we included only patients with at least one record of admission to hospital in Hospital Episode Statistics at any time (for any cause) as these patients were shown to be linkable, but we conducted a sensitivity analysis including all patients. We selected the first record of myocardial infarction during the patient’s study period as the index event and considered myocardial infarction records in the other data sources as representing the same event if they were dated within 30 days of the index event.

Cardiovascular risk factors and non-cardiovascular coexisting conditions

For patients with acute myocardial infarction we identified risk factors recorded in primary care, including age, sex, social deprivation,32 smoking, use of antihypertensives or lipid lowering drugs, diabetes mellitus, Charlson comorbidity index,33 and primary care consultation rate before the event. We used mean measures of systolic blood pressure and total cholesterol and high density lipoprotein cholesterol levels before myocardial infarction along with age, sex, and smoking status (where these variables were present) to estimate the 10 year Framingham risk for acute myocardial infarction or coronary death.34 We used these measures to compare the cohorts of myocardial infarction identified in each data source.

Follow-up for mortality

We followed all patients with a record of myocardial infarction in any source for one year for death as recorded in the Office for National Statistics death registry. We categorised patients as having fatal or non-fatal myocardial infarction by whether they died of any cause within seven days of the myocardial infarction. If a patient had a myocardial infarction record in Clinical Practice Research Datalink, Hospital Episode Statistics, or MINAP after their date of death, we considered that they died on the day of their myocardial infarction.
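
The classification rule above might be expressed as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, and treating a death on day seven as "within seven days" is an assumption, since the text does not say whether the boundary is inclusive.

```python
from datetime import date
from typing import Optional


def classify_mi(mi_date: date, death_date: Optional[date]) -> str:
    """Label an index event 'fatal' or 'non-fatal' using the seven day rule."""
    if death_date is None:
        return "non-fatal"
    # Rule from the text: a record dated after death is treated as
    # occurring on the day of death, so the interval is floored at zero.
    days_to_death = max((death_date - mi_date).days, 0)
    # Assumption: day seven itself counts as "within seven days".
    return "fatal" if days_to_death <= 7 else "non-fatal"
```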

Agreement in recording

If the time difference between the earliest date of acute myocardial infarction in one source and the date in another source was no more than 30 days we considered that the records of acute myocardial infarction in the different sources agreed. A myocardial infarction recorded more than 30 days after the earliest date was considered a new event and was not included, ensuring that each patient appeared only once in the analysis. We chose 30 days to account for any delay in recording of myocardial infarction in primary care, assuming that any record within 30 days of a hospital admission was likely to represent the same event and anything after 30 days could feasibly be a subsequent myocardial infarction. We carried out a sensitivity analysis using a 90 day threshold.
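
The agreement rule can be illustrated with a short sketch. The source labels used as dictionary keys and the function name are hypothetical; the 30 day default and the 90 day alternative follow the text.

```python
from datetime import date
from typing import Dict, Optional, Set


def agreeing_sources(first_mi_dates: Dict[str, Optional[date]],
                     window_days: int = 30) -> Set[str]:
    """Return the sources whose first MI record lies within the window of the index date."""
    recorded = {src: d for src, d in first_mi_dates.items() if d is not None}
    if not recorded:
        return set()
    # The earliest record across all sources is taken as the index event.
    index_date = min(recorded.values())
    return {src for src, d in recorded.items()
            if (d - index_date).days <= window_days}
```

For example, a hospital record on 1 March and a primary care record on 10 March would be treated as one event, whereas a primary care record on 1 May would fall outside the 30 day window but inside the 90 day window used in the sensitivity analysis.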

Statistical analysis

Incidence

We estimated population based incidence rates of fatal and non-fatal acute myocardial infarction using the denominator of all adults in the CALIBER primary care population aged 18 and over (2.2 million), followed up for a mean 4.1 years between 2003 and 2009. We used each of the data sources separately and together to identify incident myocardial infarction, ending the follow-up period for a patient on the date of their first myocardial infarction during the study period, death, or deregistration from the general practice.
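
A crude incidence rate of this kind is the number of first events divided by the person time at risk. The sketch below shows the arithmetic; the function name is hypothetical, and the confidence interval uses a normal approximation on the log rate, which is an assumption because the text does not state how the intervals were derived.

```python
import math


def incidence_per_100k(events: int, person_years: float, z: float = 1.96):
    """Crude rate per 100 000 person years with an approximate 95% confidence interval."""
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years
    # Assumption: normal approximation on the log rate (standard error 1/sqrt(events)).
    se_log = 1 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return tuple(round(x * 100_000, 1) for x in (rate, lower, upper))
```

With purely hypothetical inputs of 200 events over 100 000 person years, the function returns a point estimate of 200 per 100 000 person years together with its interval.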

Cardiovascular risk factors and non-cardiovascular coexisting conditions

We compared patients with fatal and non-fatal acute myocardial infarction identified in the four data sources for risk factors and coexisting conditions recorded in primary care.

Death after acute myocardial infarction

We produced cumulative incidence curves for coronary and non-coronary mortality for patients recorded in each data source and compared mortality using a Cox proportional hazards model adjusted for age and sex.

Agreement in recording

We would expect patients who survived seven days after myocardial infarction to be recorded in the primary care, disease registry, and hospital admissions sources, and we assessed agreement between these three sources in a Venn diagram. For patients who died within seven days, we examined the proportion recorded by each source but did not compare agreement across all four sources as we would not expect the hospital discharge data and disease registry to record patients who died before reaching hospital.

In patients who did not have a record of acute myocardial infarction in one or more data sources, we looked for other codes that may have been used to describe the event. In the disease registry, we looked for unstable angina or admission diagnoses of any acute coronary syndrome. In primary care and hospital discharge data, we sought other acute coronary syndromes, coronary disease, chest pain, or other cardiac diagnoses (for example, atrial fibrillation, heart failure, cardiac arrest). In primary care data we also examined codes indicating contact with secondary care. Where none of these codes was recorded, we tabulated all recorded codes in the 30 days before and after the date of myocardial infarction to see if there were any relevant codes we had overlooked.

We performed a logistic regression analysis to establish whether age, sex, deprivation, rate of primary care consultation, year of myocardial infarction, or mortality at 30 days explained suboptimal recording of acute myocardial infarction in primary care, hospital discharge, or disease registry sources.

We calculated the positive predictive value of primary care or hospital discharge diagnoses of acute myocardial infarction among patients who also had a record in the acute coronary syndrome registry. Data were analysed using Stata 12 and R 2.14.1.35

Results

We identified 21 482 patients with fatal or non-fatal acute myocardial infarction recorded in any of the four data sources.

Incidence

Among the single source crude estimates for incidence of myocardial infarction, primary care data (Clinical Practice Research Datalink) gave the highest estimate, of 187 per 100 000 patient years (95% confidence interval 184 to 190), followed by hospital discharge data (Hospital Episode Statistics) with 154 per 100 000 patient years (152 to 157), acute coronary syndrome registry (MINAP) with 115 per 100 000 patient years (113 to 118), and death registry with 45 per 100 000 patient years (43 to 46). Combining all four sources yielded an estimate of 243 per 100 000 patient years (239 to 246, fig 1). The crude incidence of acute myocardial infarction was 25% lower using only Clinical Practice Research Datalink and 50% lower using only MINAP compared with using all three sources. See supplementary figure 1 and table 4 for standardised incidence by age, sex, and region.

Fig 1 Crude incidence of acute fatal and non-fatal myocardial infarction estimated using different combinations of data from primary care (Clinical Practice Research Datalink), hospital admissions (Hospital Episode Statistics), disease registry (MINAP, Myocardial Ischaemia National Audit Project), and death registry (Office for National Statistics)
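
The shortfall of each single source relative to the combined estimate can be checked directly from the crude rates reported above. The sketch below is illustrative only; the function and variable names are hypothetical, and the short source labels (CPRD, HES, MINAP, ONS) are used simply as dictionary keys.

```python
def percent_lower(single_source_rate: float, combined_rate: float) -> float:
    """Percentage by which a single-source rate falls below the combined rate."""
    return round(100 * (1 - single_source_rate / combined_rate), 1)


# Crude rates per 100 000 patient years as reported in the text.
reported = {"CPRD": 187, "HES": 154, "MINAP": 115, "ONS": 45}
combined = 243
shortfall = {source: percent_lower(rate, combined)
             for source, rate in reported.items()}
```

On these crude figures the shortfall is about 23% for primary care and about 53% for the disease registry, which corresponds to the rounded 25% and 50% quoted in the text.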

Cardiovascular risk factors and comorbidity

Overall, the cohorts identified from the primary care, hospital, and disease registry sources had a similar prevalence of cardiovascular risk factors and comorbidities. However, compared with those recorded in the acute coronary syndrome registry or hospital discharge data only, patients with fatal or non-fatal myocardial infarction recorded only in primary care were on average two years younger and more likely to be current smokers and in the most deprived fifth (P<0.001 for these comparisons, also see supplementary table 5). Patients recorded by the death registry were older than patients recorded in the other sources and had a higher burden of risk factors reflecting their age. However, other demographic characteristics and cardiovascular risk factors were broadly similar across patients recorded in primary care and hospital care sources (table 1, also see supplementary table 5).

Table 1

Recording of risk factors in primary care before myocardial infarction recorded in primary care, hospital admission, disease registry, and death registry sources from 1 January 2003 to 31 March 2009

Death after acute myocardial infarction

Patients with myocardial infarction identified in the disease registry had lower crude 30 day mortality (10.8%, 95% confidence interval 10.2% to 11.4%) than those identified in hospital care (13.9%, 13.3% to 14.4%) or in primary care (14.9%, 14.4% to 15.5%, fig 2). At one year, however, mortality was similar in all three groups, at around 20%.

Fig 2 Kaplan Meier curves showing all cause mortality, stratified by record source in 20 819 patients: Clinical Practice Research Datalink (n=15 819), Hospital Episode Statistics (n=13 831), Myocardial Ischaemia National Audit Project (MINAP) (n=10 351). Myocardial infarctions recorded by the Office for National Statistics are not shown as they are by definition fatal on the date of myocardial infarction

In the linked data, patients with acute myocardial infarction recorded in only one source had higher mortality than those recorded in more than one source (age and sex adjusted hazard ratio 2.29, 95% confidence interval 2.17 to 2.42; P<0.001). Among patients with myocardial infarction recorded in only one source (Hospital Episode Statistics, Clinical Practice Research Datalink, or MINAP), those recorded only in primary care had the highest mortality on the first day but the lowest mortality thereafter (see supplementary figures 2 and 3). Among patients with myocardial infarctions recorded in one of Hospital Episode Statistics or MINAP but not both, those in MINAP had lower coronary mortality in the first month (age and sex adjusted hazard ratio 0.33, 0.28 to 0.39, P<0.001) but similar mortality for non-coronary events (1.12, 0.90 to 1.40, P=0.3). After the first month, patients with myocardial infarctions recorded only in primary care had about half the hazard of mortality of patients with myocardial infarctions recorded in one of MINAP or Hospital Episode Statistics (hazard ratio adjusted for age and sex for coronary causes 0.49, 95% confidence interval 0.40 to 0.60, P<0.001 and for other causes 0.57, 0.49 to 0.67, P<0.001). Of the 3518 patients with myocardial infarction recorded in any of the four sources who died of any cause within seven days, 54.4% (n=1914) had a myocardial infarction code recorded in primary care within 30 days. The underlying cause of death was acute myocardial infarction in 2919 patients (83.0%); a further 164 patients (4.7%) had ischaemic heart disease recorded as the underlying cause of death (ICD-10 code I20, I24, or I25), 60 (1.7%) had cerebrovascular disease (I60-I69), and 85 (2.4%) had respiratory disease (J00-J99). However, 3375 of these 3518 patients (95.9%) had a coronary diagnosis (I20-I25) as either the underlying cause or a secondary cause of death.

Fatal myocardial infarctions identified by death registry data (underlying cause of death ICD-10 I21, I22, or I23, n=2919) were unlikely to be recorded in hospital sources; 36.7% (n=1072) were recorded in Hospital Episode Statistics and just 17.1% (n=498) in the MINAP disease registry within 30 days, but 55.9% (n=1631) were recorded in primary care (see supplementary table 6).

Non-fatal acute myocardial infarction: agreement between record sources

Among the 17 964 patients with at least one record of non-fatal acute myocardial infarction, 13 380 (74.5%) were recorded by Clinical Practice Research Datalink, 12 189 (67.9%) by Hospital Episode Statistics, and 9438 (52.5%) by MINAP. Overall, 5561 (31.0%) of patients had the event recorded in all three sources and 11 482 (63.9%) in at least two sources (fig 3). When we extended the recording window from 30 days to 90 days, the proportion recorded in all three sources increased only slightly, to 32.0% (n=5747). When we included patients who had never had a record of a hospital admission in Hospital Episode Statistics, the proportion of non-fatal myocardial infarctions recorded in all three sources decreased slightly to 30.0% (5561/18 536) and the proportion recorded only in primary care increased from 17.7% (3188/17 964) to 20.3% (3760/18 536). A sensitivity analysis in which the Hospital Episode Statistics case definition included secondary diagnoses of myocardial infarction (where myocardial infarction was not the reason for admission) produced only a slight increase in the proportion recorded in all three sources (5812/18 283, 31.8%), and identified 306 additional myocardial infarctions that were not in any other source.

Fig 3 Number and percentage of records recorded in primary care (Clinical Practice Research Datalink), hospital care (Hospital Episode Statistics), and disease registry (Myocardial Ischaemia National Audit Project) for non-fatal myocardial infarction across the three sources (n=17 964 patients)
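
The overlap figures quoted for the three sources are tied together by inclusion-exclusion, which the following sketch makes explicit. The function name and the returned keys are hypothetical; the input counts are those reported in the text.

```python
def venn_summary(source_totals, in_all_three, in_any):
    """Derive overlap counts for three sources by inclusion-exclusion."""
    # |A or B or C| = sum of singles - sum of pairwise overlaps + triple overlap,
    # so the sum of the pairwise overlaps is:
    pairwise_sum = sum(source_totals) - in_any + in_all_three
    # Each patient recorded in all three sources is counted in all three
    # pairwise overlaps; subtract two of those counts to count them once.
    at_least_two = pairwise_sum - 2 * in_all_three
    return {"at_least_two": at_least_two,
            "exactly_one": in_any - at_least_two}


summary = venn_summary([13380, 12189, 9438], in_all_three=5561, in_any=17964)
```

With the reported totals this gives 11 482 patients recorded in at least two sources, matching the figure in the text, and implies that 6482 were recorded in only one source.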

The exact date of admission agreed in over 80% of 6851 patients with acute myocardial infarction recorded in both hospital care and disease registry sources (see supplementary figure 4), but the date recorded in primary care was the same as the disease registry admission date or hospital admission date for only 50% of patients (n=15 753). There was a smaller peak in primary care recording between five and seven days after admission. When the time window was extended to 90 days, there was little change in these proportions.

Among patients with non-fatal myocardial infarction, 88.0% (8304/9438) of those recorded in MINAP and 89.1% (10 859/12 189) recorded in Hospital Episode Statistics had a Read code for any cardiac diagnosis or chest pain within 30 days in primary care, and in over 70% the Read code stated myocardial infarction (see supplementary table 6). Only 25.1% (3364/13 380) of the non-fatal myocardial infarctions recorded in primary care stated the type—that is, ST elevation or non-ST elevation—compared with 100% for the disease registry. If a non-fatal myocardial infarction was recorded in primary care, hospital discharge data recorded a cardiac diagnosis within 30 days in 84.9% (11 355/13 380) of patients, with a primary diagnosis of myocardial infarction in 72.6% (9720/13 380). However, this proportion varied depending on the Read term used to identify myocardial infarction in primary care; for terms that state the anatomical location (for example, acute anterolateral infarction) it was around 80% but was lower for less precise terms. For example, of the 74 patients with the Read term “heart attack,” only 32 (43%) had a primary hospital diagnosis of myocardial infarction. Supplementary tables 7-9 describe the agreement between sources according to the way in which acute myocardial infarction was recorded in each source.

Positive predictive value

For primary care or hospital discharge patients with an associated record in the disease registry (MINAP), the positive predictive value of the acute myocardial infarction diagnosis (the probability that the diagnosis recorded in the disease registry was myocardial infarction rather than unstable angina or a non-cardiac diagnosis) was 92.2% (6660/7224, 95% confidence interval 91.6% to 92.8%) in primary care and 91.5% (6851/7489, 90.8% to 92.1%) in hospital care (table 2). Eighty five per cent of patients recorded in primary care or hospital discharge (7386/8707) had a record of raised cardiac markers and 43.3% (3766/8707) had a record of ST segment elevation on electrocardiography.
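
The positive predictive values above are simple proportions, and their intervals can be approximated as in the sketch below. The function name is hypothetical, and the Wald interval is an assumption: the text does not state which method was used to derive the confidence intervals.

```python
import math


def ppv_with_ci(true_positives: int, all_positives: int, z: float = 1.96):
    """Positive predictive value (%) with a Wald 95% confidence interval."""
    p = true_positives / all_positives
    # Assumption: simple Wald interval, p +/- z * sqrt(p(1-p)/n).
    half_width = z * math.sqrt(p * (1 - p) / all_positives)
    return tuple(round(100 * x, 1) for x in (p, p - half_width, p + half_width))


primary_care = ppv_with_ci(6660, 7224)
hospital = ppv_with_ci(6851, 7489)
```

Applied to the reported counts, this approximation gives 92.2% (91.6% to 92.8%) for primary care and 91.5% (90.8% to 92.1%) for hospital care, in agreement with the values in the text.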

Table 2

Information recorded in disease registry (MINAP) within 30 days for non-fatal myocardial infarction recorded in primary care (Clinical Practice Research Datalink, CPRD) or hospital admissions (Hospital Episode Statistics, HES). Values are numbers (percentages) unless stated otherwise

Non-fatal acute myocardial infarction: reasons for disagreement

Compared with patients who had a record of acute myocardial infarction in only one source, those with records in multiple sources had a lower rate of primary care consultation before the event, were younger, were more likely to be male, and more likely to have experienced acute myocardial infarction in one of the later years of data collection. Among patients with myocardial infarction recorded in primary care, an additional record in Hospital Episode Statistics or MINAP was strongly associated with increased mortality at 30 days (see supplementary table 10).

Discussion

We compared electronic health records on one major disease event—acute myocardial infarction—across four English, ongoing sources of health record data: primary care (Clinical Practice Research Datalink), hospital admissions (Hospital Episode Statistics), a quality improvement disease registry (Myocardial Ischaemia National Audit Project, MINAP), and the death registry (Office for National Statistics). In over 20 000 patients each data source missed a substantial proportion of myocardial infarction events. We also found evidence for the validity of myocardial infarction recording across all sources, in terms of risk factor profiles and mortality at one year. Taken together, these findings support the wider use of linkage of multiple record sources by clinicians, policy makers, and researchers.

Fatal myocardial infarction

Both primary care and death registry data can be used to capture fatal myocardial infarction occurring out of hospital among people without a record of myocardial infarction in the Hospital Episode Statistics or disease registry (MINAP). The death registry is a useful source of fatal acute myocardial infarction for research, as most (83.0%) patients who were identified as having acute myocardial infarction in any of the data sources and died within seven days had myocardial infarction recorded as their underlying cause of death. These figures agree with results from the Oxford Record Linkage Study, where among 5686 patients admitted to hospital with myocardial infarction 85.2% who died within 30 days had myocardial infarction recorded as the underlying cause of death.36

Non-fatal myocardial infarction

Primary care captures most cases, but all sources miss non-fatal myocardial infarction

We found that each record source misses cases. Only one third of non-fatal myocardial infarctions were recorded in all three data sources (primary care, hospital admissions, and disease registry) and two thirds were recorded in at least two sources. Clinical Practice Research Datalink was the single most complete source of non-fatal myocardial infarction records (one quarter of all non-fatal myocardial infarction events not recorded), Hospital Episode Statistics missed one third, and MINAP missed nearly half (fig 3). This agrees with the results of other studies; a two source study of myocardial infarction in Scotland (see supplementary table 1) compared the incidence based on primary care records with that based on hospital data and showed that in combination they provided the highest estimates of incidence.37 Further two source comparisons in Australia,38 Denmark,39 and the Netherlands24 (see supplementary table 1) have shown that hospital records alone underestimate the true incidence of myocardial infarction. Despite the low sensitivity of these data sources, in our study the positive predictive value of myocardial infarction records in primary care and hospital admission sources was over 90% compared with the disease registry gold standard based on the international definition of myocardial infarction (table 2).

However, some of the myocardial infarctions recorded only in primary care are likely to be historical diagnoses because the associated mortality rate in the first month is much lower than that for myocardial infarctions also recorded in hospital sources (see supplementary figure 2). Our results using cross referencing of electronic health records in 20 000 patients are consistent with previous manual approaches to validation in primary care, which cross reference a few hundred patients with disease diagnoses recorded using Read codes against anonymised free text, death certificates, paper medical records, hospital discharge summaries, or questionnaires to general practitioners.40 41 42 43 44 45 46 47 48 Our much larger sample size, however, allowed us to evaluate individual Read terms that are used to record myocardial infarction. This type of validation has not been done previously for myocardial infarction and may be relevant to other common conditions that can be recorded using a variety of codes, such as stroke.49

Hospital admission data

To our knowledge, no studies have examined the positive predictive value of ICD-10 coded myocardial infarction diagnosis in hospital admission data against an ongoing disease registry. We found that 62.8% of non-fatal myocardial infarctions recorded in primary care and 72.6% recorded in the disease registry were recorded by hospital admissions data. This is consistent with a single electronic health record source (Hospital Episode Statistics ICD-10 I21 and I22) capturing 53% of myocardial infarctions in an investigator led cohort with active follow-up.50

Disease registry and maximising true positives

The strengths of the disease registry MINAP lie in its diagnostic records (troponin values, electrocardiographic findings, and cardiologist diagnosis of ST elevation and non-ST elevation myocardial infarction), which are not available in other sources and which offer validated endpoints electronically from all hospitals in England and Wales. An acute myocardial infarction recorded in MINAP is thus an electronic health record gold standard, as a myocardial infarction recorded by a registry is likely to fulfil international diagnostic criteria.16 The registry may be important for detecting endpoints in cohort studies and trials, where false positives can dilute any observed effect and reduce the power of a study. Furthermore, in such studies it has been shown that avoiding false positives is more important than avoiding false negatives.51 Validation of myocardial infarctions recorded by primary care and hospital admissions against those recorded by the disease registry showed a positive predictive value of over 90%, making them suitable for detecting endpoints in cohort studies and trials, where poor endpoint resolution can dilute any observed effect and reduce the power of a study.51 The positive predictive value was not 100% because some myocardial infarction records in primary care may actually have been related to unstable angina or chest pain of an unknown cause.

Limitations of this study

Our data were from a sample of 244 English general practices contributing to the Clinical Practice Research Datalink. However, the primary care patients included in this CALIBER study are representative of the UK population. Furthermore, patients in practices that participated in the linkage were representative of the Clinical Practice Research Datalink as a whole in terms of age, social deprivation, body mass index, and prescription of key drugs.52 Hospital admissions with linked primary care data were also representative of all admissions to hospital in England in terms of the distribution of age, sex, and diagnostic group.53 The UK life science strategy aims to increase the proportion of the UK population with primary care data available for linked electronic health record research through the Clinical Practice Research Datalink.15 A second limitation concerns the generalisability of our findings for the quality of primary care data. Practices contributing data to the Clinical Practice Research Datalink are advised of recording guidelines and their data are accepted only when they meet standards of data completeness,15 so they are likely to record disease events better than general practices that do not contribute. Our estimates of agreement from this study may therefore be higher than for practices that do not contribute to the Clinical Practice Research Datalink. Thirdly, our validation of Hospital Episode Statistics and Clinical Practice Research Datalink myocardial infarctions against MINAP was (inherently) limited to the subset of patients with a MINAP record, and caution must be exercised in extending these conclusions to patients with myocardial infarctions without a MINAP record.

Clinical and policy implications

With the current emphasis on measuring clinical outcomes in health systems and recent plans to use linked data to drive improvements in the care of patients with cardiovascular disease,4 13 our study has important implications for practice and policy. Firstly, we propose much wider use of linked record sources in commissioning and in research to estimate disease occurrence and outcome, because of the biases inherent in using only one source of records. Changing the estimates for incidence of myocardial infarction could potentially alter the modelled effect of population based healthcare interventions. Our findings underscore the importance of international initiatives to accelerate availability of linked data in America,11 18 54 55 in Europe,56 and elsewhere. Secondly, a national strategy for biomedical informatics is required to tackle manifest system failings: a single health event should, ideally, have a single record that is propagated in multiple record systems. Efforts to reform the process of death certification are already under way,57 but these need to be broadened to include other health records. Thirdly, and more specifically, primary care records could be improved if the admission date rather than the discharge date were used to record the myocardial infarction (as reflected by the current “tail” of myocardial infarction records recorded up to 20 days after admission; see supplementary figure 4) and if acute myocardial infarction were recorded only for its occurrence rather than as repeated entries for consultations related to a history of myocardial infarction. Fourthly, disease registries, such as MINAP, could be improved if embedded in real time clinical care of all patients with myocardial infarction, rather than the current situation in which hospitals employ audit staff to retrospectively enter records on patients in coronary care units. This needs to be dealt with to obtain a more complete understanding of the quality of care provided to patients with acute coronary syndromes.
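The bias from relying on a single record source can be illustrated with a minimal sketch that compares the cases captured by each source with the cases captured by all linked sources combined. The source names and patient identifiers below are hypothetical and invented for illustration; they are not study data, and the function is an assumption for demonstration rather than the method used in this study.

```python
def capture_by_source(sources):
    """Return the proportion of all linked cases that each source records.

    sources: mapping of source name to a set of patient identifiers with a
        myocardial infarction record in that source.
    The denominator is the union of cases across every source supplied.
    """
    all_cases = set()
    for ids in sources.values():
        all_cases |= set(ids)
    if not all_cases:
        raise ValueError("no cases recorded in any source")
    return {name: len(set(ids)) / len(all_cases) for name, ids in sources.items()}


# Hypothetical linked records, invented for illustration only.
sources = {
    "primary_care": {"p01", "p02", "p03", "p04", "p05", "p06"},
    "hospital": {"p03", "p04", "p05", "p06", "p07"},
    "registry": {"p05", "p06", "p07", "p08"},
}

capture = capture_by_source(sources)
in_all_three = set.intersection(*sources.values())

for name, share in capture.items():
    print(f"{name}: {share:.1%} of linked cases")
print(f"Recorded in all three sources: {len(in_all_three)} patients")
```

Because each source misses cases that another source records, a crude incidence based on any one source in this toy example is lower than the incidence based on the linked union.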

Future research

Several lines of research are warranted by our findings. Firstly, research is required to understand how electronic health record data are coded—historically under-resourced and lacking audit against quality standards—and how this can be improved. Secondly, more extensive cross referencing is required against additional sources of information on myocardial infarction. These include self reported myocardial infarction (which may be less dependent on specific setting in the health system), manual review of all the available local case records (paper and electronic), and investigation of electronic free text recorded by general practitioners (for example, diagnoses that are not recorded using a Read code). Such efforts are underway in the UK Biobank cohort (n=500 000).3 There is a need for investigator led cohorts and trials to link with the primary care record.58 Although cancer registries do not record gold standard diagnostic criteria or cancer stage, it will be important to understand how linkages with primary care, admission to hospital, and mortality data compare.59 This is essential for large studies where manual review of case records is not feasible. Evaluating the quality of the data available in these linked data sources is therefore a priority.

Conclusion

Failure to use linked electronic health records from primary care, hospital care, disease registry, and death certificates may lead to biased estimates of the incidence and outcome of myocardial infarction.

What is already known on this topic

  • Electronic health records are increasingly used to measure outcomes of healthcare and health policy, and for research in observational cohorts and randomised trials

  • Records from one part of the health system, such as primary care, may not capture health events occurring in other parts of the health system, such as hospital care

  • No studies have addressed the completeness and validity of recording of myocardial infarction across four national health record sources: primary care, hospital care, disease registry, and death records

What this study adds

  • About one third of patients had a record of non-fatal acute myocardial infarction in all three of primary care, hospital care, and disease registry, and about two thirds in at least two sources

  • Risk factor profiles and one year all cause mortality were comparable across myocardial infarction records from different sources

  • Crude incidence of acute myocardial infarction was underestimated by 25-50% using one source compared with using all three sources

Notes

Cite this as: BMJ 2013;346:f2350

Footnotes

  • We acknowledge the support of Barts and the London Cardiovascular Biomedical Research Unit, which is funded by the National Institute for Health Research. The views expressed in this paper are those of the authors and do not reflect the official policy or position of the Medicines and Healthcare products Regulatory Agency.

  • Contributors: EH and ADS contributed equally to the design and analysis of the study and writing the manuscript. All authors were involved in study design and reviewing draft manuscripts. HH is guarantor.

  • Funding: This work was supported by grants from the UK National Institute for Health Research (grant No RP-PG-0407-10314), the Wellcome Trust (086091/Z/08/Z), and UK Biobank. LS is supported by a senior clinical fellowship from the Wellcome Trust (098504). EH is supported by a Medical Research Council studentship. ADS is supported by a clinical research training fellowship from the Wellcome Trust (0938/30/Z/10/Z). Clinical Practice Research Datalink is owned by the UK Department of Health and operates within the Medicines and Healthcare products Regulatory Agency. Clinical Practice Research Datalink has received funding from the Medicines and Healthcare products Regulatory Agency, Wellcome Trust, Medical Research Council, National Institute for Health Research health technology assessment programme, Innovative Medicine Initiative, UK Department of Health, Technology Strategy Board, Seventh Framework Programme EU, various universities, contract research organisations, and pharmaceutical companies. The department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute for Pharmaceutical Sciences has received unrestricted funding for pharmacoepidemiological research from GlaxoSmithKline, Novo Nordisk, the private-public funded Top Institute Pharma (www.tipharma.nl, includes co-funding from universities, government, and industry), the Dutch Medicines Evaluation Board, and the Dutch Ministry of Health. The funding sources had no role in the design and conduct of the study, analysis, interpretation, preparation, review, and approval of the manuscript; or the decision to submit the paper for publication. The views expressed in this paper are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, or the Department of Health.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: financial support for the submitted work from the Wellcome Trust, National Institute for Health Research, Medicines and Healthcare products Regulatory Agency, Medical Research Council, and UK Biobank; no relationships with organisations that might have an interest in the submitted work in the previous three years; their spouses, partners, or children have no financial relationships that may be relevant to the submitted work; and no non-financial interests that may be relevant to the submitted work.

  • Ethical approval: This study was approved by the Medicines and Healthcare products Regulatory Agency’s independent scientific advisory committee (protocol 11 088), the MINAP Academic Group, and the CALIBER Scientific Oversight Committee.

  • Data sharing: No additional data available.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.

References