Written informed consent and selection bias in observational studies using medical records: systematic review
BMJ 2009; 338 doi: https://doi.org/10.1136/bmj.b866 (Published 12 March 2009) Cite this as: BMJ 2009;338:b866
- Michelle E Kho, registered physical therapist and PhD student1,
- Mark Duffett, clinical pharmacist and assistant professor2,
- Donald J Willison, associate professor1,
- Deborah J Cook, practising intensivist, clinical trialist, and professor13,
- Melissa C Brouwers, associate professor1, provincial director4
- 1Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, ON, Canada L8S 4L8
- 2Department of Critical Care, McMaster Children’s Hospital, Hamilton, ON, Canada
- 3Department of Medicine, McMaster University, Hamilton, ON, Canada
- 4Program in Evidence-based Care, Cancer Care Ontario, McMaster University, Canada
- Correspondence to: M E Kho
- Accepted 2 December 2008
Objectives To determine whether informed consent introduces selection bias in prospective observational studies using data from medical records, and to determine consent rates for such studies.
Design Systematic review.
Data sources Embase, Medline, and the Cochrane Library up to March 2008, reference lists from pertinent articles, and searches of electronic citations.
Study selection Prospective observational studies reporting characteristics of participants and non-participants approached for informed consent to use their medical records. Studies were selected independently in duplicate; a third reviewer resolved disagreements.
Data extraction Age, sex, race, education, income, or health status of participants and non-participants, the participation rate in each study, and susceptibility of these calculations to threats of selection and reporting bias.
Results Of 1650 citations 17 unique studies met inclusion criteria and had analysable data. Across all outcomes there were differences between participants and non-participants; however, there was a lack of consistency in the direction and the magnitude of effect. Of 161 604 eligible patients, 66.9% consented to use of data from their medical records.
Conclusions Significant differences between participants and non-participants may threaten the validity of results from observational studies that require consent for use of data from medical records. To ensure that legislation on privacy does not unduly bias observational studies using medical records, thoughtful decision making by research ethics boards on the need for mandatory consent is necessary.
Information from review of medical charts is often used to carry out audits, perform non-interventional observational studies, create disease registries, and do other types of health services research. Informed consent is not always necessary for these types of research, which involve abstraction of data from patients’ records. Many such studies do not influence practice or patients’ outcomes and therefore confer no risk and no benefit to participants. That notwithstanding, recent legislation to protect the privacy and confidentiality of patients’ information in medical research introduced in many jurisdictions (for example, the regulations to the Health Insurance Portability and Accountability Act in the United States) has resulted in increased requests from research ethics boards to obtain informed consent to use data from medical records for such observational studies.1 As early as 1977 concerns were voiced about the possible negative impact of privacy laws on epidemiological research.2 More recently, editorial reviews highlighted the negative impact of mandatory informed consent on observational research through conservative interpretation of privacy legislation.3 4 5
As with many other aspects of research, requirements for informed consent to use data from medical records vary across research ethics boards within and among countries. For example, in a multisite study involving a review of the charts of children who presented to emergency departments with bronchiolitis, 34 research ethics boards arrived at divergent requirements for consent at their institutions, ranging from none to mandatory written consent.6 Four of the invited 34 sites did not participate owing to investigator perceived hurdles with research ethics boards pertaining to informed consent.
Of greater concern is the impact of informed consent on the validity of the research in observational studies, audits, or registries. Mandatory informed consent in such no risk or low risk studies can create challenges to implementation and biased results. For example, in the Canadian Stroke Registry, investigators identified important differences between participants and non-participants in prognostic characteristics.w1 The selection bias introduced by informed consent was sufficiently serious to jeopardise the overall validity of the study, and investigators effectively shut down the registry by discontinuing follow-up surveys and record linkage studies.w1 Furthermore, case studies documenting the challenges of implementing informed consent recently reported low consent rates and poor efficiency of recruitment.w1 w2
The primary objective of our systematic review was to determine whether informed consent for use of data from medical records introduces selection bias by examining differences in key personal characteristics between participants and non-participants in prospective observational studies requiring informed consent for access to medical records. Our secondary objective was to determine the rates of consent in these studies.
We sought all studies reporting characteristics of participants and non-participants approached for informed consent to use data from their medical records for prospective observational studies or registries. We included studies reporting at least one of the following characteristics: age, sex, race, education, income, or health status. We also included studies that requested consent for access to medical records in addition to self administered or interview administered surveys or biological samples. However, we excluded studies of interventions (for example, randomised controlled trials) and studies using self administered or interviewer administered surveys or biological samples (for example, biobanks) alone. Owing to limitations on resources, we included only English language studies.
After consultation with a librarian in health sciences, we searched Embase (1980 to week 13 2008), Medline (1966 to March week 3 2008), and the Cochrane Library (Issue 1, 2008) (see web extra appendix A for full search strategy). To identify further articles from each included study, we searched reference lists, used the PubMed “related articles” feature, carried out a search of cited references in Thomson Scientific (Web of Science), and used the Google Scholar “cited by” feature.
Independently and in duplicate (MEK, MD) we scanned citations first by title and then by abstract. We reviewed full reports of all potentially relevant abstracts and calculated inter-rater reliability for included studies using the κ statistic. We subsequently resolved all disagreements through consensus; an independent adjudicator (MCB) resolved outstanding disagreements. Study population and setting, disease status, and recruitment methods were extracted.
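As an illustration of the κ statistic mentioned above, the following is a minimal sketch of Cohen's κ for two reviewers making include or exclude decisions. The function name and the counts in the usage note are hypothetical; the review does not report the underlying agreement table, so this is not a reconstruction of the authors' calculation.

```python
def cohen_kappa(both_include, a_only, b_only, both_exclude):
    """Cohen's kappa for two raters making binary include/exclude decisions.

    both_include: citations both reviewers included
    a_only:       included by reviewer A only
    b_only:       included by reviewer B only
    both_exclude: citations both reviewers excluded
    """
    n = both_include + a_only + b_only + both_exclude
    if n == 0:
        raise ValueError("no ratings supplied")
    observed = (both_include + both_exclude) / n
    a_rate = (both_include + a_only) / n   # reviewer A's inclusion rate
    b_rate = (both_include + b_only) / n   # reviewer B's inclusion rate
    # Agreement expected by chance alone, given each reviewer's inclusion rate
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

For example, with hypothetical counts of 20, 5, 10, and 15, observed agreement is 0.7, chance agreement is 0.5, and κ is 0.4.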
We calculated the participation rate in each study7 and assessed the susceptibility of calculations on participation rate against threats of selection and reporting bias.8 For each study we determined the number of eligible participants, number approached for consent, number who responded to the request for consent, number of active consents, and number of active declines.
Heterogeneity among the studies in study design, recruitment methods, requests for consent, populations enrolled, and research settings precluded quantitative synthesis of the data.9 10 We used RevMan v 4.2.8 (Cochrane Collaboration) to calculate odds ratios (binary data), weighted mean differences (continuous data), and 95% confidence intervals. We used the χ2 statistic for comparisons of nominal data (>2 categories)11 using SPSS version 16.0.
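For readers unfamiliar with the calculation, an odds ratio and its 95% confidence interval for a single study can be sketched as follows. This uses the standard log odds ratio (Woolf) interval, which is an assumption on our part and not necessarily the exact routine RevMan applies; the function name and the counts in the usage note are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: group 1 consented      b: group 1 declined
    c: group 2 consented      d: group 2 declined
    The interval is computed on the log scale (Woolf method, an assumption).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 300 females consenting and 100 declining, against 200 males consenting and 100 declining, the odds ratio is 1.5 with an interval of roughly 1.08 to 2.09.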
The electronic search identified 1650 citations, of which 128 were duplicates and 1335 were excluded after review of the title or abstract. Of 187 publications reviewed in full, 24 representing 23 unique studies met the eligibility criteria.w1-w24 The inter-rater reliability for included studies was 0.84 (95% confidence interval 0.83 to 0.86). Of the 23 eligible studies, 17 reported sufficient information for analyses of participants and non-participants and form the basis of this review.w1 w2 w4 w5 w7 w8 w10-w13 w15 w16 w18-w21 w23 w24 The figure outlines the flow of included studies12 and table 1 summarises the characteristics of the studies.
Participation rates and susceptibility to bias
All 17 studies described eligibility criteria. Three disclosed no information on how investigators identified eligible participants for informed consent.w15 w19 w23 Four approached all eligible participants and two randomly selected eligible participants.w12 w21 In five studies, investigators were prevented from approaching all potentially eligible participants owing to physician approval,w1 w2 w5 patient availability,w1 and study specific barriers.w13 All but one study presented sufficient information to reconstruct the outcomes of participation (table 1).w15
Of 161 604 eligible patients in the 17 studies, 108 033 (66.9%, 95% confidence interval 66.6% to 67.1%) provided active consent for use of data from their medical records. Consent rates for eligible participants varied across the studies (from 36.6%w21 to 92.9%w20) and approximated a normal distribution (not shown). Table 2 outlines the methodological information related to obtaining consent and table 3 outlines the rates of participation in each study.
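The pooled consent rate and its confidence interval can be checked with a short calculation. The sketch below assumes a normal approximation (Wald) interval; the review does not state which method was used, although this one reproduces the reported figures. The function name is ours.

```python
import math

def proportion_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Counts reported in the review: 108 033 active consents of 161 604 eligible
rate, lower, upper = proportion_ci(108033, 161604)
print(f"{rate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

The interval is narrow because the pooled denominator is large; it describes the precision of the pooled proportion and not the spread of consent rates across individual studies.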
Differences between participants and non-participants
Authors represented the characteristics of participants and non-participants in four different ways: continuous data,w1 w2 w8 w11 w21 proportions,w1 w2 w4 w5 w7 w10 w11 w19 w21 w23 w24 regression analyses,w13 and the weighted proportion of patients declining consent after adjustment for study design.w12 w20 Studies reported comparisons between participants and non-participants with different denominators: four studies reported consent of those eligible,w1 w2 w16 w23 eight of those approached,w4 w7 w8 w12 w19-w21 w24 and four of those who responded.w5 w10 w13 w15 One study reported the denominator for consent on the basis of the availability of personal characteristics.w11 We describe comparisons between participants and non-participants according to age, sex, race, income, education, and health status. Table 4 summarises differences by these outcomes.
Age—Sixteen studies reported characteristics of the participants and non-participants by age (see web extra appendix B1).w1 w2 w4 w5 w7 w8 w10-w13 w16 w19-w21 w23 w24 Seven studies found no age related differences,w2 w5 w8 w10 w13 w16 w21 one found that participants were younger than non-participants,w19 and seven identified significant differences across age strata; however, no clear pattern emerged.w4 w7 w11 w12 w20 w23 w24 In the Canadian Stroke Registry Network, participants in phase I of the study were younger than non-participants, whereas in phase II there were no differencesw1 after a change in recruitment strategy.
Sex—Fourteen studies that recruited both males and females reported the characteristics of participants and non-participants by sex (see web extra appendix B2).w1 w2 w4 w5 w7 w10-w13 w15 w16 w19 w21 w23 Six studies reported no differences in the odds of females participating compared with males.w2 w10 w12 w13 w16 w23 In the six studies where there were differences, two determined that females were more likely to consent than males,w4 w7 whereas four determined that females were less likely to consent than males.w5 w11 w19 w21 In the remaining two studies the participation of females differed between subgroups. In the Takashima cohort study there was no difference in the likelihood of participation between females and males enrolled in a group requesting access to their medical records in addition to surveys and a blood sample. However, females were less likely to participate than males with the addition of genetic testing to the request for access to medical records, surveys, and a blood sample.w15 The Canadian Stroke Registry Network initially had no differences in participation rates between the sexes; after a change in recruitment strategy fewer females than males participated.w1
Race—Six studies reported the characteristics of participants and non-participants by race (see web extra appendix B3).w1 w10 w11 w18 w20 w21 Two found no difference in the odds of obtaining consent by race.w10 w18 Three studies determined higher participation rates in white or Caucasian patients than others,w1 w20 w21 and a Taiwanese study of national health records identified differences across four strata of race.w11
Income—Seven publications reported participants and non-participants by income (see web extra appendix B4).w5 w10 w11 w13 w16 w20 w21 Four studies found no differences by income.w5 w10 w16 w21 Across five strata of income, Huang et alw11 identified varying rates of participation for access to Taiwanese National Health Insurance records. Another study reported no differences in income in parents of babies in neonatal intensive care units, whereas there were differences across income categories in parents of healthy babies.w13 Women who never worked or who were unemployed long term were less likely to participate in the UK millennium cohort study; however, after adjusting for education, socioeconomic status did not independently predict participation.w20
Education—Six studies reported participants and non-participants by education (see web extra appendix B5).w11 w13 w19-w21 w24 Two studies found no differences related to educationw13 w21; however, in the Australian Longitudinal Study on Women’s Health, those women who had continued their education beyond school were more likely to participate. Three studies that described participants and non-participants by strata identified significant differences, although no clear patterns emerged.w11 w19 w20
Health status—All six studies that reported health status found differences between participants and non-participants (see web extra appendix B6).w2 w10-w12 w16 w21 In two studies participants had more disability or comorbidity than non-participants as measured by the Charlson comorbidity indexw12 and the physical components summary.w21 Two studies reported that participants had less disability than non-participants as measured by the modified Rankin scorew2 and disability score.w10 One study reported higher SF-36 subscale scores in physical function, role physical, vitality, and general health in participants and no differences in role emotional, social functioning, bodily pain, and mental health.w11 In a study that enrolled patients from the paediatric intensive care unit, participation varied by strata for risk of death.w16
Bias results in systematic deviation from the underlying truth.8 Jacobsen et al used the term authorisation bias to describe statistically significant differences between participants and non-participants in research that used medical records.w12 In this systematic review we identified 17 unique studies comparing participants and non-participants in observational studies that requested access to medical records. Across all outcomes there were differences between participants and non-participants, although there was a lack of consistency in the direction and the magnitude of effect. Thus although results of this systematic review suggest that requirements for informed consent introduced a variety of biases into prospective observational studies using data from medical records, no systematic deviations occurred and the cause of the differences by age, sex, race, income, education, or health status that did emerge is unclear. Most studies did not explore reasons for refusal, non-response, or inability to contact patients. This is an important gap, as failure to ask for consent may indicate deficiencies in organisational planning that call for a different policy response than does refusal to participate. At this stage the state of the research is such that our ability to predict these differences with confidence and to guide researchers to avoid authorisation bias is limited.
In terms of our secondary objective, participation rates varied substantially. Studies with high participation rates showed selection biases; the proportion of eligible participants approached for enrolment differed across studies; and we identified opportunities to improve the reporting of outcomes for consent. Whereas all studies reported how investigators identified eligible people, four did not report how those eligible were chosen for participation.w15 w16 w19 w23 Knowing such information helps us to better interpret how susceptible these four studies were to selection bias before the introduction of informed consent.
Our review indicates that consent rates for studies using medical records vary considerably, affecting recruitment efforts and potentially influencing study results. Accordingly, consideration of these factors in the study design, planning, and budgeting is essential. Willison et al offered practical advice for studies based on consent at local and systems levels after their experiences of involving multiple sites in the Canadian Stroke Registry Network, such as testing the consent process by using a pilot, close communication with research ethics boards and healthcare institutions, consideration of random sampling strategies, and ongoing monitoring and feedback on accrual.14 Recent recommendations reinforce the explicit reporting of personal comparisons between participants and non-participants as an important feature of publications on observational studies.15 Future research needs to systematically study why otherwise eligible patients are not approached for consent and the characteristics of patients associated with refusal to participate in studies using medical records.
Pragmatically, what should researchers planning a prospective observational study that involves medical records do? The United Kingdom National Health Service Act 2006 (Section 251),16 the common rule of the Health Insurance Portability and Accountability Act,17 and the Canadian Institutes of Health Research18 offer guidance to researchers on informed consent for research involving medical records. Although consent is required for the collection of personal information from participants for medical procedures, medical examinations, and clinical trials, exemption from requiring consent may be appropriate for studies using medical records owing to impracticability of informed consent and the possibility of introducing biased study results.18
We suggest requesting a waiver of consent from the research ethics board for research using medical records because these studies confer no or minimal risk, do not directly benefit the patient, and because of the potential biases introduced through loss of data in ways that are not completely at random. We suggest explicitly outlining the procedures the research team will take to protect the privacy and confidentiality of each patient. For example, to minimise the risks of a breach of confidentiality, researchers could collect the minimum personal information necessary for identification from each record and incorporate strict access policies to the data at patient level.
However, if a waiver of consent is not possible, as in some European Union jurisdictions,19 we suggest collecting a minimum dataset of key prognostic variables on all eligible people for the study identified through screening. These data can be used to carry out a preliminary analysis comparing participants with non-participants on the key prognostic variables at predetermined times during study accrual, taking into account statistical adjustments for multiple significance testing. Such an approach may lead to revised recruitment strategies to address these concerns—for example, tailored recruitment, targeting participation of populations less likely to grant consent.
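A preliminary comparison of this kind could be sketched as follows. The example assumes binary prognostic variables, a χ2 test on each 2x2 table, and a Bonferroni correction as the adjustment for multiple testing; the text does not prescribe a particular test or adjustment, and the function names, variable names, and counts are hypothetical.

```python
import math

def chi2_2x2_p(a, b, c, d):
    """P value of the chi-squared test (1 df, no continuity correction).

    a: participants with the trait      b: participants without
    c: non-participants with the trait  d: non-participants without
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))

def flag_imbalances(tables, alpha=0.05):
    """Flag variables whose P value falls below the Bonferroni threshold."""
    threshold = alpha / len(tables)   # Bonferroni: an assumed adjustment
    return {name: chi2_2x2_p(*cells) < threshold
            for name, cells in tables.items()}

# Hypothetical interim data at one predetermined time point
interim = {
    "female": (50, 50, 50, 50),
    "previous_stroke": (90, 10, 50, 50),
}
print(flag_imbalances(interim))
```

A flagged variable would be a prompt to review the recruitment strategy for that subgroup, not proof of bias in the final results.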
On the basis of findings from this review, the validity of results from observational studies that require consent for access to medical records may be threatened as a result of significant differences between participants and non-participants. Across the continuum of research we suggest three strategies to minimise the impact of authorisation bias at the inception, reporting, and interpretation of research. At inception we suggest widespread education aimed at clinicians, researchers, and research ethics boards on the conditions under which studies can proceed without individual consent. To help us better interpret differences between participants and non-participants we suggest standardised reporting of methods used to seek informed consent. We believe the elements we report in table 2 provide the minimum dataset for these purposes and could serve as the foundation for expectations on quality reporting. Similarly, we advocate standardised key metrics on informed consent such as participation rates, including eligible, approached, responded, active consent, and active declines (see table 3). Finally, in interpreting observational studies that exhibit significant differences between participants and non-participants, clinicians and researchers should be aware of differences in important prognostic variables and their possible impact on study results. The box summarises our recommended strategies to minimise the impact of bias from informed consent.
Five suggested strategies to minimise the impact of bias from informed consent
- Request a waiver of consent from research ethics boards and explicitly outline procedures to protect the privacy and confidentiality of each patient
- If a waiver is not possible then:
  - Collect a minimum dataset of key prognostic variables on all eligible people identified through screening
  - Complete a preliminary analysis comparing participants and non-participants on key prognostic variables at predetermined times
  - Revise the strategy for recruitment as necessary
- Aim education at clinicians, researchers, and research ethics boards on conditions under which studies can proceed without individual consent
- Standardise reporting of methods used to seek informed consent
- Increase awareness by clinicians and researchers of the potential impact of selection bias introduced by informed consent and implications for interpretation of the study
Strengths and limitations of the study
Our review has several strengths. A priori we developed comprehensive search strategies with librarians in health sciences who were familiar with the indexing methods of electronic databases on health. We included both Medline and Embase, which are complementary bibliographic databases of the biomedical literature20; we supplemented included articles with searches of cited references and related articles. We used broad search strategies of published literature and reviewed all citations and included studies in duplicate, which resulted in good agreement and transparent reporting.10 12
Our review also has limitations. Our search included only studies in English available in the peer reviewed literature. Because of the variability in results we do not expect exclusion of non-English studies to impair the generalisability of our findings; however, this hypothesis needs to be confirmed in future research. Our review was limited by the published reports, including lack of clarity about the sample size and reporting standards for screening and consent procedures. Not all studies reported data on our outcomes of interest; authors may not have collected data on these outcomes or may have chosen to report only significant differences between participants and non-participants.w7 w20 w24 For example, of the two studies reporting all six of our outcomes of interest, one identified statistically significant differences by sex, race, and health status and no differences by age, income, or education,w21 whereas the other study identified significant differences by age, race, income, education, and health status and no differences by sex.w11 Because these observational studies were not specifically designed to study differences in consent between participants and non-participants, we may have observed statistically significant differences across our outcomes of interest simply by chance.8
In conclusion, we observed authorisation bias in studies requiring informed consent for use of data from medical records. To better assess the impact of informed consent on prospective observational studies, consistent reporting of core personal factors of known prognostic significance for both participants and non-participants is necessary. To ensure that legislation on privacy does not unduly threaten the validity of observational studies using data from medical records, education of bodies responsible for overseeing research and further investigations into the determinants and consequences of consent and non-consent for these studies are urgently needed.
What is already known on this topic
Privacy legislation has resulted in some research ethics boards requiring informed consent to use medical records
Whether mandatory informed consent creates selection bias in these observational studies is unknown
What this study adds
Of 1650 citations, 17 unique studies met inclusion criteria and had analysable data on the following six outcomes: age, sex, race, education, income, and health status
Across all outcomes, differences between participants and non-participants occurred, although there was a lack of consistency in the direction and the magnitude of effect
To ensure that legislation does not unduly bias observational studies using medical records, thoughtful decision making by research ethics boards on the need for mandatory consent is necessary
Contributors: MEK conceived and designed the study and drafted the article. She had full access to all the data in the study, takes responsibility for the integrity of the data and the accuracy of the data analysis, and is guarantor. MEK and MD acquired the data; MEK, MD, and MCB analysed the data; and MEK, MD, DJC, DJW, and MCB interpreted the data. MD, DJC, DJW, and MCB critically revised the manuscript for important intellectual content. All authors approved the final version to be published.
Funding: MEK is funded by a fellowship from the Canadian Institutes of Health Research (Clinical Research Initiative). DJC is a research chair of the Canadian Institutes of Health Research. The Canadian Institutes of Health Research had no involvement in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript.
Competing interests: None declared.
Ethical approval: Not required.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.