Open access (CC BY-NC)

Inter-rater agreement in evaluation of disability: systematic review of reproducibility studies

BMJ 2017; 356 doi: (Published 25 January 2017) Cite this as: BMJ 2017;356:j14
  1. Jürgen Barth, senior researcher1 2,
  2. Wout E L de Boer, senior researcher1,
  3. Jason W Busse, associate professor3 4 5,
  4. Jan L Hoving, senior researcher6 7,
  5. Sarah Kedzia, junior researcher1,
  6. Rachel Couban, librarian4,
  7. Katrin Fischer, professor8,
  8. David Y von Allmen, postdoctoral researcher1,
  9. Jerry Spanjer, senior researcher9 10,
  10. Regina Kunz, professor1
  1. Evidence-based Insurance Medicine (EbIM), Research and Education, Department Clinical Research, University Basel Hospital, University of Basel, Spitalstrasse 8 + 12, CH-4031 Basel, Switzerland
  2. Institute for Complementary and Integrative Medicine, University Hospital Zurich and University of Zurich, Zurich, Switzerland
  3. Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8S 4K1, Canada
  4. Department of Anaesthesia, McMaster University, Hamilton, ON L8S 4K1, Canada
  5. The Michael G DeGroote Institute for Pain Research and Care, McMaster University, Hamilton, ON L8S 4K1, Canada
  6. Coronel Institute of Occupational Health, Academic Medical Centre, University of Amsterdam, Amsterdam, Netherlands
  7. Research Centre for Insurance Medicine, AMC-UMCG-UWV-VUmc, Amsterdam, Netherlands
  8. School of Applied Psychology, Institute Humans in Complex Systems, Olten, Switzerland
  9. Dutch National Institute for Employee Benefits Schemes, Groningen, Netherlands
  10. Department of Health Sciences, Community and Occupational Medicine, University Medical Centre Groningen, Netherlands
  1. Correspondence to: R Kunz regina.kunz{at}
  • Accepted 21 December 2016


Objectives To explore agreement among healthcare professionals assessing eligibility for work disability benefits.

Design Systematic review and narrative synthesis of reproducibility studies.

Data sources Medline, Embase, and PsycINFO searched up to 16 March 2016, without language restrictions, and review of bibliographies of included studies.

Eligibility criteria Observational studies investigating reproducibility among healthcare professionals performing disability evaluations using a global rating of working capacity and reporting inter-rater reliability by a statistical measure or descriptively. Studies could be conducted in insurance settings, where decisions on ability to work include normative judgments based on legal considerations, or in research settings, where decisions on ability to work disregard normative considerations.

Methods Teams of paired reviewers identified eligible studies, appraised their methodological quality and generalisability, and abstracted results with pretested forms. As heterogeneity of research designs and findings impeded a quantitative analysis, a descriptive synthesis stratified by setting (insurance or research) was performed.

Results From 4562 references, 101 full text articles were reviewed. Of these, 16 studies conducted in an insurance setting and seven in a research setting, performed in 12 countries, met the inclusion criteria. Studies in the insurance setting involved medical experts assessing actual disability claimants, claimants played by actors, hypothetical cases, or short written scenarios. Conditions were mental (n=6, 38%), musculoskeletal (n=4, 25%), or mixed (n=6, 38%). Applicability of findings from studies conducted in an insurance setting to real life evaluations ranged from generalisable (n=7, 44%) and probably generalisable (n=3, 19%) to probably not generalisable (n=6, 37%). Median inter-rater reliability among experts was 0.45 (range: intraclass correlation coefficient 0.86 to κ −0.10). Inter-rater reliability was poor in six studies (37%) and excellent in only two (13%). This contrasts with studies conducted in the research setting, where the median inter-rater reliability was 0.76 (range 0.91 to 0.53), and 71% (5/7) of studies achieved excellent inter-rater reliability. Reliability between assessing professionals was higher when the evaluation was guided by a standardised instrument (23 studies, P=0.006). No such association was detected for subjective or chronic health conditions or the studies’ generalisability to real world evaluation of disability (P=0.46, 0.45, and 0.65, respectively).
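As an illustration of how a chance-corrected agreement statistic such as κ behaves, the following minimal sketch computes Cohen's κ for two raters. It is not the review's analysis: the abstract does not state which κ variant the included studies used, and the function name `cohen_kappa`, the labels "fit" and "unfit", and all ratings are hypothetical. The sketch shows that κ can fall below zero, as at the lower end of the reported range, when raters agree less often than their marginal rating frequencies would predict by chance.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical work capacity ratings for ten claimants by two experts.
expert_1 = ["fit", "fit", "unfit", "unfit", "fit",
            "unfit", "fit", "unfit", "fit", "unfit"]
expert_2 = ["fit", "unfit", "unfit", "fit", "fit",
            "unfit", "unfit", "unfit", "fit", "fit"]
kappa_modest = cohen_kappa(expert_1, expert_2)

# Hypothetical ratings where agreement is lower than expected by chance.
expert_3 = ["fit", "fit", "unfit", "unfit"]
expert_4 = ["unfit", "unfit", "fit", "unfit"]
kappa_negative = cohen_kappa(expert_3, expert_4)

print(f"{kappa_modest:.2f}")
print(f"{kappa_negative:.2f}")
```

In the first hypothetical pair the experts agree on 6 of 10 cases, but half of that agreement is expected by chance, so κ is well below the raw agreement of 0.6. In the second pair the experts agree on only 1 of 4 cases, which is less than chance, so κ is negative.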

Conclusions Despite their common use and far reaching consequences for workers claiming disabling injury or illness, research on the reliability of medical evaluations of disability for work is limited and indicates high variation in judgments among assessing professionals. Standardising the evaluation process could improve reliability. Development and testing of instruments and structured approaches to improve reliability in evaluation of disability are urgently needed.


  • We thank Gordon Guyatt, McMaster University, Hamilton, Canada, for his input in conceptualising the review; Nozomi Takeshima, Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine, Japan, and Rosella Saulle and Laura Amato, Department of Epidemiology, Lazio Regional Health Service, Rome (Italy), for extracting the data from the Japanese and Italian studies; and Sacha Röschard for administrative support.

  • Contributors: RK and WdB developed the idea. RK, JB, WdB, JWB, KF, and GG contributed substantially to the conception and design. RK, JB, WdB, JWB, RC, SK, JS, DvA, and KF contributed to acquisition, analysis, or interpretation of the data. RK, JB, JWB, JH, KF, SK, RC, DvA, and JS drafted or revised the manuscript critically for important intellectual content and approved the final version to be published. RK and JB are guarantors.

  • Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at (available on request from the corresponding author) and declare that JWB acts as a consultant to Prisma Health Canada, a private incorporated company funded by employers and insurers that consults on and manages long term disability claims. The Evidence-based Insurance Medicine Unit at the University Hospital in Basel is funded in part by donations from public insurance companies and a consortium of private insurance companies (RK). After the manuscript was finalised, RK took a part time position at the Swiss National Accident Insurance Fund, Suva. RK, JB, WdB, JWB, and JH were initiators of Cochrane Insurance Medicine.

  • Ethical approval: Not required.

  • Data sharing: No additional data available.

  • Transparency: The lead author affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: