A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance
BMJ 2021; 374 doi: https://doi.org/10.1136/bmj.n2061 (Published 30 September 2021) Cite this as: BMJ 2021;374:n2061
Rapid responses
Dear Editor,
How we make decisions in the face of path-goal multiplicity, interdependence and heterogeneity is a serious problem. There is much that is welcome in the new MRC Framework,[1] but it overlooks many relevant methods that have grown in popularity since 2008 (instrumental variables estimation, qualitative comparative analysis, multi-criteria decision analysis, etc.). Meanwhile, it gives undue weight to doctrines that offer questionable value and no predictive insights.
The Framework asserts that “improving theories and understanding how interventions contribute to change” is as important a goal in evaluation as “unbiased estimates of effectiveness”. Michael Scriven established evaluation as a transdiscipline, debunked positivism and promoted context analysis as a mode of explanation. But Scriven dismissed the idea that evaluations should begin with, and be based on, an account of how interventions work: “it’s based on a simple confusion of the aim of evaluation with the aim of explanation”.[2] Evaluation – in ordinary language terms – is “the process of determining merit, worth, or significance… the kind and extent of the benefits or damage wrought”.[2] Explanation is neither necessary nor sufficient for, nor even substantially contributory to, evaluation.
‘Successful’ theories often turn out to be false (pessimistic induction); multiple theories may be consistent with observations (underdetermination of theory by evidence). These problems present challenges for scientific realism (the claim that we know which are the best theories and that they are approximately true); they also undermine the MRC Framework's appeals to the efficiency of theory use and its “impact on health outcomes”. So does the empirical evidence. In Dalgetty’s review of nine meta-analyses, eight found no difference in effectiveness between theory- and non-theory-based interventions.[3] The Framework cites the EPOCH study, omitting to state that its programme theory resulted in no health gains and longer inpatient length of stay.
Disconfirming EPOCH's programme theory led to a useful accumulation of generalisable knowledge, but most theories are not subjected to such severe tests. Kurt Lewin’s dogma (‘there is nothing as practical as a good theory’) has led to the uncontrolled proliferation of theories, which “has not advanced the field”.[4] Theories are like toothbrushes: everyone has one, but no one wants to use anyone else’s. Blaming someone's application for the poor performance of a theory (which undermines Lewin’s appeal to practicality) or denying that comparative effectiveness research can speak to the performance of a theory (as CAM therapists sometimes do) marks out degenerating research programmes. Without severe tests of predictive power, accumulated evidence for boundary conditions, or comparison with other theories, we are not ‘disputatious truth-seekers’.
With negligible and negligent theorising comes a largely redundant clutter of ontological furniture. Context is infinitely extendible, as are open systems. Mechanisms, it is said, are theories. Sometimes we can’t distinguish between contexts and mechanisms[5] (presumably ‘context matters’ because it is full of mechanisms). There are divergences between different definitions of ‘theory’[6], ‘context’[7], ‘mechanisms’[8], and ‘systems’[9], which conceal family resemblances underlying supposedly discrete constructs: they are all concerned with describing or explaining the influence of relationships and interactions on outcomes. Using them all together obscures more than it clarifies. Systems are not facts but cognitive frames, or conceptual lenses, which have advantages and disadvantages relative to alternative constructs, such as networks. Complex adaptive systems approaches cannot make predictions, and their value for intervention research is, at best, unclear.[10] We scoff at the idea that themes emerge from interview transcripts; that properties should 'emerge' from systems is surely also hypostasis.[11]
The Framework recommends a 'pluralist perspective' but redefines evaluation and dictates how we approach theory. It presents ‘theory as imagined’, rather than ‘theory as done’. Many applied health researchers have their own commitments to probability theory, to well-established theories of cognitive biases and to physiological theories.[12,13] Ex-post theorising is respectable in the social sciences,[14] as is theoretical agnosticism,[2,15] which is already effectively advocated by the MRC when it endorses co-production.[16] Some researchers would make everyone else think and theorise as they do. The costs of this ambition should not be imposed on those of us with different commitments.
1 Skivington K, Matthews L, Simpson SA, et al. A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance. BMJ 2021;374:n2061. doi:10.1136/bmj.n2061
2 Scriven M. The Logic of Evaluation. In: Hansen HV, ed. Dissensus and the Search for Common Ground. Windsor, ON: OSSA 2007:1–16.
3 Dalgetty R, Miller CB, Dombrowski SU. Examining the theory-effectiveness hypothesis: A systematic review of systematic reviews. Br J Health Psychol 2019;24:334–356. doi:10.1111/bjhp.12356
4 Wensing M, Grol R. Knowledge translation in health: How implementation science could contribute more. BMC Med 2019;17:1–6. doi:10.1186/s12916-019-1322-9
5 Marchal B, van Belle S, van Olmen J, et al. Is realist evaluation keeping its promise? A review of published empirical studies in the field of health systems research. Evaluation 2012;18:192–212. doi:10.1177/1356389012442444
6 Abend G. The Meaning of ‘Theory’. Sociol Theory 2008;26:173–99. doi:10.1111/j.1467-9558.2008.00324.x
7 Squires JE, Graham ID, Hutchinson AM, et al. Understanding context in knowledge translation: a concept analysis study protocol. J Adv Nurs 2015;71:1146–55. doi:10.1111/jan.12574
8 Knight CR, Reed IA. Meaning and Modularity: The Multivalence of “Mechanism” in Sociological Explanation. Sociol Theory 2019;37:234–56. doi:10.1177/0735275119869969
9 Carvajal R. Systemic-Netfields: The Systems’ Paradigm Crisis. Part I. Hum Relations 1983;36:227–45. doi:10.1177/001872678303600302
10 Brainard J, Hunter PR. Do complexity-informed health interventions work? A scoping review. Implement Sci 2016;11:127. doi:10.1186/s13012-016-0492-5
11 Reid I. Let them eat complexity: the emperor’s new toolkit. BMJ 2002;324:171. doi:10.1136/bmj.324.7330.171
12 Boxell EG, Malik Y, Wong J, et al. Are treatment effects consistent with hypothesized mechanisms of action proposed for postoperative delirium interventions? Reanalysis of systematic reviews. J Comp Eff Res Published Online First: 29 September 2021. doi:10.2217/cer-2021-0161
13 Dixon-Woods M. The Problem of Context in Quality Improvement. London: The Health Foundation; 2014. http://www.health.org.uk/publications/perspectives-on-context/.
14 Dixon-Woods M, Bosk CL, Aveling EL, et al. Explaining Michigan: developing an ex post theory of a quality improvement program. Milbank Q 2011;89:167–205. doi:10.1111/j.1468-0009.2011.00625.x
15 Timmermans S, Tavory I. Theory Construction in Qualitative Research: From Grounded Theory to Abductive Analysis. Sociol Theory 2012;30:167–86. doi:10.1177/0735275112457914
16 O’Cathain A, Croot L, Duncan E, et al. Guidance on how to develop complex interventions to improve health and healthcare. BMJ Open 2019;9:1–9. doi:10.1136/bmjopen-2019-029954
Competing interests: I write grant applications and conduct pragmatic research for a living, and have been in receipt of several NIHR and MRC grants for the delivery of randomised controlled trials and theoretically informed mixed-methods studies. Any rivalry is purely philosophical. I'm a Peircian pragmatist, which makes me a falsificationist and more of a deflationary realist than is fashionable in some quarters. Peirce's first rule of reason was "in order to learn you must desire to learn, and in so desiring not be satisfied with what you already incline to think" (no confirmation bias). His famous corollary to this rule was: "Do Not Block the Way of Inquiry", which means that no question is illegitimate. The attitudes I criticise in this paper belong to research programmes which have been openly qualified in their pluralism and - as I would see it - insufficiently reflexive about their own cherished beliefs. At the heart of our differences are tensions between a vision of evaluation devoted to explanation and one devoted to value. I am nonetheless an admirer of their leaders and members - their intellects, motivations and humanity.
Re: A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance
Dear Editor,
The new MRC guidance brings a touch of heroism to the lumbering literature on healthcare research. On offer is an almighty methodological shove in favour of pluralism in the evaluation of interventions. The ‘hierarchy of evidence’ is consigned to history, and research perspectives that once seemed marginal are now provided with welcome, authoritative endorsement. Healthcare improvement is better supported when underpinned by a comprehensive evidence base, which combines: 1. Efficacy research to investigate whether interventions produce significant effects in experimental settings; 2. Effectiveness research to investigate whether intended outcomes are generated in real-world conditions; 3. Theory-based research to explain what works for whom in which circumstances and how; 4. Systems research to understand how the intervention and the wider system adapt to one another.
I join the celebration of this prospectus, so it may be a mite churlish to point to a little local difficulty. Historically, the four perspectives are antagonistic; each one, as we move down the list, was created as a critical response to its predecessor. The point of friction turns on the matter of applicability – where do research findings apply? Efficacy RCTs are mounted with strict controls on target group, practitioner eligibility, implementation schedule, institutional setting and so on. All of these features vary significantly in complex, real-world interventions, to which the findings of efficacy trials cannot and do not apply. The methodological remedy is said to reside in effectiveness or pragmatic trials (PRCTs), which mount the investigation in ‘typical conditions’. A key lesson of the new complexity framework, however, is that interventions will always adapt and transform in different contexts. This means that the so-called typical conditions are a mere subset of all potential applications. Accordingly, the results of a particular PRCT apply only in the fixed and narrow permutation of contexts encountered in that trial.
Theory-based evaluation switches the unit of analysis from ‘programmes’ to ‘programme theories’ under the notion that interventions are never reproducible, but these underlying conjectures are the source of communality and the axis of learning. We learn why and in what circumstances the programme theory may come to fruition. But once again complexity theory tells us that there is never a singular programme theory in command. Researchers will encounter a burgeoning, emergent bundle of theories as they think through the ideas of all stakeholders. Consequently, theory-based evaluation must prioritise, and acknowledge that it is prioritising, the investigation of a limited bundle of working hypotheses.
This brings us to the applicability of systems research. It is rooted in complexity theory, which is a metatheory, an ontology describing the entities that arise from social interaction. Accordingly, it is rather better at telling us what is omitted from an inquiry than how to conduct one. It does not possess an associated method of empirical investigation. The MRC paper provides some examples of what it calls ‘events in systems’ research. But interventions are not events. By the lights of complexity theory they are ‘open systems’ – they constantly exchange information with their surroundings and contexts. Just like all of its predecessors, systems research in its empirical applications is irredeemably partial. Its substantive findings relate to the action of specific sub-systems within particular systems. They await revision under the next ‘unknown unknown’, the next ‘beat of the butterfly wing’.
What follows? Having let the complexity genie out of the bottle, we cannot recork it. All forms of evaluation research are partial, fallible, revisable, and bounded. These boundaries clash. The scope conditions of each paradigm remain incommensurable; no empirical project can capture and unite the full amalgam. Thanks to the new MRC framework, intervention research will now traverse many more boundaries – but the grand challenge of evidence-based policy is also enlarged. It has always proved taxing to match research methods to the intellectual spaces occupied by policymakers and practitioners. Now they will need to be persuaded to find their way to the appropriate paradigm only to discover that the advice transmitted is partial, fallible, revisable, and bounded. Welcome to the brave new frontier of knowledge transfer!
Competing interests: No competing interests