Re: Scientific misconduct is worryingly prevalent in the UK, shows BMJ survey
In January 1977, the British Medical Journal’s then-editor, Dr Stephen Lock, took on the chin, and subsequently acted on, a critique by Gore (my maiden name), Jones and Rytter on “Misuse of statistical methods: critical assessment of articles in BMJ from January to March, 1976”.
In January 2012, I find myself – a non-respondent in BMJ-editor Dr Fiona Godlee’s headline-making survey – again a critic, but this time of the BMJ’s own statistical methods, namely: of the design, analysis and reporting of their misconduct survey. Scientific misconduct is anti-science and deserves to be treated seriously; quantifying it requires serious attention to survey design.
1. Response rate by the surveyed UK authors/reviewers was low at 2782/9036 (31%). Does the BMJ ordinarily accept for publication surveys whose response rates fall as far below 60% as its own survey’s did?
2. Three questions were asked, one of which was to ascertain whether the respondent was primarily clinician, academic or both. However, the information on type of respondent could not be used for covariate adjustment (that is: to determine if report-rate on misconduct differed by type of respondent) because the free software that had been used for the survey apparently did not allow access to respondent-type. Thus, BMJ has collected data that it cannot disclose . . .
3. Question 2 asked if the respondent was aware of ‘any cases of possible research misconduct at your institution that, in your view, have not been properly investigated?’. The balancing question – awareness of ‘any cases of possible research misconduct at your institution that, in your view, have been properly investigated?’ – was not asked. Answers to the balancing question could potentially have been cross-checked institutionally. Survey design requires objectivity and balance in how questions are selected.
4. Question 2 elicited 163 affirmations of “cases”-awareness. As BMJ slides acknowledge, there may have been multiple reporting of the same institutional ‘cause célèbre’; or a respondent may have been aware of more than one case in his/her institution. The phrasing of survey questions matters intensely. Because of poor analytical and reporting standards, we cannot be sure that the 163 affirmations do not emanate from a single institution or relate to a single case.
5. Time-related information can be essential for interpretability of surveys. In general, the longer a scientist’s career in a particular institution, the longer the period over which the scientist may have observed research misconduct there. Other things being equal, a scientist whose career at institution A has lasted 20 years should be 10 times more likely to have encountered research misconduct at institution A than a fellow-scientist whose career length at institution A is 2 years. Well-designed surveys build in checks such as this, which assure data quality, because an analysis plan has been thought out in advance.
6. Analytical forethought immediately points to the necessity of knowing the respondent’s career-length in their current institution as well as the number of different “cases” of misconduct that s/he is aware of during the institutional-period so that “number of misconduct-cases per 1,000 science-career-years” can be computed.
7. Worst of all was question 1. Unlike question 2, question 1 is not restricted to one’s own current institution and so encompasses, for example, any nefarious analytical adjustment that I, as a statistician, may detect – and seek to sort out – as a referee. That being so, it is amazing, indeed incredible, that the affirmative rate for question 1 is as low as 13% (versus 6% affirmative rate for question 2). Did most respondents emanate from somewhat troubled institutions or are they less critical as peer-reviewers than as institutional-observers?
8. But, hold on, the different phrasing between question 1 (Have you witnessed, or do you have firsthand knowledge of, UK-based scientists inappropriately doing Y . . . ) and question 2 (Are you aware of any cases of X . . . at your institution) probably defies the sort of comparison that I sought to draw above. Of course, the survey should have been decently designed so that some form of comparison could be made, for example by asking if the most recently witnessed/firsthand knowledge behaviour had occurred in the respondent’s own institution . . .
9. Back to the wording of question 1: the witnessed/firsthand knowledge behaviours which the BMJ asked about range from inappropriately i) adjusting, ii) excluding, iii) altering data to iv) fabricating data either a) during their research or b) for the purposes of publication. Question 1 is, in effect, a set of eight questions!
10. If there is any consulting statistician who has not helped scientist-colleagues to do some regression analysis (ie statistical adjustment) more appropriately, I’d be concerned about their inexperience. ‘Inappropriately adjusting’, as stated in question 1, does not specifically rule out, as a qualifying behaviour, the incorrect application of regression methods, which it is part of the job of statistician-referees to detect and correct and which is more often an error of comprehension than malice of forethought.
Just as the BMJ’s errors in survey-design, analysis and reporting are due to lack, not malice, of forethought. However, the first sentence in Tavare’s BMJ news piece is mischievously wrong – he omits to mention that question 1 asked about behaviours i) and ii) as well as about iii) and iv). Correction in the second paragraph is too late . . . a headline has been achieved by dint of ii) (if I’m generous) or iii) (if I’m sceptical).
As Dr Godlee remarks in another context: UK science and medicine deserve better. And statistical science requires better for the design, analysis and reporting of surveys. Designers of surveys with response rate as low as BMJ’s should look to their laurels; and not use statistics as the drunk uses a lamp-post: for support rather than illumination.
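To make point 6 concrete, here is a minimal sketch in Python of the proposed rate, “number of misconduct-cases per 1,000 science-career-years”. It assumes a survey that had recorded, for each respondent, career length at the current institution and the number of distinct cases known of during that period. The names `Respondent`, `years_at_institution`, `cases_known` and `cases_per_1000_career_years` are hypothetical, and the four responses are invented purely for illustration; none of the figures come from the BMJ survey.

```python
# Hypothetical sketch: none of these figures come from the BMJ survey.
from dataclasses import dataclass


@dataclass
class Respondent:
    years_at_institution: float  # career length at the current institution
    cases_known: int             # distinct misconduct cases known of in that period


def cases_per_1000_career_years(respondents):
    """Pooled rate: total cases / total career-years, scaled to 1,000.

    Cases reported by several respondents are NOT de-duplicated here.
    """
    total_years = sum(r.years_at_institution for r in respondents)
    total_cases = sum(r.cases_known for r in respondents)
    if total_years <= 0:
        raise ValueError("need positive total career-years")
    return 1000 * total_cases / total_years


# Invented illustrative responses: 3 cases over 40 career-years in total.
sample = [
    Respondent(20, 2),
    Respondent(2, 0),
    Respondent(8, 1),
    Respondent(10, 0),
]
rate = cases_per_1000_career_years(sample)
print(round(rate, 1))  # 75.0
```

The pooled rate counts a case once for every respondent who reports it, so the multiple-reporting problem raised in point 4 would still need handling, for example by recording cases in a form that can be de-duplicated within an institution.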
1. Editorial. Statistical errors. British Medical Journal 1977; 1: 66.
2. Gore SM, Jones IG, Rytter EC. Misuse of statistical methods: critical assessment of articles in BMJ from January to March, 1976. British Medical Journal 1977; 1: 85-87.
3. Tavare A. Scientific misconduct is worryingly prevalent in the UK, shows BMJ survey. British Medical Journal 2012; 344: e377.
I have a long-standing interest in misuse of statistics. This post also appears on Straight Statistics with cross-referencing to BMJ Rapid Responses.
Competing interests: No competing interests