05 August 2005
Sviatoslav L. Plavinski
Dean, College of Public Health
Medical Academy for Postgraduate Studies, 191015, St. Petersburg, Russia
Rapid Response:
Researcher as a source of uncontrollable bias
The problems raised by the BMJ (1,2) are very important for the medical research community. Unfortunately, there is no simple solution to the problem of fraudulent or tampered data. The paper by Al-Marzouki, Evans, Marshall and Roberts (1) uses a statistical test to compare the variances of baseline variables in the two groups of randomized controlled trials (RCTs). Their implicit hypothesis (which appears to have been true in this case) is that a researcher who tampers with data does not know that the variances should be approximately the same and generates all the data by hand.
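To make the idea concrete, the following minimal sketch (in Python, with invented sample sizes, means and standard deviations) compares baseline variances between two trial arms with an ordinary F-test; it only illustrates the general approach and is not a reproduction of the methods used in reference 1.

# Illustration only: an F-test comparing baseline variances between two
# trial arms. The group sizes, means and SDs are invented for the example
# and this is not the exact procedure of reference 1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
arm_a = rng.normal(loc=140, scale=15, size=200)  # hypothetical baseline variable, arm A
arm_b = rng.normal(loc=141, scale=15, size=200)  # arm B, drawn from the same population

f_ratio = arm_a.var(ddof=1) / arm_b.var(ddof=1)  # ratio of sample variances
p_value = 2 * min(stats.f.cdf(f_ratio, 199, 199), stats.f.sf(f_ratio, 199, 199))
print(f"F = {f_ratio:.2f}, two-sided p = {p_value:.3f}")  # genuine randomization: no signal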
Unfortunately, only a statistically illiterate researcher would do that, especially for publication in a high-impact journal. A researcher who lacks ethical constraints will simply decide on the desired mean values, take the variances from published sources or from a small sample of patients, and then run the function available in every statistical package to generate, for example, normally distributed data with a given mean and variance. He can then decide what result he would like to obtain and repeat the process for made-up follow-up data. He can even use non-normally distributed data, mixed distributions, and so on. There is almost no way to uncover such fraud, except by collecting on-site evidence that the investigation was never performed. Such an inquiry is legally difficult and very expensive, especially in the computer era, as most data are accumulated in electronic form that is easy to tamper with. All proposed solutions (such as registration of all controlled trials with the possibility of unannounced on-site inspections, collection of time-stamped data by a third party, etc.) would be very costly and would eventually harm mostly innocent researchers through the resulting rise in research expenditures.
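The sketch below (again in Python, with all numbers invented) shows how little effort such fabrication requires: the "effect" is chosen in advance, the standard deviation is borrowed from the literature, and the generated data have exactly the variance structure a between-group check expects, so a test of the kind described above raises no alarm.

# Illustration of the fabrication route described above: decide on the
# desired means, borrow a plausible SD from an earlier publication, and
# let the package generate the "patients". All numbers are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
published_sd = 15.0                                  # SD taken from the literature
fake_control = rng.normal(loc=150, scale=published_sd, size=100)
fake_treated = rng.normal(loc=138, scale=published_sd, size=100)  # "benefit" decided in advance

t_stat, p_means = stats.ttest_ind(fake_treated, fake_control)  # the planned effect appears
p_vars = stats.levene(fake_treated, fake_control).pvalue       # variances look perfectly ordinary
print(f"difference in means: p = {p_means:.4f}; equality of variances: p = {p_vars:.2f}")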
With the prospects for statistical fraud control so bleak, more discussion should be directed at ways to decrease the incentives for fraud. The impact of research fraud would be lessened if consumers of scientific information remembered to demand reproducibility. This, in turn, should influence the decisions of Institutional Review Boards (IRBs) on repeating RCTs. It is currently considered unethical to repeat an RCT if a previous one showed one treatment to be superior. An IRB at a different institution should allow repetition of an RCT if a controversial treatment was used in the previous RCT or if the body of other knowledge does not support its results.
Unfortunately, the likelihood of scientific fraud is relatively high. In a recent survey (3), 0.3% of US scientists funded by the NIH admitted that they had falsified or 'cooked' research data, and almost one in seven (15.3%) indicated that they had dropped observations or data points from an analysis on the basis of a gut feeling.
It is probably now time to consider the researcher as a source of possible bias that is not controlled by randomization, and to apply to decision-making about RCTs the same causation criteria (strength, consistency, specificity, relationship in time, biological gradient, biological plausibility, coherence of evidence, experiment and analogy) that were put forward by Sir Austin Bradford Hill and are standard for assessing causation in non-randomized studies.
1. Al-Marzouki S, Evans S, Marshall T, Roberts I. Are these data real? Statistical methods for the detection of data fabrication in clinical trials. BMJ 2005;331:267-270.
2. White C. Suspected research fraud: difficulties of getting at the truth. BMJ 2005;331:281-288.
3. Martinson BC, Anderson MS, de Vries R. Scientists behaving badly. Nature 2005;435:737-738.
Competing interests: None declared