The assessed papers were graded against the following explicit criteria.
Randomised controlled trials
A. Methodological judgement
1. Were the patients and the investigators blinded to the treatment?
If not, and blinding was appropriate (e.g. drug trials), then reject. If an open design was appropriate (e.g. management issues), were the data collectors blinded?
2. Were the group sizes greater than 20?
Reject any studies with group sizes less than 20, whether showing a statistically significant positive or negative result. The analysis of such studies will require statistical correction (Yates' correction or the use of small sample size tests such as Fisher's exact test). In addition, it was felt that to guide practice it was reasonable to use studies that were of a certain minimum size. When Kleijnen et al. (1991) looked at size criteria in a review of clinical trials of homeopathy they argued that "in trials with limited numbers of participants one cannot be confident that randomisation will equally divide known and unknown confounders over the experimental and control groups. As well, publication bias may be less likely for experiments with large numbers of participants." As a result of this they did not score trials with group sizes less than 25. In a similar review of physiotherapy Koes et al. (1991) put forward the same argument and did not score trials with group sizes of 50 or less.
3. Were the groups appropriately randomised?
Even where the groups have been assigned randomly, the article should include comparative statistics of the two groups to show them to be equivalent (age, severity etc.).
4. What outcomes were measured?
The main endpoints should be listed and the following variables looked at for each outcome.
5. Was there at least 80% follow-up?
For each endpoint considered, at least 80% of the randomised sample should be included in the analysis. This should take the issue of compliance into account (needs to be equal in each group).
6. What was the statistical significance level?
This is usually given by the p value (chance of making a Type I error) or by a confidence interval. If the outcome is not significant (i.e. p > 0.05) then are the rates of events in each group quoted? This will enable the calculation of the probability of a Type II error.
7. If the study was not significant, was the sample size large enough?
Any studies with groups smaller than 20 will already have been rejected. For other papers, where possible, look for the rate of events (Bailar and Mosteller, 1988; Guyatt et al., 1994) in each group and then use sample size estimation tables (Detsky and Sackett, 1985) for 25% risk reduction to decide whether the original groups were large enough (25% risk reduction was chosen as clinically significant for our purpose).
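Where published tables are not to hand, the check in point 7 can be approximated directly. The sketch below is illustrative only and is not the procedure of Detsky and Sackett (1985): it uses the standard normal-approximation formula for comparing two proportions, and it assumes a two-sided significance level of 0.05 and 80% power, neither of which is specified in the criteria above. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_group_size(control_rate, risk_reduction=0.25, alpha=0.05, power=0.80):
    """Approximate number of patients needed per group.

    control_rate   -- event rate observed in the control group (0-1)
    risk_reduction -- relative risk reduction to detect (0.25 as in the text)
    alpha, power   -- ASSUMED defaults; the criteria above do not state them
    """
    p1 = control_rate
    p2 = control_rate * (1 - risk_reduction)  # event rate after the reduction
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


def large_enough(group_size, control_rate, **kwargs):
    """True if the trial's group size meets the approximate requirement."""
    return group_size >= required_group_size(control_rate, **kwargs)
```

For example, large_enough(60, 0.40) asks whether groups of 60 could detect a 25% risk reduction from a 40% control event rate. Lower control event rates require larger groups.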
Points 2 and 5 were considered absolute flaws - if an article failed to meet any of these criteria then it was rejected on methodology and not passed on for clinical assessment.
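The two absolute flaws can be expressed as a simple screen. The following is a minimal sketch with hypothetical names. It assumes that a group of exactly 20 passes (the text rejects groups "less than 20") and that the follow-up check is applied separately to each endpoint.

```python
def methodological_screen(group_sizes, randomised, analysed):
    """Return the reasons for rejection; an empty list means the paper passes.

    group_sizes -- number of patients in each trial arm
    randomised  -- total number of patients randomised
    analysed    -- number included in the analysis of the endpoint considered
    """
    reasons = []
    # Point 2: reject if any group has fewer than 20 patients.
    if min(group_sizes) < 20:
        reasons.append("group size below 20")
    # Point 5: reject if less than 80% of the randomised sample is analysed.
    if analysed / randomised < 0.80:
        reasons.append("follow-up below 80%")
    return reasons
```

A paper returning an empty list would go forward to clinical assessment. Any other result records why it was rejected on methodology.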
B. Clinical judgement
1. Were the outcomes true or substitute?
Need to be careful about generalising from substitute endpoints.
2. Was the follow-up time period appropriate?
Was the trial follow-up long enough to show up full effects/side-effects etc?
3. Can the findings from this study be generalised to our study population?
Are the sample characteristics (age, symptoms etc.) similar enough?
4. Were the differences in outcome clinically important?
Are the conclusions of interest? Are they relevant to the primary care management of our patient group? Are the differences important enough to want to change current practice?
Cohort and case control studies
Cohort study: prospective follow-up of two groups of non-randomly allocated patients - one exposed to a treatment/factor and one not.
Case control study: retrospective review of a group of patients with a disease and comparison with a non-randomly chosen matched group which does not have the disease.
NB. In terms of establishing causality these study designs are weaker than experimental studies, are more prone to bias and should only be used if experimental study evidence is not available. However, they will be the only realistic study design in certain circumstances e.g. when studying rare events.
A. Methodological judgement
1. Are diagnostic criteria stated clearly with explicit inclusion and exclusion criteria?
Is the population relevant to our task?
2. Are the matching criteria clearly stated? Are they adequate?
Demographic, geographic, socio-economic, risk factors etc.
3. Was data collection blinded?
Biases are reduced if an interviewer or data collector is blinded to the patient's grouping. In some cases the patient may also be blinded, in so far as they may not know which are the key questions in an interview. Few of these studies, however, tend to be blinded.
4. Were confounding variables and biases considered?
Have confounding factors been taken into account in the analysis? If not, conclusions should be treated with caution.
5. Which outcomes or distinctions were looked for?
List them.
6. Non-response rates
Is the data set complete? Reject if less than 80% complete. If alternative sources of data have been used then these should be clearly specified.
7. Relative Risk or Relative Odds (include confidence intervals)
If RR < 1 then the treatment has a protective value.
If RR > 1 then the treatment has a detrimental effect.
If RR = 1 there is no treatment effect.
Need to look at the confidence intervals - if these include 1 then the result is not significant. If, however, the range includes a clinically significant level (e.g. 1.5), then other papers should be consulted to be sure that the non-significant result was not due to low power.
8. Was the study powerful enough?
If confidence intervals include RR > 1.5 then the study may not be powerful enough.
Points 1 and 2 were considered fatal flaws - if an article failed to meet any of these criteria then it was rejected on methodology and not passed on for clinical assessment.
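The reading of relative risk and its confidence interval in points 7 and 8 can be sketched as follows. This is an illustration under stated assumptions, not a method taken from the papers reviewed: it computes a 95% interval on the log scale (a common approximation for cohort data), uses the 1.5 threshold given as an example in the text, and checks only the detrimental direction, as the text does. All names are hypothetical.

```python
from math import exp, log, sqrt


def relative_risk(exposed_events, exposed_total,
                  unexposed_events, unexposed_total, z=1.96):
    """Return (RR, lower, upper) with an approximate 95% confidence interval."""
    rr = (exposed_events / exposed_total) / (unexposed_events / unexposed_total)
    # Standard error of log(RR) for two independent groups.
    se_log_rr = sqrt(1 / exposed_events - 1 / exposed_total
                     + 1 / unexposed_events - 1 / unexposed_total)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


def interpret(lower, upper, clinical_threshold=1.5):
    """Classify a confidence interval following points 7 and 8."""
    if lower > 1 or upper < 1:
        return "significant"  # interval excludes RR = 1
    if upper > clinical_threshold:
        # Interval includes both 1 and a clinically significant level.
        return "not significant, possibly underpowered"
    return "not significant"


rr, lower, upper = relative_risk(30, 100, 20, 100)
print(round(rr, 2), interpret(lower, upper))
```

In this made-up example (30 events in 100 exposed patients against 20 in 100 unexposed) the interval spans both 1 and 1.5, so other papers would need to be consulted before accepting the non-significant result.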
B. Clinical judgement
1. Were the outcomes true or substitute?
2. Was the follow-up time period appropriate?
3. Was the outcome clinically significant?
Review/overview/meta-analyses grid
A. Methodological judgement
1. Were appropriate search methods used?
Did the reviewers use at least one electronic database or carry out a hand search, plus either expert advice or citation searches?
2. Were appropriate selection criteria used?
Descriptions of original papers should include: blinding of patients, blinding of investigators, study type - do they fit in with our methodological criteria?
3. How many papers were included?
4. Were the sizes of individual studies considered?
Meta-analysis should weight conclusions according to the sample size of the included studies.
5. Were the papers appropriate to combine?
Were they using the same populations?
Were they looking at the same endpoints?
6. Which outcomes were looked at?
List them.
If a review did not have a clear method section it was rejected for our evidence review. However, the reference sections were checked as a supplementary search strategy.
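The weighting asked for in point 4 can be illustrated with a sample-size-weighted average of the study effects. This is a deliberately simple sketch with a hypothetical function name. Published meta-analyses more commonly weight each study by the inverse of its variance, which is closely related to sample size but not identical to it.

```python
def pooled_estimate(studies):
    """Sample-size-weighted mean effect.

    studies -- list of (effect, sample_size) pairs, where effect is the same
               measure in every study (e.g. a risk difference)
    """
    total_n = sum(n for _, n in studies)
    return sum(effect * n for effect, n in studies) / total_n


# Two made-up studies: the larger study dominates the pooled result.
print(pooled_estimate([(0.10, 100), (0.30, 300)]))
```

An unweighted mean of these two effects would be 0.20. Weighting by sample size pulls the pooled value towards the larger study, which is the behaviour the criterion looks for.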
B. Clinical judgement
1. Were these outcomes true or substitute?
2. What was the author's conclusion?
Does this follow directly from the data in the review?
3. Is this clinically important?
References
Bailar, J.C. and Mosteller, F. (1988) Guidelines for statistical reporting in articles for medical journals. Amplifications and explanations. Annals of Internal Medicine 108:266-273.
Detsky, A.S. and Sackett, D.L. (1985) When was a negative clinical trial big enough? How many patients you needed depends on what you found. Archives of Internal Medicine 145:709-712.
Guyatt, G.H., Sackett, D.L. and Cook, D.J. (1994) Users' guides to the medical literature. II: How to use an article about therapy or prevention. B. What were the results and will they help me in caring for my patients? Journal of the American Medical Association 271:59-63.
Kleijnen, J., Knipschild, P. and ter Riet, G. (1991) Clinical trials of homoeopathy. British Medical Journal 302:316-323.
Koes, B.W., Bouter, L.M., Beckerman, H., van der Heijden, G.J.M.G. and Knipschild, P.G. (1991) Physiotherapy exercises and back pain: a blinded review. British Medical Journal 302:1572-1576.