Rapid Responses to:

EDUCATION AND DEBATE:
David J Torgerson and Marion K Campbell
Economics notes: Cost effectiveness calculations and sample size
BMJ 2000; 321: 697 [Full text]

Rapid Responses published:

Demonstrating cost effectiveness requires larger sample
Graham Byrnes   (15 September 2000)
Costing application for sample size determination
ryner jose c carrillo   (7 October 2002)

Demonstrating cost effectiveness requires larger sample 15 September 2000
Graham Byrnes,
Biostatistician
Victorian Infectious Diseases Reference Lab


The article by Torgerson and Campbell [1] suggests using the total cost break-even point as the hypothetical effect size for the purpose of power and sample size calculations. While this criterion appears as good as many others, it could lead to the mistaken conclusion that a study designed in this way can test cost effectiveness.

In fact, if the sample size is chosen as in [1] with a type 1 error rate of 5%, then the power to detect lower overall cost is only 2.5% at the same 5% level!

Take the example of comparing endometrial laser ablation with transcervical endometrial resection, as discussed in [1]. The proposed study design with 435 patients per arm has a power of 80% to detect a significantly lower rate of re-treatment after laser ablation than the 27% observed for transcervical resection. However, to establish lower overall cost, we must show that the re-treatment rate is significantly lower than the break-even point of 19%. The probability of observing a rate significantly lower than 19% under the hypothesis that the rate is 19% is, unsurprisingly, very low. In fact it is precisely half the type 1 error rate of 5%, that is, 2.5%.
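The 2.5% figure can be checked numerically. The following is only a rough sketch: it uses a normal approximation and, as a simplifying assumption, treats the 19% break-even point as a fixed threshold for a single arm of 435 patients. The function name `power_to_show_rate_below` is hypothetical and not taken from [1].

```python
from math import sqrt
from statistics import NormalDist


def power_to_show_rate_below(true_rate, threshold, n, alpha=0.05):
    """Probability that an observed rate is significantly below `threshold`.

    Normal approximation to a two-sided test at level `alpha`, counting
    only rejections in the "lower than threshold" direction.
    Simplifying assumption: the threshold is treated as a fixed value.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = sqrt(threshold * (1 - threshold) / n)
    se_true = sqrt(true_rate * (1 - true_rate) / n)
    # Largest observed rate that would still count as "significantly lower".
    cutoff = threshold - z_crit * se_null
    return NormalDist().cdf((cutoff - true_rate) / se_true)


# True rate exactly at the break-even point: power equals alpha / 2.
power_at_break_even = power_to_show_rate_below(0.19, 0.19, 435)
# A hypothetical true rate below break-even, for comparison only.
power_if_rate_is_15 = power_to_show_rate_below(0.15, 0.19, 435)

print(f"power at break-even (19%): {power_at_break_even:.3f}")
print(f"power if true rate is 15%: {power_if_rate_is_15:.3f}")
```

At the break-even rate the first figure is 0.025, matching the argument above; the power only rises as the true rate falls further below the threshold.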

So the proposed cost-effectiveness criterion for selecting a hypothetical effect size cannot be used for testing cost effectiveness. This leaves its rationale looking rather slight. In fact I support the "logistic" procedure that the authors appear to denigrate in [1]: calculating the effect size which yields a practical sample size. This is an effective way of reducing the arcane logic of power calculations to a parameter that a clinician can use to decide if the trial should proceed.

I believe this is compatible with Goodman's agenda [2] to incorporate clinical understanding into medical statistics. It is quite different to using post-hoc power calculations to explain away negative results [3].

Regards,
Graham Byrnes

[1] Torgerson DJ, Campbell MK. Cost effectiveness calculations and sample size. BMJ 2000;321:697.

[2] Goodman SN. Towards evidence-based medical statistics. 1: The p-value fallacy. Ann Intern Med 1999;130: 995-1004.

[3] Goodman SN. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting the results. Ann Intern Med 1994;121: 200-206.

Costing application for sample size determination 7 October 2002
ryner jose c carrillo,
resident physician
PGH Taft Ave Manila Philippines 1000


Cost-benefit studies analyze outcomes in terms of money. This method has always been a point of debate because it is difficult to put an actual monetary value on a given outcome. As a result, only a limited number of such studies are available.

Because of this type of analysis, sample sizes have increased considerably so that studies can detect small but significant differences between treatment choices. In designing two-arm studies, the “clinically significant” difference is set by investigators on the basis of expert opinion. This value, which is central to calculating the sample size, is often raised or lowered depending on how many subjects can be recruited or on the cost of the trial.

Example: Comparing two treatments with the outcome measured in proportions.

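$$ N = \frac{\left[ Z_\alpha \sqrt{2PQ} + Z_\beta \sqrt{P_1 Q_1 + P_2 Q_2} \right]^2}{D^2} $$

where N is the sample size per arm, P1 and P2 are the expected proportions in the two arms, Q1 = 1 - P1, Q2 = 1 - P2, P = (P1 + P2)/2, Q = 1 - P, and D = P1 - P2 is the difference between the two proportions that is desired to be detected.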
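As a rough sketch, the calculation can be written in Python. It assumes the usual normal-approximation formula for comparing two proportions, a two-sided type 1 error rate, and equal arm sizes; the function name `sample_size_per_arm` and its defaults (5% error rate, 80% power) are illustrative choices, not part of the original letter.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect D = |p1 - p2|.

    Assumes a two-sided test at level `alpha` and equal-sized arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2            # pooled proportion P
    q_bar = 1 - p_bar                # Q = 1 - P
    d = abs(p1 - p2)                 # difference to be detected
    numerator = (
        z_alpha * sqrt(2 * p_bar * q_bar)
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / d ** 2)


# Re-treatment rates of 27% and 19%, as quoted in the first response.
n = sample_size_per_arm(0.27, 0.19)
print(n)
```

With the 27% and 19% rates quoted in the first response, this comes out close to the 435 patients per arm cited there.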

However, in designing studies with two arms, D can be defined objectively as the “clinically significant difference” by assigning monetary values to the treatment choices. For example, if treatment one costs P2000 and the outcome prevented costs P5000, one can ask how many courses of treatment one are equivalent in cost to preventing one outcome.

Hence, D is a value which, when statistically significant, will make us choose between treatment one versus placebo.

With the concept of number needed to treat (NNT), i.e., the number of cases that must be treated in order to prevent one primary outcome, the desired number needed to treat can be derived using:

(cost of treatment one) x (# of cases treated) = (cost of outcome) x (# of outcomes prevented)

Hypothetical example:

Cost of treatment = P2000 (e.g. chronic otic antibiotic drops in otitis media)

Cost of outcome = P5000 (e.g. a tympanomastoidectomy)

(P2000) x (# of treated cases) = (P5000) x (# of outcomes prevented)

(# of treated cases) / (# of outcomes prevented) = (P5000) / (P2000)

NNT = 2.5

hypothetical ARI = 1/NNT = 0.4

D = 0.4

This ARI of 0.4 is the desired difference: if the difference observed in the study exceeds it, the treatment is judged to be better.

Given the sample size formula, one can then compute the sample needed to detect a “clinically significant difference” based on costing.
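The break-even arithmetic of the hypothetical example can be reproduced in a few lines of Python. This is a minimal sketch: the function names `break_even_nnt` and `target_difference` are hypothetical, and the costs are the illustrative P2000 and P5000 used above.

```python
def break_even_nnt(cost_of_treatment, cost_of_outcome):
    """Number of treated cases whose total cost equals one outcome prevented."""
    return cost_of_outcome / cost_of_treatment


def target_difference(nnt):
    """Difference in proportions (D) implied by a number needed to treat."""
    return 1 / nnt


nnt = break_even_nnt(cost_of_treatment=2000, cost_of_outcome=5000)
d = target_difference(nnt)
print(nnt, d)  # 2.5 0.4
```

The resulting D can then be supplied to the sample size formula, together with the expected proportions in each arm.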

Costing an outcome, however, will be controversial, and estimates will differ from one specialist to another. An outcome such as a surgical procedure, with a known number of days in hospital for a patient whose daily earnings and productivity can be projected, is relatively easy to cost. Costing life, i.e., mortality, is a different matter.

However, in a country where resources are very limited, a perceived “clinically significant difference” should include costing as an important factor. A study designed to compare a novel treatment or procedure with an established one must therefore account for costing. A statistically significant difference of 0.0001 implies a large number needed to treat (1/0.0001 = 10,000) and a correspondingly large total treatment cost; it may not be clinically significant if the cost of the outcome being considered is relatively small.
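To make the last point concrete, the sketch below compares the treatment cost of preventing one outcome with the cost of the outcome itself. It reuses the illustrative P2000 and P5000 figures from the earlier hypothetical example, which is an assumption made here for illustration; the function name `treatment_cost_per_outcome_prevented` is likewise hypothetical.

```python
def treatment_cost_per_outcome_prevented(cost_of_treatment, risk_difference):
    """Total treatment cost needed to prevent one outcome (NNT x unit cost)."""
    nnt = 1 / risk_difference
    return nnt * cost_of_treatment


# Assumed costs, borrowed from the hypothetical example: P2000 and P5000.
cost_per_outcome = treatment_cost_per_outcome_prevented(2000, 0.0001)
cost_of_outcome = 5000

# The difference is "worth detecting" on cost grounds only if preventing
# one outcome costs no more than the outcome itself.
worth_it = cost_per_outcome <= cost_of_outcome
print(f"about P{cost_per_outcome:,.0f} per outcome prevented; worth it: {worth_it}")
```

Under these assumed costs, preventing one outcome would cost about P20 million in treatment against an outcome costing P5000, so a difference of 0.0001 would not be worth detecting on cost grounds.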

It is true that a larger sample size may be required to detect the small differences that matter in studies involving cost-effectiveness analysis. However, a preliminary costing analysis can be applied to determine a minimum sufficient sample size when designing a study. This provides a more measurable basis for determining what a “clinically significant difference” is.