Julie A Barber Department of
Medical Statistics and Evaluation, Imperial College School of Medicine,
London W12 0NN
Correspondence to: Ms
Barber j.barber{at}rpms.ac.uk
Objective To review critically the statistical
methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the
study.
With the continuing development of new treatments and medical
technologies, health economic evaluations have become increasingly important. To identify cost effective care, providers, purchasers, and
policy makers need reliable information about the costs as well as the
clinical effectiveness of alternative treatments. For clinical
outcomes, randomised controlled trials are the standard and accepted
approach for evaluating interventions. This design provides the most
scientifically rigorous methodology and avoids the biases which limit
the usefulness of alternative non-randomised designs.1
Pragmatic randomised controlled trials provide a suitable environment
not only for assessing clinical effectiveness but also for comparing
costs,2-4 and an increasingly large amount of economic
data is being collected within trials.5 6
The costs of competing treatments are usually estimated using
information about the quantities of resources used. The cost associated with a treatment may be estimated as a
deterministic (fixed) value by costing a typical treatment protocol. This approach requires assumptions about the usual quantities of
healthcare resources that would be used during treatment. For the
surgical procedure example, this would involve assumptions about the
grades of staff present during the operation, the typical time taken,
consumables used, and length of inpatient stay. Carrying out an
economic evaluation alongside a randomised controlled trial, however,
allows detailed information to be collected about the quantities of
resources used by each patient in the study: a record would be kept for
every patient of the actual staff present, time taken, consumables
used, and inpatient stay. Such information allows an estimate of the
cost of treatment to be obtained for each individual patient, producing
a set of cost values, which will be referred to as "patient
specific" cost data.
Availability of patient specific cost data not only allows the use of
statistical inference as a basis for drawing conclusions about costs
but reduces the extent to which the comparison between randomised
groups is based on assumptions about resource use. In addition it
allows the relation between costs and other factors such as patient
characteristics and clinical outcomes to be investigated.
In trials where patient specific cost data are available, the
comparison of costs between treatment groups is used to make inferences
about the true cost difference in the population from which the trial
sample was drawn. The evidence from the sample needs to be assessed
using statistical analysis. Although several reviews of economic
evaluations have been undertaken,5 7-15 to date none
has concentrated specifically on statistical aspects of the analysis of
patient specific cost data from randomised controlled trials. We
therefore focused on this issue, aiming to assess the use of
statistical methods in this context and whether the conclusions drawn
for costs are properly justified.
Selection of study articles
Abstract
Design Survey of published randomised trials
including an economic evaluation with cost values suitable for
statistical analysis; 45 such trials published in 1995 were identified
from Medline.
Main outcome measures The use of statistical methods
for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified.
Results Although all 45 trials reviewed apparently
had cost data for each patient, only 9 (20%) reported adequate
measures of variability for these data and only 25 (56%) gave results
of statistical tests or a measure of precision for the comparison of
costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented
in the paper. No paper reported sample size calculations for costs.
Conclusions The analysis and interpretation of cost
data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs
of alternative therapies have often been reported in the absence of
supporting statistical evidence. Improvements in the analysis and
reporting of health economic assessments are urgently required. Health
economic guidelines need to be revised to incorporate more detailed
statistical advice.
Introduction
Resource use refers to the set of
cost generating items which make up the treatment and its consequences.
For example, the resources used in a surgical operation may include the
staff time involved, the consumables used, and the length of a
subsequent inpatient stay. To estimate the cost of treatment, this
resource use information is combined with unit cost estimates, which
give a fixed monetary value to each cost generating item. The total
cost of treatment is then the weighted sum of the quantities of
resources used, where the weights are the unit costs.
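The weighted sum described above can be sketched in a few lines of Python. This is only an illustration: the resource items, unit costs, and patient records are hypothetical values chosen for the example and are not taken from any trial in the review.

```python
# Hypothetical unit costs: a fixed monetary value per unit of each resource.
UNIT_COSTS = {
    "surgeon_hours": 120.0,
    "nurse_hours": 40.0,
    "consumables": 1.0,      # consumables recorded directly in monetary units
    "inpatient_days": 250.0,
}


def patient_cost(resource_use, unit_costs=UNIT_COSTS):
    """Total cost = sum over items of (quantity used x unit cost)."""
    return sum(qty * unit_costs[item] for item, qty in resource_use.items())


# Hypothetical resource use recorded for two patients in a trial.
patients = [
    {"surgeon_hours": 2.0, "nurse_hours": 3.0, "consumables": 150.0, "inpatient_days": 4},
    {"surgeon_hours": 1.5, "nurse_hours": 2.0, "consumables": 90.0, "inpatient_days": 2},
]

# One cost value per patient: the "patient specific" cost data.
patient_specific_costs = [patient_cost(p) for p in patients]
print(patient_specific_costs)
```

Costing a typical protocol would amount to calling `patient_cost` once with assumed quantities, whereas a trial based evaluation applies it to each patient's recorded quantities.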
Methods
Published papers included in this review are those which reported
on randomised trials where patient specific cost data were available,
on which statistical methods were or could have been used. The search
was limited to publications in English, involving human subjects, and
published during 1995 and was carried out using the Medline database as
of April 1997. The search required at least one of "trial" or
"intervention(s)" and at least one of "health economic(s),"
"economic evaluation," or "cost(s)" in the title, abstract or
MeSH headings. The search identified 872 eligible articles.
Information collected
A data collection form was developed and was completed on reading
each article in the review. This included information about the
collection and calculation of costs, sample size calculations cited,
summary measures reported, and statistical methods used. The final part
of the assessment judged the appropriateness of any inferential
conclusions drawn about costs, given the statistical results presented
in the paper. These judgments did not involve consideration of design
issues or methods of analysis but were simply based on cost estimates
and any P values or confidence intervals reported.
Results
Description of papers
The 45 papers identified came from both specialist and more
general journals, and covered a wide variety of clinical areas
including cancer, heart disease, nursing, and psychiatry. About half
(24; 53%) were primary publications for the trial which usually
included both clinical and economic results. In many of these, the
economic component was rather small and lacking in detail. The
remaining papers (21; 47%) were "follow on" papers to the main
effectiveness analyses, which reported cost results either alone or in
combination with other outcomes of interest, such as quality of life.
The vast majority of the studies were designed as pragmatic trials,
directly relevant to clinical practice; the economic analysis thus had
direct policy implications.
Sample size calculations
Sample size calculations were mentioned in only seven (16%) of
the 45 articles in the review. None were for economic outcomes; six
were based on clinical endpoints, and in the remaining case it was
unclear which outcomes were being considered. In the case of health
economic assessments published separately from the main effectiveness
analyses, sample size calculations for clinical outcomes may have been
reported elsewhere.
Descriptive statistics
One trial in the review, which compared four three day
antimicrobial regimens for treatment of acute cystitis, found mean
costs (US$) per patient of $114 for patients treated with
trimethoprim-sulphamethoxazole, $131 for amoxicillin, $155 for
nitrofurantoin, and $155 for cefadroxil.16 No information on the variability or range of costs per patient was given, so it is
impossible to judge to what extent the average presented was typical
for the patients studied. In a trial of whether to re-evaluate patients
receiving oxygen at home at intervals of two months or six months, the
mean cost and standard deviation over one year were presented for each
group in the trial.17 For example, in the six month
re-evaluation group the standard deviation was larger than the mean
($11 580 and $8870 respectively), indicating a very wide dispersion of
costs between individuals. This information helps to put the mean costs
observed into perspective.
The most informative measure for cost data is the arithmetic mean, that is, the simple
average cost. This is because policy makers, purchasers, and providers
need to know the total cost of implementing the treatment. This total
cost is estimated as the arithmetic mean cost in the trial, multiplied
by the number of patients to be treated. Measures other than the
arithmetic mean (such as the median, mode, or geometric mean) cannot
provide an estimate of total cost. The fact that the distribution of
costs is often highly skewed does not imply that the use of the
arithmetic mean is inappropriate. However, describing the variability
in costs between individuals in the trial, and any peculiarities in the
shape of the distribution such as skewness, is also important.
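A small numerical sketch may make the point about total cost concrete. The cost values below are hypothetical and deliberately skewed, with one patient incurring a very high cost. The example only shows that the arithmetic mean, scaled by the number of patients, recovers the total, whereas the median does not.

```python
from statistics import mean, median

# Hypothetical, skewed cost data for eight patients in one trial arm.
costs = [200, 250, 300, 350, 400, 450, 500, 5550]
n = len(costs)

total_cost = sum(costs)
mean_cost = mean(costs)
median_cost = median(costs)

# The arithmetic mean multiplied by n reproduces the total cost exactly.
print("total:", total_cost)
print("mean x n:", mean_cost * n)

# The median multiplied by n understates the total when costs are right skewed.
print("median x n:", median_cost * n)

# Projected budget for a hypothetical 1000 patients to be treated.
planned_patients = 1000
print("projection from mean:", mean_cost * planned_patients)
print("projection from median:", median_cost * planned_patients)
```

With right skewed costs, a budget projected from the median would fall well short of the cost actually incurred.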
The figure shows the percentage of all the papers reviewed reporting
various summary measures for the cost data in each randomised group.
Overall 42 papers (93%) reported measures of location, which were
given as arithmetic mean or total costs in all but two articles. Six
papers reported other measures of location along with the mean, five
giving medians and one presenting modes in each group.
Inferential statistics
Inferences made about costs need to be supported by a measure of
precision (standard error or confidence interval) of the difference in
mean costs between randomised groups, or at least a P value. For
example, a study of induction of labour versus serial antenatal
monitoring reported that the mean cost (Canadian $) in the monitoring
group was higher by $193 (95% confidence interval $133 to $252,
P<0.0001).18 In contrast, a study of midwife team versus
routine care during pregnancy and birth simply reported that the
average cost (Australian $) per delivery was "$3324 for team care
women and $3475 for routine care women, resulting in a saving of $151,
or a 4.5% reduction in costs."19 In the latter example,
no inference is justified since the precision of these cited quantities
is unknown.
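As an illustration of the kind of result that supports an inference, the sketch below computes the difference in arithmetic mean costs between two groups with a confidence interval and P value. It uses a large sample normal approximation with unequal variances, which is an assumption of this example and not a method prescribed by the review. The cost values are hypothetical, and with samples this small the approximation would be unreliable in practice.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_cost_difference(costs_a, costs_b, level=0.95):
    """Difference in mean costs (a - b), confidence interval, two sided P value.

    Assumes a normal approximation for the difference in means.
    """
    diff = mean(costs_a) - mean(costs_b)
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(variance(costs_a) / len(costs_a) + variance(costs_b) / len(costs_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, (diff - z * se, diff + z * se), p_value


# Hypothetical patient specific costs in two randomised groups.
group_a = [3100, 2900, 3600, 3300, 4100, 2800, 3500, 3200]
group_b = [3400, 3000, 3900, 3700, 5200, 2900, 3600, 3300]

diff, (lower, upper), p = mean_cost_difference(group_a, group_b)
print(f"difference {diff:.0f}, 95% CI {lower:.0f} to {upper:.0f}, P = {p:.3f}")
```

Reporting only the two group means, as in the midwife team example, corresponds to printing `diff` alone and discarding the interval and P value.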
Justification for conclusions
For the study of induction of labour versus serial antenatal
monitoring mentioned at the beginning of the previous section, the
authors concluded in the abstract that "a policy of managing
post-term pregnancy through induction of
labour...results in lower cost."18
This is an inferential conclusion that could be extrapolated from the
trial results to future policy, and it is justified in terms of the
confidence interval and P value for the mean cost difference presented.
The trial of midwife team care versus routine care concluded that
"the team approach...was associated with a
reduction in costs per woman."19 This would also be
likely to be interpreted as an inferential statement by readers.
However, it was based simply on a comparison of mean costs, without any
information on the precision of the mean cost difference observed. It
is not a justified conclusion.
Missing data, cost effectiveness, and sensitivity analyses
Information concerning the completeness of the cost data was given
for only 24 studies (53%). Of these, three mentioned that their data
were complete and 21 stated that some data were missing, the amount
ranging up to 35% of the sample. Eleven papers apparently excluded
subjects with missing cost data from the analysis without any further
investigation. Five others compared characteristics of this group of
patients with those whose data were complete, in order to identify any
obvious biases. Four further papers dealt with missing data in other
ways: one used a sensitivity analysis, another imputed values, and two
used longitudinal analyses which do not require the data to be complete at all time points.
Some papers also reported cost effectiveness ratios, for example, cost per quality adjusted life year, cost per year of life gained, or cost per unit change in some clinical measurement. None of these papers carried out statistical tests for the
cost effectiveness estimates or used confidence intervals to report on
their precision. Two, however, used the confidence intervals of the
effects, and in one case costs, to consider extreme cases of the cost
effectiveness ratio.
Only 11 (24%) of the 45 studies reported having carried out
sensitivity analyses, and in five cases these were for the cost effectiveness results. The sensitivity analyses investigated robustness to various assumptions including unit costs, cost to charge ratios, assumed resource use values, and discount rates.
Discussion
Randomised controlled trials are not always the appropriate vehicle to address economic questions,20 21 and there is an important role for other methods of economic evaluation, such as modelling.22 When economic evaluations are carried out alongside randomised controlled trials, however, the cost data collected should be interpreted appropriately. This review has revealed major deficiencies in the way cost data in randomised controlled trials are summarised and analysed.
Descriptive statistics
In providing descriptive information for continuous data, such as
costs, recommended practice23 would be to present a
measure of location (for example, mean or median) and variability (for
example, standard deviation or interquartile range) and mention any
peculiarities about the shape of the distribution (such as skewness).
Cost data are typically highly skewed, because a few patients incur
particularly high costs. The arithmetic mean is then larger than the
median, sometimes substantially, because it is more influenced by these
high costs. Although the median can be interpreted as the most
"typical" cost for individual subjects, since half of them have
costs below this value and half above, it is the arithmetic mean cost
that is important for policy decisions. It is only the arithmetic
mean (not other measures such as the median, mode, or geometric
mean) that, when multiplied by the number of patients to be treated,
estimates the total cost that would be incurred if the treatment were
implemented. Although these other measures are commonly used for skewed
data in other circumstances, the more informative arithmetic mean
should always be reported for costs. This was done in nearly all the
papers in our review, but statistical comparisons often used methods
that did not directly compare these arithmetic means.
Inferential statistics
The interpretation of patient specific cost data in randomised
controlled trials needs to be guided by formal methods of statistical
inference, but only half of the papers reviewed presented a P value or
confidence interval for cost comparisons. Conclusions regarding the
evidence about cost differences cannot reliably be made without such
statistical analysis. Among the papers that used statistical analysis,
half used inappropriate methods (such as the non-parametric
Mann-Whitney U test, or analysis of log transformed costs) that do not
compare arithmetic mean costs. Only 11% of the papers presented a
confidence interval for the average cost difference, although the use
of confidence intervals has repeatedly been recommended in statistical
guidelines.23 24
Sample size calculations
The often large variability in costs between individuals
emphasises the need to perform economic evaluations on sufficiently
large samples so that precise conclusions can be drawn. The rationale
for sample size calculations (having adequate power for the planned
analyses and having a predetermined stopping point) is as relevant to
cost outcomes as to clinical outcomes. Although cost outcomes are often
regarded as "secondary," they are still important. There may be
practical reasons to base the health economic evaluation on a subset of
the whole trial, but statistical justification is lacking. The use of
subsets and the complete absence of sample size calculations
reported for costs in this review indicate the large scope for
improvement in the rational planning of economic evaluations.
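To show what a sample size calculation for a cost outcome might look like, the sketch below uses the standard normal approximation formula for comparing two means with equal group sizes. The choice of formula, the 5% significance level, the 80% power, and the $3000 target difference are assumptions made for illustration only. The standard deviation of $11 580 is borrowed from the home oxygen trial described in the Results simply to give a realistic order of magnitude.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Patients needed per group to detect a mean cost difference `delta`.

    Normal approximation, equal group sizes, common standard deviation `sd`:
        n = 2 * (z_(1 - alpha/2) + z_power)^2 * sd^2 / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)


sd_cost = 11580          # standard deviation of cost per patient
target_difference = 3000  # hypothetical difference in mean cost worth detecting

print(n_per_group(sd_cost, target_difference))
# Halving the target difference roughly quadruples the required sample size.
print(n_per_group(sd_cost, target_difference / 2))
```

Because the required number grows with the square of the standard deviation, highly variable cost data can demand larger samples than the clinical endpoint on which a trial was originally sized.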
Completeness and relevance of the review
The review was based on papers published in 1995 accessed through
Medline. Limiting the search to journals on a single database means
that this may not be an exhaustive review of all relevant papers. The
reporting standards of journals cited by Medline, however, are likely
to be better than those of non-Medline journals, therefore producing an
overly optimistic view of the use of statistical methods in economic
evaluations. The results of a similar search using the Cochrane
Controlled Trials Register included 43 of the 45 papers in this review
(the other two were both follow on papers to a main clinical
effectiveness publication and in both cases only the clinical paper
appeared in the Cochrane register). The Medline search may not have
identified absolutely all randomised controlled trials with patient
specific costs.26 Some trials were excluded from the
review because it was not clear from their methods whether patient
specific cost data had been collected; however, these trials presented
no measures of variability or statistical inferences for costs.
Statistical complexities
The statistical issues in analysing cost data are not, however,
all straightforward,31 in particular how to compare
arithmetic mean costs in very skewed data. Standard methods for
analysing arithmetic means such as the t test are known
to be fairly robust to non-normality. This robustness, however, depends on several features of the data, in particular sample size and severity
of skewness. There are no set criteria by which to judge whether the
analysis will be robust for a particular dataset, and relying on
standard methods could produce misleading results, especially if sample
sizes are small. Extending simple comparisons to adjust for baseline
variables may exacerbate the problems. Both simple and more complex
analyses of costs can, however, be carried out or checked using
bootstrapping.32 This approach allows a comparison of
arithmetic means without making any assumptions about the cost
distribution. Although some examples of the use of bootstrapping for
cost data have recently been published,33 this method is
not yet routinely used by medical researchers.
Conclusion
This review has shown that there is an urgent need to improve the
statistical analysis and interpretation of cost data in randomised
controlled trials. The BMJ guidelines and other health economics
guidelines need to be revised to incorporate more detailed statistical
advice for researchers, editors, and reviewers when dealing with
patient specific cost data from trials. These guidelines not only need
to encourage the use of statistical inference but need to provide
advice on dealing with some of the more complex issues mentioned
above.
Acknowledgments
Contributors: JB and ST were both involved in developing the ideas and methods for the paper, and in writing the text, and are guarantors of the paper. JB took the principal role in extracting the relevant information from the papers in the review and summarising the results.
Funding: JB was funded by North Thames NHS Executive; ST was funded by HEFC London University.
Competing interests: None declared.
(Accepted 8 October 1998)