Gulf war illness—better, worse, or just the same? A cohort study

BMJ 2003; 327 doi: (Published 11 December 2003) Cite this as: BMJ 2003;327:1370
  1. Matthew Hotopf, reader (m.hotopf{at},
  2. Anthony S David, professor1,
  3. Lisa Hull, research assistant1,
  4. Vasilis Nikalaou, statistician1,
  5. Catherine Unwin, study coordinator1,
  6. Simon Wessely, professor1
  1. 1Gulf War Illnesses Research Unit, Department of Psychological Medicine, Guy's, King's, and St Thomas's School of Medicine, London SE5 8AZ
  1. Correspondence to: M Hotopf


    Objectives Firstly, to describe changes in the health of Gulf war veterans studied in a previous occupational cohort study and to compare outcome with comparable non-deployed military personnel. Secondly, to determine whether differences in prevalence between Gulf veterans and controls at follow up can be explained by greater persistence or greater incidence of disorders.

    Design Occupational cohort study in the form of a postal survey.

    Participants Military personnel who served in the 1991 Persian Gulf war; personnel who served on peacekeeping duties to Bosnia; military personnel who were deployed elsewhere (“Era” controls). All participants had responded to a previous survey.

    Setting United Kingdom.

    Main outcome measures Self reported fatigue measured on the Chalder fatigue scale; psychological distress measured on the general health questionnaire; physical functioning and health perception measured on the SF-36; and a count of physical symptoms.

    Results Gulf war veterans experienced a modest reduction in prevalence of fatigue (48.8% at stage 1, 43.4% at stage 2) and psychological distress (40.0% stage 1, 37.1% stage 2) but a slight worsening of physical functioning on the SF-36 (90.3 stage 1, 88.7 stage 2). Compared with the other cohorts Gulf veterans continued to experience poorer health on all outcomes, although physical functioning also declined in Bosnia veterans. Era controls showed lower incidence of fatigue than Gulf veterans, and both comparison groups showed less persistence of fatigue than Gulf veterans.

    Conclusions Gulf war veterans remain a group with many symptoms of ill health. The excess of illness at follow up is explained by both higher incidence and greater persistence of symptoms.


    Consensus exists that service in the 1991 Persian Gulf war resulted in increased symptomatic ill health among those deployed.18 We know of no studies on the prognosis of symptoms among Gulf war veterans. In 1997 we studied a large random sample of members of the armed forces who served in the 1991 Gulf war,1 including those who had left the services. We compared the “Gulf cohort” with two military control cohorts. This study assesses the outcomes of these cohorts four years later. Our two main aims were, firstly, to compare the prevalence of various health outcomes over time and between cohorts, and, secondly, to determine rates of incidence and remission for clinically important fatigue and psychological distress after adjusting for potential confounders.



    Our original study consisted of three groups: personnel who served in the Persian Gulf war between 1 September 1990 and 30 June 1991 (the Gulf cohort); personnel who served on UN peacekeeping duties in Bosnia between 1 April 1992 and 6 February 1997 (the Bosnia cohort); and personnel who were serving in the armed forces on 1 January 1991 but who were not deployed to the Gulf (the “Era” cohort).1 We took a random sample of all Gulf veterans, with oversampling of women. Sampling of the other two cohorts was frequency matched in terms of sex, age, reservist status, officer status, service (Royal Navy, Army, or Royal Air Force), and a measure of fitness.

    Of 8196 participants who responded to the first survey 503 refused permission for future contact and 449 failed to complete the relevant section of the questionnaire. We used random stratified sampling to select respondents from stage 1 into the present study. All women were selected. We stratified the sampling on the severity of fatigue at stage 1. The selection process included all male veterans with a fatigue score greater than 8 (511 Gulf, 115 Bosnia, and 120 Era); for Gulf, a 50% sample of veterans with fatigue scores of 4-8 (484 veterans), along with all those in Bosnia (n = 333) and Era (n = 364) who scored in this range; and an approximately one in eight sample of veterans with fatigue scores less than 4 in order to represent asymptomatic individuals (n = 250 in each group).
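
    Because each fatigue stratum was sampled at a different rate, a respondent in a lightly sampled stratum stands for more stage 1 participants than one in a fully sampled stratum. The following is a minimal sketch of how inverse probability sampling weights could be derived from the sampling fractions stated above for male Gulf veterans. The function names, stratum labels, and toy records are hypothetical, the one in eight fraction is approximate, and the paper does not report the weights actually used.

```python
from fractions import Fraction

# Sampling fractions for male Gulf veterans as described in the text.
# The "low" fraction is only approximately one in eight.
GULF_MALE_SAMPLING_FRACTIONS = {
    "high": Fraction(1, 1),    # fatigue score greater than 8: all selected
    "middle": Fraction(1, 2),  # fatigue score 4-8: 50% sample
    "low": Fraction(1, 8),     # fatigue score less than 4: about one in eight
}


def fatigue_stratum(score):
    """Map a stage 1 fatigue score to its sampling stratum (hypothetical labels)."""
    if score > 8:
        return "high"
    if score >= 4:
        return "middle"
    return "low"


def sampling_weight(score, fractions=GULF_MALE_SAMPLING_FRACTIONS):
    """Inverse probability weight: 1 / probability of selection."""
    return 1 / fractions[fatigue_stratum(score)]


def weighted_prevalence(records, fractions=GULF_MALE_SAMPLING_FRACTIONS):
    """Weighted proportion of cases.

    records: iterable of (stage1_fatigue_score, is_case) tuples.
    """
    total = Fraction(0)
    cases = Fraction(0)
    for score, is_case in records:
        weight = sampling_weight(score, fractions)
        total += weight
        if is_case:
            cases += weight
    if total == 0:
        raise ValueError("no records supplied")
    return cases / total
```

    In this sketch a respondent from the lowest stratum counts eight times as much as one from the highest stratum, which is what keeps the oversampling of fatigued veterans from inflating prevalence estimates.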

    Mailing method

    We used three mailings. To trace non-responders we used the NHS central registry to obtain health authority ciphers and current addresses. We used the online electoral registry “Cameo” to check addresses. Service pension and discharge sources supplied updated addresses. We sent the second and third mailings via commanding officers, asking for their help in disseminating the questionnaires on our behalf. Following an agreement with the War Pensions Agency, the UK Department of Social Security sent two further mailings. In order to comply with data protection regulation, we were not informed which addresses the Department of Social Security had on their records.

    Questionnaire and outcomes

    The questionnaire included a fatigue scale9; the 12 item general health questionnaire (a screening questionnaire for common mental disorders)10; the SF-36 instrument for physical health and functional capacity 11-13; and a list of 50 common symptoms. We defined cases of fatigue as having a score on the fatigue scale of greater than 3 and cases of psychological distress as having a score greater than 2 on the general health questionnaire. We defined cases of “stress reaction” from a checklist of symptoms described in previous work.1
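
    The two case definitions above are simple thresholds on questionnaire scores. As an illustration only, they could be written as follows; the constant and function names are hypothetical, and the sketch assumes that each score has already been computed according to the scoring rules of its instrument.

```python
# Thresholds taken from the case definitions in the text.
FATIGUE_CASE_THRESHOLD = 3  # case if fatigue scale score is greater than 3
GHQ_CASE_THRESHOLD = 2      # case if general health questionnaire score is greater than 2


def is_fatigue_case(fatigue_score):
    """True if the fatigue scale score meets the case definition."""
    return fatigue_score > FATIGUE_CASE_THRESHOLD


def is_ghq_case(ghq_score):
    """True if the general health questionnaire score meets the case definition."""
    return ghq_score > GHQ_CASE_THRESHOLD
```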

    Statistical analyses

    Response bias: We defined four groups: responders, “refusers,” “returns to sender” (where questionnaires were returned to us), and “no information” (where no reply was forthcoming). We compared the frequency of these four outcomes by cohort. We determined whether response bias was present by comparing demographic variables and stage 1 health outcomes across cohorts, using Scheffé's test.14

    Follow up health outcomes: To take account of the sampling strategy all analyses used sampling weights and robust standard errors by using the appropriate commands in Stata (StataCorp, College Station, TX). We calculated the prevalence of binary outcome variables and present these in relation to baseline scores. For binary outcomes we present the matched odds ratio, which is the ratio of incident cases (non-cases at stage 1 who were cases at stage 2) to recovered cases (cases at stage 1 who were non-cases at stage 2) for each outcome. For continuous outcomes we present stage 1 and 2 scores and mean differences with 95% confidence intervals.
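
    To make the matched odds ratio concrete, the sketch below counts the two kinds of discordant pairs among respondents measured at both stages and divides one by the other. It is an unweighted illustration with hypothetical names and invented toy data. The analyses reported here used sampling weights and robust standard errors in Stata, which this sketch does not reproduce.

```python
def matched_odds_ratio(pairs):
    """Ratio of incident to recovered cases among paired observations.

    pairs: iterable of (case_at_stage1, case_at_stage2) booleans,
    one tuple per respondent.
    """
    incident = 0   # non-case at stage 1, case at stage 2
    recovered = 0  # case at stage 1, non-case at stage 2
    for stage1, stage2 in pairs:
        if not stage1 and stage2:
            incident += 1
        elif stage1 and not stage2:
            recovered += 1
    if recovered == 0:
        raise ValueError("matched odds ratio is undefined with no recovered cases")
    return incident / recovered


# Invented toy data: 3 incident, 6 recovered, 15 concordant respondents.
toy_pairs = (
    [(False, True)] * 3
    + [(True, False)] * 6
    + [(True, True)] * 5
    + [(False, False)] * 10
)
ratio = matched_odds_ratio(toy_pairs)
```

    A ratio below 1 means that more respondents recovered than became new cases, so prevalence fell between the stages; a ratio above 1 means the reverse. Respondents whose status did not change do not contribute.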


    Response rates

    The response rate for those eligible to receive a questionnaire was 71.6% (table 1). The response rate was higher in the Gulf cohort than in the other two cohorts (P = 0.03). A similar pattern emerged in terms of types of non-respondents in Gulf and Bosnia, but the Era group had a higher proportion of refusers than the other two cohorts. Response rates were lower in men, younger participants, and those who were unmarried. Non-responders rated their health as poorer at stage 1 for physical disability and general health perception but were less likely to have been cases on the general health questionnaire.

    Table 1

    Characteristics of responders and non-responders in a cohort study among Gulf war veterans. Values are numbers (percentages) unless otherwise indicated

    Table 2 shows the sociodemographic characteristics of the participants and indicates that the Gulf and Era cohorts were broadly similar. The Bosnia group were younger, less likely to be married, more likely to have remained in service, and only from the Army.

    Table 2

    Characteristics of the three cohorts. Values are numbers (percentages) unless otherwise indicated

    Table 3 shows the prevalence of categorical outcomes at stages 1 and 2. We report the prevalence of stage 1 outcomes within the sample studied at stage 2, not for the entire cohort; hence the prevalence figures we report for stage 1 are similar, but not identical, to those shown in our previous paper.1 Table 3 shows that the Gulf cohort had higher rates of the disorders under study than the other two cohorts, and this difference was maintained between stages 1 and 2. For Gulf we found a modest reduction in the prevalence of fatigue, post-traumatic stress reaction, general health questionnaire cases, and self reported “Gulf war syndrome.” For Bosnia and Era we found no changes other than an increase in the prevalence of post-traumatic stress reaction in the Era group, which was not significant (P > 0.05).

    Table 3

    Prevalence of categorical outcomes in the three cohorts. Values are percentages (95% confidence intervals) unless otherwise indicated

    Table 4 shows the scores on continuous measures at each time point for the three cohorts. The Gulf cohort was less healthy than the other two cohorts at both stages. A decline in physical functioning affected each of the three cohorts (non-significant for Era). Health perception declined for both Bosnia and Era but not for Gulf. The Gulf veterans showed a modest reduction in fatigue scores and non-significant but small reductions in general health questionnaire scores and total symptoms. The other two cohorts showed a general tendency to experience more symptoms over time; six changes were significant (P < 0.05).

    Table 4

    Scores (95% confidence intervals) for continuous measures by cohort and stage

    Differences in prevalence between cohorts could have been explained by either higher incidence or greater persistence of symptoms, and table 5 explores this. The incidence risk for fatigue and general health questionnaire caseness was lower in Era than in the other two cohorts. Controlling for stage 1 sociodemographic variables reduced the differences, but the Era group remained less likely to experience new fatigue than the Gulf group. The Gulf group were more likely to experience persistent fatigue than the Era and Bosnia cohorts, an effect that remained significant after controlling for potential confounders (P = 0.009).
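
    The distinction between incidence and persistence can be sketched from the same paired stage 1 and stage 2 case status. Incidence is taken here as the proportion of stage 1 non-cases who were cases at stage 2, and persistence as the proportion of stage 1 cases who were still cases at stage 2. The function name and toy data are hypothetical, and the sketch is crude: it applies no sampling weights and no adjustment for confounders, both of which the reported analyses used.

```python
def incidence_and_persistence(pairs):
    """Return (incidence, persistence) as proportions.

    pairs: iterable of (case_at_stage1, case_at_stage2) booleans.
    A proportion is None when its denominator is empty.
    """
    stage2_among_non_cases = []  # stage 2 status of those at risk at stage 1
    stage2_among_cases = []      # stage 2 status of stage 1 cases
    for stage1, stage2 in pairs:
        if stage1:
            stage2_among_cases.append(bool(stage2))
        else:
            stage2_among_non_cases.append(bool(stage2))

    incidence = (
        sum(stage2_among_non_cases) / len(stage2_among_non_cases)
        if stage2_among_non_cases
        else None
    )
    persistence = (
        sum(stage2_among_cases) / len(stage2_among_cases)
        if stage2_among_cases
        else None
    )
    return incidence, persistence


# Invented toy cohort: 10 non-cases at stage 1 (2 become cases),
# 4 cases at stage 1 (3 remain cases).
toy_cohort = (
    [(False, True)] * 2
    + [(False, False)] * 8
    + [(True, True)] * 3
    + [(True, False)] * 1
)
incidence, persistence = incidence_and_persistence(toy_cohort)
```

    Two cohorts with the same incidence can still differ in prevalence at follow up if cases persist longer in one of them, which is why both quantities are compared across cohorts.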

    Table 5

    Incidence and persistence of outcomes. Values are presented with 95% confidence intervals


    Main findings

    Gulf war veterans continue to experience symptoms that are considerably worse than would be expected in an equivalent cohort of military personnel. However, Gulf war veterans are not deteriorating and do not have a higher incidence of new illnesses.

    Our study had two main aims. Firstly, we wanted to describe the outcome of Gulf war veterans three to four years after we first surveyed them. Our results show a disappointing stability in the prevalence of the main disorders we studied. Although the prevalence of the symptom based disorders lessened for Gulf veterans, physical functioning and health perception measured on the SF-36 barely changed. The reduced physical functioning may have been due to increasing age. The two comparison groups had some worsening in health on the SF-36 scales and more physical symptoms. This implies that some worsening of these health outcomes is expected over time, presumably due to advancing age. At each wave prevalence differed between the Gulf cohort and the two comparison groups, and, although the gap narrowed slightly, the Gulf veterans continued to experience poorer health on all measures.

    Our second aim was to examine whether the raised prevalence in Gulf veterans was explained by a greater incidence of disorders or more persistence. No clear pattern emerged. We found some evidence that the incidence of fatigue and caseness on the general health questionnaire was higher in the two deployed groups (Bosnia and Gulf) than in Era but that the difference for the general health questionnaire was explained by confounding. For fatigue we found evidence that the Gulf group continued to have greater incidence than the Era group, and this was not explained by confounding. For persistence we found a trend for Gulf veterans with fatigue to be more likely to remain fatigued at follow up.

    Limitations of the study

    We achieved a follow up rate of just over 70%, which leaves room for bias. Because follow up rates were worse in participants with poorer health at stage 1 we have probably slightly underestimated the prevalence of the disorders under study. The effect of this bias seems to have been similar across cohorts, which makes it unlikely that the non-response bias would have led to the Gulf group having still higher than expected prevalence figures at stage 1. We believe that the missing values are unlikely to have changed the main findings of this study. We measured health on a range of self report items, which are open to reporting biases.


    The nature of Gulf war illness remains ambiguous. If the illness represented the prodrome of a known disease (such as a neurological disorder), even with the passage of time, this has yet to declare itself.15 16 We think that the non-specific increase in symptoms reported by our and other studies is likely to remain poorly understood in terms of conventional biomedical diseases. This study did not have the statistical power to assess mortality in Gulf war veterans, and this was not our aim. Other studies in the United Kingdom and the United States, however, have failed to find higher than expected death rates.17 18 Finally, as time passes it becomes increasingly difficult to find causes of illnesses in veterans of the 1990-1 Gulf war. We suspect that different psychosocial, military and environmental risk factors may determine onset and recovery, and this is the topic of future research.

    What is already known on this topic

    Veterans of the 1990-1 Gulf war experience poorer health on most subjective outcomes than non-deployed military personnel

    No satisfactory follow up studies have assessed outcome of veterans of the Gulf war over more than one wave of data collection, so it is unclear whether veterans are getting worse, staying the same, or getting better

    What this study adds

    Gulf war veterans still have considerably poorer subjective health than appropriate military controls

    The health of Gulf war veterans has improved, but this improvement is relatively minor

    For comparison groups there has been a worsening of health on some outcomes, which is probably due to ageing

    The health gap between Gulf war veterans and comparison groups has therefore narrowed slightly

    Editorial by Clauw and p 1373


    We thank Nick Blatchley and Simon Satchell from the Gulf Veterans Illness Unit of the Ministry of Defence for assistance in tracking participants, Michael Dewey for statistical advice, and the participants for their patience in once again completing lengthy questionnaires.


    • Contributors MH, ASD, and SW designed the study. CU and LH were responsible for data collection under the supervision of SW, ASD, and MH. MH and VN performed the statistical analyses. MH wrote the paper and is the guarantor. All authors provided comments.

    • Funding US Department of Defense, UK Medical Research Council

    • Competing interests None declared

    • Ethical approval The study received approval from the relevant research ethics committees