Quantification of risk factors for herpes zoster: population based case-control study

Objectives To quantify the effects of possible risk factors for herpes zoster at different ages.
Design Case-control study.
Setting UK Clinical Practice Research Datalink primary care data.
Participants 144 959 adults diagnosed with zoster between 2000 and 2011; 549 336 age, sex, and practice matched controls.
Main outcome measures Conditional logistic regression was used to generate adjusted odds ratios to estimate the strength of association of each potential risk factor with zoster and assess effect modification by age.
Results The median age of the cases and controls was 62 years. Factors associated with increased risk of zoster included rheumatoid arthritis (3111 (2.1%) v 8029 (1.5%); adjusted odds ratio 1.46, 99% confidence interval 1.38 to 1.55), inflammatory bowel disease (1851 (1.3%) v 5118 (0.9%); 1.36, 1.26 to 1.46), chronic obstructive pulmonary disease (6815 (4.7%) v 20 201 (3.7%); 1.32, 1.27 to 1.37), asthma (10 243 (7.1%) v 31 865 (5.8%); 1.21, 1.17 to 1.25), chronic kidney disease (8724 (6.0%) v 29 437 (5.4%); 1.14, 1.09 to 1.18), and depression (6830 (4.7%) v 22 052 (4.0%); 1.15, 1.10 to 1.20). Type 1, but not type 2, diabetes showed some association with zoster (adjusted odds ratio 1.27, 1.07 to 1.50). The relative effects of many assessed risk factors were larger in younger patients. Patients with severely immunosuppressive conditions were at greatest risk of zoster, for example patients with lymphoma (adjusted odds ratio 3.90, 3.21 to 4.74) and myeloma (2.16, 1.84 to 2.53), who are not eligible for zoster vaccination.
Conclusions A range of conditions were associated with increased risk of zoster. In general, the increased risk was proportionally greater in younger age groups. Current vaccines are contraindicated in people at the greatest risk of zoster, highlighting the need for alternative risk reduction strategies in these groups.

To define diabetes we required a definite diabetes diagnosis, or a possible diabetes code [e.g. self-monitoring of blood glucose] with a subsequent diabetes-specific prescription [insulin or oral antidiabetics], or ≥2 diabetes drug prescriptions prior to the index date; gestational diabetes and drug-induced diabetes were excluded. Patients were categorised as having type 1 or type 2 diabetes where possible. Distinguishing between type 1 and type 2 diabetes is not always possible from diabetes codes, as patients are frequently given a non-specific code.
Furthermore, where the type of diabetes is recorded, it has been found to be unreliable. [1] We therefore chose not to use this information and instead used age at first diagnosis, age at first treatment and treatment received to classify diabetes type, as in previous Clinical Practice Research Datalink studies. [2] [3] Type 1 was assigned where age at first diagnosis was ≤35 years and treatment was exclusively insulin, or where patients received at least two insulin prescriptions at age ≤35 years but had no diabetes diagnosis. Type 2 was assigned where age at first diabetes diagnosis was >35 years, or where patients received exclusively oral antidiabetics at age >35 years. Patients with age at diagnosis >35 years who were treated exclusively with insulin, and those not fitting into any of these categories, were assigned as "Unknown type".
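The classification rules above can be sketched in code. This is a minimal illustration rather than the study's implementation: the `DiabetesHistory` fields and the function name are hypothetical, and the order in which the rules are checked is an assumption (the "insulin only, diagnosed at >35 years" exception is tested before the general type 2 rule so that it is not overridden). "Exclusively oral antidiabetics at age >35 years" is interpreted here as oral-only treatment first started after age 35.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DiabetesHistory:
    """Hypothetical per-patient summary; field names are illustrative only."""
    age_first_diagnosis: Optional[float]  # None if no diabetes diagnosis code
    age_first_treatment: Optional[float]  # None if never treated
    treatments: FrozenSet[str]            # subset of {"insulin", "oral"}
    insulin_rx_by_age_35: int = 0         # insulin prescriptions issued at age <= 35


def classify_diabetes_type(h: DiabetesHistory) -> str:
    insulin_only = h.treatments == frozenset({"insulin"})
    oral_only = h.treatments == frozenset({"oral"})
    dx = h.age_first_diagnosis

    # Diagnosed and treated exclusively with insulin: type 1 if diagnosed
    # at <= 35 years, otherwise left unclassified.
    if dx is not None and insulin_only:
        return "Type 1" if dx <= 35 else "Unknown type"
    # No diagnosis code, but at least two insulin prescriptions by age 35.
    if dx is None and h.insulin_rx_by_age_35 >= 2:
        return "Type 1"
    # Diagnosed after age 35 (and not insulin-only, handled above).
    if dx is not None and dx > 35:
        return "Type 2"
    # Oral antidiabetics only, first treated after age 35 (assumed reading).
    if oral_only and h.age_first_treatment is not None and h.age_first_treatment > 35:
        return "Type 2"
    return "Unknown type"
```

For example, a patient diagnosed at 58 and treated only with insulin falls into "Unknown type", whereas the same patient on oral antidiabetics only would be assigned type 2.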

Calculating duration of prescriptions in the Clinical Practice Research Datalink and identifying high-dose oral corticosteroids
In order to identify relevant prescriptions, it was necessary to calculate the duration of individual prescriptions prior to the index date.

Variables available
The Clinical Practice Research Datalink does not provide researchers with duration and dose for individual prescriptions. Instead, these must be generated using information from other variables.
On prescribing, general practitioners select the drug and can enter information on quantity (of tablets or inhalers prescribed), number of days, number of packs, pack type (number of tablets per pack or number of puffs per inhaler) and also enter free text. The free text field contains the actual prescribing information; in other words how many tablets, grams, milligrams or puffs the patient should take each day. To utilize this prescribing information, the Clinical Practice Research Datalink developed an algorithm to derive a numerical value from the free text and provide this to researchers; [4] this is referred to within the Clinical Practice Research Datalink as the numeric daily dose (NDD). The dose (in milligrams or micrograms) per tablet or puff is typically contained in the product name.
The completeness of each variable is described in table A1 below.

Data cleaning
We carried out a series of data-checking and data-cleaning tasks: checking the accuracy of NDD for the 500 most commonly occurring free-text entries; extracting data for pack type from other variables (e.g. the quantity variable); checking the clinical and referral records of 20 randomly selected patients to confirm that the duration and dose were consistent with the clinical picture; and excluding implausible values of quantity and NDD.
As NDD was not always available, it was necessary to impute missing data. A "hot-deck" style imputation method was adopted, which replaced missing data with comparable data from the same dataset. An algorithm was developed which reviewed each oral corticosteroid and other immunosuppressive therapy record and imputed missing values. An extra binary variable for quantity was created, categorising quantity about the median (42 tablets for steroids, 36 for other immunosuppressants) into low and high. Missing NDD values were then imputed using the first of the following rules that applied. If a patient had any other record with the same quantity and dose, the median NDD among those records was used. If not, but the patient had any other record with the same dose and the same quantity as a binary variable, the median NDD among those records was used. If a patient did not have a recorded NDD or quantity, but had records for the same dose, the median NDD among those records was used. If there was no record of NDD, dose or quantity, but there were other patients in the dataset in the same 5-year age band, of the same gender, with the same dose and quantity, the median NDD among those records was used. Finally, if none of the above were possible, the median NDD among records of patients in the dataset in the same 5-year age band, of the same gender, with the same dose and the same quantity as a binary variable was used.
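The hierarchy of imputation rules can be illustrated with a short sketch. It is not the algorithm used in the study: the record layout (dictionaries with the keys `patient`, `age_band`, `sex`, `dose`, `quantity` and `ndd`) and the function name are hypothetical, a quantity equal to the median is assumed to fall in the "low" category, and a level is skipped when the record lacks a value needed for matching. Records that no rule can fill are left as `None`.

```python
from statistics import median


def impute_ndd(records, quantity_median):
    """Return one NDD per record, filling gaps from matching donor records."""

    def quantity_is_high(r):
        # Binary quantity about the median; ties counted as low (assumption).
        return None if r["quantity"] is None else r["quantity"] > quantity_median

    # Matching keys, tried from most to least specific.
    levels = [
        lambda r: (r["patient"], r["dose"], r["quantity"]),
        lambda r: (r["patient"], r["dose"], quantity_is_high(r)),
        lambda r: (r["patient"], r["dose"]),
        lambda r: (r["age_band"], r["sex"], r["dose"], r["quantity"]),
        lambda r: (r["age_band"], r["sex"], r["dose"], quantity_is_high(r)),
    ]
    donors = [r for r in records if r["ndd"] is not None]

    imputed = []
    for r in records:
        value = r["ndd"]
        if value is None:
            for key in levels:
                target = key(r)
                if None in target:
                    continue  # cannot match on a missing field
                matches = [d["ndd"] for d in donors if key(d) == target]
                if matches:
                    value = median(matches)
                    break
        imputed.append(value)
    return imputed
```

Only observed NDD values act as donors, so an imputed value is never reused to fill another record.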
Pack type for inhaled corticosteroids was imputed using the most common pack type for the quantity and dose of each prescription. Where NDD was missing for inhaled corticosteroids, the median value of 4 puffs per day was used.

Calculating duration and dose
Using this information, we calculated the duration of each oral corticosteroid or other immunosuppressive therapy prescription as: total quantity of tablets prescribed / NDD. The duration of inhaled corticosteroid prescriptions was calculated as: (quantity × pack type) / NDD. Dose was calculated as: NDD × dose per tablet or puff.
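As a worked illustration of these three formulas, the sketch below uses hypothetical function names and example values that are not taken from the study data. Duration is in days, and dose is in the units of the per-tablet or per-puff strength.

```python
def oral_duration_days(quantity: float, ndd: float) -> float:
    """Duration of an oral prescription: tablets prescribed / tablets per day."""
    if ndd <= 0:
        raise ValueError("NDD must be positive")
    return quantity / ndd


def inhaled_duration_days(quantity: float, pack_type: float, ndd: float) -> float:
    """Duration of an inhaled prescription: (inhalers x puffs per inhaler) / puffs per day."""
    if ndd <= 0:
        raise ValueError("NDD must be positive")
    return (quantity * pack_type) / ndd


def daily_dose(ndd: float, dose_per_unit: float) -> float:
    """Daily dose: units per day x strength per tablet or puff."""
    return ndd * dose_per_unit


# Illustrative values only: 42 tablets of 5 mg taken 3 per day,
# and 2 inhalers of 200 puffs used at 4 puffs per day.
print(oral_duration_days(42, 3))         # 14.0 days
print(daily_dose(3, 5))                  # 15 mg per day
print(inhaled_duration_days(2, 200, 4))  # 100.0 days
```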
Algorithm to select smoking, alcohol and BMI status
Data were derived from medical Read codes and data from the additional details file. Read codes classifying patients by BMI category are very rarely recorded and were therefore not used. Where patients had multiple recordings, the nearest status in the period -1y to +1month from index was taken (best); if not available, then the nearest in the period +1month to +1y after index was taken (second best); if not available, then the nearest before -1y from index was taken (third best); if not available, then the nearest after +1y from index was taken (fourth best).
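The priority ordering of time windows can be expressed as a short sketch. The function name and input layout are hypothetical, and a year and a month are approximated as 365 and 30 days, which is an assumption rather than something stated in the text. Each recording is given as a pair of days from the index date (negative values are before index) and the recorded status.

```python
def select_status(recordings):
    """Pick the recording nearest to index from the best available window."""
    windows = [
        lambda d: -365 <= d <= 30,  # best: -1y to +1 month
        lambda d: 30 < d <= 365,    # second best: +1 month to +1y
        lambda d: d < -365,         # third best: more than 1y before index
        lambda d: d > 365,          # fourth best: more than 1y after index
    ]
    for in_window in windows:
        candidates = [(abs(days), status) for days, status in recordings if in_window(days)]
        if candidates:
            # Nearest to the index date within this window.
            return min(candidates, key=lambda c: c[0])[1]
    return None  # no recording at all: status is missing


# A recording 200 days after index is preferred over one 400 days before it.
print(select_status([(-400, "current smoker"), (200, "ex-smoker")]))  # ex-smoker
```

Patients for whom the function returns `None` have no recorded status in any window.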

B: Dealing with missing data
We used multiple imputation to account for missing data. Missing data were present for alcohol and smoking. In total, 89% of patients had complete data for all variables. To maximise the use of the data while properly incorporating the extra uncertainty arising from missing data, multiple imputation by chained equations [5] was used to impute missing values for alcohol and smoking from multinomial models. The imputation model included all covariates from the main outcome model, together with the matching variables age and sex. We also included extra comorbidities, identified using medical Read codes, as additional markers of alcohol or smoking related diseases. These included: stroke, peripheral artery disease, angina (stable and unstable), acute coronary syndrome, congestive heart failure, myocardial infarction, hypertension, alcoholic liver disease (including portal hypertension) and pancreatitis. Five imputed datasets were created and combined for analysis.
Distributions of imputed values were visually checked for comparability with the observed data.
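The text does not state how estimates from the five imputed datasets were combined. A common approach is Rubin's rules, and the sketch below assumes that approach purely for illustration; the function name and the example values are hypothetical. For odds ratios, pooling would usually be done on the log scale, which is also an assumption here.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool per-dataset estimates (e.g. log odds ratios) using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching estimates and standard errors from >= 2 datasets")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average within-imputation variance
    between = variance(estimates)                       # between-imputation (sample) variance
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Illustrative log odds ratios from five imputed datasets (not study results).
estimate, se = pool_rubin([0.30, 0.32, 0.28, 0.31, 0.29], [0.02] * 5)
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ between imputations, which is how the extra uncertainty due to missing data is carried into the final confidence interval.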

Rationale
The definition of exposure to oral corticosteroids and other immunosuppressive therapy in the main analysis was derived from guidelines on zoster vaccine contraindications (a 14-day course of high-dose oral corticosteroids or other immunosuppressive therapies, within the month prior to index date). The vaccine contraindications suggest patients remain immunosuppressed for one month following the end of their prescription. However, we acknowledge that this definition may not capture all patients with immunosuppression due to these medications.
Our sensitivity analysis therefore defined exposure as anyone taking an oral corticosteroid or other immunosuppressant within 3 months prior to the index date, and placed no restrictions on duration or dose of prescription.

Results
A much higher number of patients were defined as exposed to immunosuppressive therapy using this broader criterion (table C1). The overall effect of oral corticosteroids and immunosuppressive therapies was slightly lower when using the 3-month definition than when using the vaccine contraindication definition; however, the confidence intervals overlapped (table C1). There were no major differences in the effect of our main risk factors after adjusting for the broader definition of exposure to immunosuppressive drugs, compared to the main analyses (table C2).

D: Association of various risk factors with zoster, after additionally adjusting for patient-level socioeconomic status

Rationale
In the main analyses, patients were matched on practice, and the analyses thereby controlled for practice-level socioeconomic status. For patients registered at English practices who agreed to their medical records being linked to other datasets, a patient-level socioeconomic status score is available. Socioeconomic status (at the patient and practice level) is captured using quintiles of the Index of Multiple Deprivation (IMD) score. At the patient level, the patient's home postcode is mapped at the lower layer super output area level to the corresponding 2007 IMD score; a low quintile represents the least deprived.

Results
In total, 427,689 (61.6%) patients had a patient-level socioeconomic status score. The results from our sensitivity analysis which additionally adjusts for patient-level socioeconomic status are shown in Table D1. There were no major differences compared to the main analyses.

Rationale
In the main analyses we only included "active" controls, by ensuring controls had at least one consultation at any time within an 18 month period around the index date. In these sensitivity analyses, we applied different definitions of "active" controls. First, we required controls to have a consultation at any time from 1 year before to 2 years after the index date. Second, we required controls to have a consultation in the three years prior to the index date.

Rationale
We explored how frequently patients consulted the general practitioner, because frequent consultation may introduce ascertainment bias (i.e. patients visiting their general practitioner more frequently may be more likely to receive a zoster diagnosis). We calculated the mean yearly consultation rate prior to index date (by dividing the total number of face-to-face or telephone consultations during follow-up by the total years of follow-up prior to index date) among patients with our risk factors of interest. We compared this to the mean consultation rate for epilepsy, to assess whether epilepsy patients had a similar likelihood of being diagnosed with zoster.
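The rate calculation can be sketched as follows. The function name and input layout are hypothetical, and the sketch assumes the rate is pooled across all patients in a group (total consultations divided by total years of follow-up) rather than averaged over per-patient rates, which the text leaves open.

```python
def yearly_consultation_rate(patients):
    """patients: iterable of (consultations, years of follow-up before index)."""
    patients = list(patients)
    total_consultations = sum(n for n, _ in patients)
    total_years = sum(years for _, years in patients)
    if total_years <= 0:
        raise ValueError("total follow-up must be positive")
    return total_consultations / total_years


# Illustrative group of two patients: 12 consultations over 2 years
# and 30 consultations over 4 years.
print(yearly_consultation_rate([(12, 2.0), (30, 4.0)]))  # 7.0 per year
```

The same function would be applied to each risk factor group and to the epilepsy comparison group so that the rates can be compared directly.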

Results
The results are shown in Table F1.