Parameter estimates for trial arm, time, and their interaction, per protocol analysis
| Outcome measure | Complete case cohort (n=633), trial arm*: estimate (SE) | Complete case cohort, trial arm*: P | Complete case cohort, time†: estimate (SE) | Complete case cohort, time†: P | Complete case cohort, time×trial arm: estimate (SE) | Complete case cohort, time×trial arm: P | Available case cohort (n=1108), trial arm*: estimate (SE) | Available case cohort, trial arm*: P | Available case cohort, time†: estimate (SE) | Available case cohort, time†: P | Available case cohort, time×trial arm: estimate (SE) | Available case cohort, time×trial arm: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PCS (US 1998 NBS) scale | −1.17 (3.82) | 0.761 | +0.01 (0.43) | 0.975 | +2.55 (3.58) | 0.476 | +1.04 (2.66) | 0.696 | −0.04 (0.40) | 0.926 | −0.02 (2.96) | 0.994 |
| MCS (US 1998 NBS) scale | +1.74 (4.91) | 0.724 | −0.87 (0.53) | 0.105 | −10.88 (4.37) | 0.013 | +4.54 (3.38) | 0.179 | −0.63 (0.51) | 0.217 | −7.07 (3.73) | 0.058 |
| EQ-5D scale | −0.12 (0.12) | 0.316 | +0.02 (0.02) | 0.323 | +0.14 (0.12) | 0.264 | −0.05 (0.09) | 0.573 | +0.01 (0.01) | 0.357 | +0.08 (0.10) | 0.445 |
| Brief STAI scale | −1.93 (1.76) | 0.273 | −0.15 (0.21) | 0.463 | +3.10 (1.70) | 0.129 | −1.29 (1.28) | 0.315 | −0.11 (0.19) | 0.564 | +1.62 (1.43) | 0.258 |
| CESD-10 scale | −2.63 (2.51) | 0.294 | −0.02 (0.27) | 0.945 | +6.41 (2.37) | 0.007 | −2.53 (1.73) | 0.145 | −0.12 (0.26) | 0.636 | +3.65 (1.96) | 0.062 |
PCS=physical component score; MCS=mental component score; NBS=norms based scoring; SE=standard error.
Data are based on multilevel models controlling for baseline outcome score, all covariates, and intraclass correlation.

No specific hypotheses were made about the effect of telehealth on particular outcomes at particular time points; therefore, any investigation of time×trial arm interaction terms must be considered exploratory (hypothesis generating) rather than confirmatory (hypothesis testing). The value afforded to such findings when drawing inferences must be weighted accordingly. Moreover, sensitivity analyses across multiple outcomes, cohorts, analytical approaches (intention to treat v per protocol), and parameters (trial arm, time, trial arm×time) lead to the reporting of 60 significance tests (tables 2 and 3). At the stated α level of 0.05, we would expect three of these to be significant by chance alone, while reducing α to 0.01 would render one of the two significant interaction terms in table 3 (complete case cohort) non-significant. The lack of significant interaction terms in the primary analyses (for both cohorts) and in the secondary analyses (available case cohort) highlights the general lack of robustness. Furthermore, trial arm×time interaction terms were not significant for PCS, EQ-5D, or Brief STAI in table 3 despite ostensibly measuring closely related constructs. When a trial produces overwhelmingly null results, there is a danger of overemphasising any significant findings, but consideration of the salient factors shows that the two significant interaction terms are not robust, with reasonable likelihood that they reflect chance effects resulting from the additional inclusion criteria applied in the secondary analyses. They should be interpreted with caution.
*Telehealth=0; usual care=1 (reference category).
†Short term assessment (at four months)=2, long term assessment (at 12 months)=3 (reference category).

The only a priori hypothesis made about telehealth was that it would improve health related QoL and psychological outcomes relative to usual care.
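
The multiple-testing arithmetic in the note on the 60 significance tests can be checked with a short sketch. It is illustrative only and not part of the trial's analysis: the function names are hypothetical, the P values are copied from the complete case cohort time×trial arm column of the table, and the expected count of chance findings assumes that every null hypothesis is true.

```python
# Illustrative sketch only; not the authors' analysis code.
# P values for the time x trial arm interaction, complete case cohort,
# copied from the table above.
interaction_p = {
    "PCS": 0.476,
    "MCS": 0.013,
    "EQ-5D": 0.264,
    "Brief STAI": 0.129,
    "CESD-10": 0.007,
}


def expected_false_positives(n_tests, alpha):
    """Expected number of significant tests if every null hypothesis is true."""
    return n_tests * alpha


def significant(p_values, alpha):
    """Names of outcomes whose P value falls below alpha, in sorted order."""
    return sorted(name for name, p in p_values.items() if p < alpha)


n_tests = 60  # number of significance tests reported across tables 2 and 3

print(expected_false_positives(n_tests, 0.05))
print(significant(interaction_p, 0.05))
print(significant(interaction_p, 0.01))
```

At α=0.05 the sketch flags the MCS and CESD-10 interaction terms; at α=0.01 only CESD-10 remains, which matches the statement that tightening α renders one of the two significant terms non-significant.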