NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Dunn G, Emsley R, Liu H, et al. Evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health: a methodological research programme. Southampton (UK): NIHR Journals Library; 2015 Nov. (Health Technology Assessment, No. 19.93.)

Chapter 3 Therapeutic process evaluation

Introduction

Here we are concerned not with mediational mechanisms but with characteristics of a therapeutic intervention (process variables) that might influence or be associated with the efficacy of the intervention. The assumption is that there exists treatment effect heterogeneity and that some of this heterogeneity might be explained by these process measures. The process measures are post-randomisation treatment effect modifiers; they are not, strictly speaking, treatment effect moderators, as these are assumed to be measured (or measurable, in principle) before allocation to or onset of therapy. Examples include the strength of the therapeutic alliance and fidelity to a given treatment manual (whether or not CBT, for example, includes pre-specified components such as problem formulation and the setting of homework between sessions).

What are the technical challenges?

First, the potential intervention effect modifier might be a process measure that is ascertained only in those participants who receive treatment. In other words, it is a variable that describes the characteristics of patients when receiving treatment and thus these values are missing for those in an untreated control group (fidelity of a patient’s therapy to a CBT protocol or strength of the therapeutic alliance are obvious examples). Second, it is likely that the process measures would be measured with a considerable amount of measurement error (rating scales for strength of the therapeutic alliance, for example, will have only modest reliability). Third, there are also likely to be hidden selection effects (hidden confounding). A participant may, for example, have a good prognosis under the control condition (no treatment). If that same person were to receive treatment, however, the factors that predict good outcome in the absence of treatment would also be likely to predict good compliance with the therapy (e.g. strength of the therapeutic alliance). Severity of symptoms, or level of insight, measured at the time of randomisation, for example, are likely to be predictors of both treatment compliance and treatment outcome. They are potential confounders. If we were to take a naive look at the associations between measures of treatment compliance and outcomes in the treated group we would most likely be misled. These associations would reflect an inseparable mix of selection and treatment effects (i.e. the inferred treatment effects would be confounded); we would not know whether those who did well did so because they responded well to the treatment or if they would have done well anyway. We can allow for confounders in our analyses, if they have been measured, but there will always be some residual confounding that we cannot account for. 
The fourth and final challenge to be considered here arises from missing data, not just missing outcomes but also missing process measures for at least some of the patients receiving therapy; this is different from the missing process information for the control patients. In the latter, the information does not exist (therapeutic processes do not exist in the absence of therapy) but in the former it exists but is not measured or recorded. These data are unlikely to be missing purely by chance. Prognosis, compliance with allocated treatment and the treatment outcome itself are all potentially related to loss of data, which, in turn, leads to potentially biased estimates of treatment effects and estimates of the influences of treatment effect modifiers.

Notation

We randomise participants to receive treatment (e.g. psychotherapy plus routine care) or to be in the control condition (routine care alone). As an example, we will consider the therapeutic alliance (A) as the process variable under investigation. Dropping the subject-specific subscript for simplicity, for each subject we have a potential (possibly not observed) or observed measure of the following:

  • Z, treatment group – the outcome of randomisation (1 for treatment, 0 for control)
  • Y, observed outcome
  • Y(0), potential outcome under the control condition (no access to therapy)
  • Y(1,a), potential outcome under the treatment condition with resulting strength of alliance, a
  • X′ = (X1, X2, . . ., Xp), baseline covariates
  • X1Z, X2Z, . . ., XpZ, baseline covariate by randomised treatment group interactions (products)
  • A, the strength of the therapeutic alliance (only observed in the treated group), with observed level a.

All baseline covariates are assumed to be available for every participant in the trial (irrespective of randomisation).

For the time being we assume that we have a complete data set (there are no missing values, other than the counterfactuals determined by the design, i.e. those in the treatment-free control group).

How not to do it: correlate process measure (A) with outcome (Y) in the treated arm (and completely ignore the control arm)

For the participants in the treatment arm, the individual treatment effect, Δ, is the difference Y(1,a) – Y(0). It follows that the outcome of treatment is given by

$$Y(1,a) = \Delta + Y(0). \tag{22}$$

If we correlate Y(1,a) with background variables (e.g. with putative predictive markers) or with process measures recorded during treatment, thinking that this is examining explanations for treatment effect heterogeneity, then our thinking is flawed. Take the strength of the therapeutic alliance, for example. A trial participant receiving CBT may or may not be able to form a strong working relationship with his or her therapist. It is highly likely that a participant who is able to develop such a relationship is also likely to have had the better treatment-free outcome, Y(0). If this is the case, then we would see a correlation between outcome Y(0) and alliance, A, even when the treatment effect, Δ, is zero for all participants. It is therefore possible to demonstrate a strong relationship between treatment outcome Y(1,a) and alliance even when the intervention is ineffective. A similar example from the field of vaccine development is illustrated by Follmann.68 From the results of a randomised human immunodeficiency virus (HIV) vaccine trial, Follmann restricts his analysis to the treated arm and illustrates a strong relationship between the immune response to vaccination (the process variable) and subsequent resistance to HIV infection. He then points out that the ITT analysis of the data from both arms had demonstrated that the HIV vaccine was not effective. One plausible explanation is that the immune response (a correlate of protection) is related to the trial participant’s innate immunity (i.e. in the absence of vaccination), that is the treatment-free (vaccine-free) outcome. In trials of both vaccination and psychotherapy, correlations between treatment outcome and the treatment process measure are not robust indicators of the influence of the process measures on subsequent causal treatment effects. They are confounded by the unobserved (hidden) treatment-free outcome. Therefore, we need a different approach.
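The argument above can be illustrated with a small simulation. This is a minimal sketch under assumed values, not an analysis from the programme: the hidden prognostic factor `u`, all coefficients and the sample size are hypothetical. The individual treatment effect is set to zero for every participant, yet alliance and outcome remain strongly correlated in the treated arm because both depend on `u`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating model (every value here is an illustrative assumption).
u = rng.normal(size=n)                               # hidden prognostic factor
z = rng.integers(0, 2, size=n)                       # randomised arm: 1 = treated, 0 = control
y0 = 60.0 - 5.0 * u + rng.normal(size=n)             # treatment-free outcome Y(0); lower = better
alliance = 0.8 * u + rng.normal(scale=0.6, size=n)   # potential alliance A, driven partly by u
delta = np.zeros(n)                                  # treatment effect is zero for everyone
y = y0 + z * delta                                   # observed outcome

treated = z == 1

# Naive analysis: correlate A with Y in the treated arm only.
r_naive = np.corrcoef(alliance[treated], y[treated])[0, 1]

# ITT analysis: difference in mean outcome between the randomised arms.
itt = y[treated].mean() - y[~treated].mean()

print(f"naive correlation in treated arm: {r_naive:.2f}")
print(f"ITT effect estimate: {itt:.2f}")
```

In this set-up the correlation is clearly non-zero while the ITT estimate is close to zero, which is the pattern described for the vaccine example.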

The causal (structural) model

The vital component of all our models is randomisation, which ensures that, conditional on observed baseline covariates, X, both counterfactual outcomes, Y(0) and Y(1,a), are independent of treatment allocation (Y(0), Y(1,a) ⊥ Z | X):

$$E[Y(0) \mid X, Z] = E[Y(0) \mid X] \quad \text{and} \quad E[Y(1,a) \mid X, Z] = E[Y(1,a) \mid X]. \tag{23}$$

We assume that randomisation (treatment allocation) has an effect on treatment received (if received and how much) and, in particular, that participants in the control arm do not get access to any treatment.

What is the treatment effect for the individual subject (Δ) in terms of their potential therapeutic alliance status, A? Here A is the strength of the therapeutic alliance that is measured on the receipt of treatment. What is the relationship between Δ and A? We consider a simple linear ‘dose’–response model:

$$\Delta \mid (A = a, X = x) = \beta_z + \beta_a a + \varepsilon, \tag{24}$$

where ε is an individual-specific contribution to the treatment effect (or gain) not explained by the model. We assume ε is uncorrelated with the therapeutic alliance and therefore that:

$$E[\Delta \mid (A = a, X = x)] = \beta_z + \beta_a a. \tag{25}$$

Note that there is an intercept term (the ATE is not necessarily zero at zero alliance). Note that, for a treated individual, Δ|(A = a, X = x) = Y(1,a) – Y(0) and therefore:

$$Y(0) = Y(1,a) - \Delta \mid (A = a, X = x) = Y(1,a) - \beta_z - \beta_a a - \varepsilon. \tag{26}$$

Given randomisation:

$$E[Y \mid Z = 0] = E[Y(0) \mid Z = 0] = E[Y(0)]. \tag{27}$$

Similarly:

$$E[Y \mid Z = 1] = E[Y(1,a) \mid Z = 1] = E[Y(1,a)] = E[Y(0)] + \beta_z + \beta_a a = E[Y \mid Z = 0] + \beta_z + \beta_a a. \tag{28}$$

It follows from these equalities that, if we were prepared to assume that βz = 0 (an exclusion restriction), then we could simply estimate the remaining parameter, βa, by dividing the effect of randomised treatment on outcome by the average alliance score in the treated. This is the classic IV estimator described in Chapter 1 (see Chapter 1, The complier-average causal effect). However, unfortunately, we are not in a position to safely assume that βz = 0 (the therapy may well have an effect at zero alliance) and we therefore have to recognise that we have an underidentified model (we have too many parameters to estimate given the availability of only two treatment means). We need information on potential predictors of the therapeutic alliance in order to proceed. This will be the focus of attention in Instrumental variable methods.
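The estimator mentioned above can be written out explicitly. As a sketch, and assuming (our reading, not stated in this form in the text) that equation (28) is averaged over the distribution of alliance in the treated arm so that a is replaced by E[A | Z = 1]:

```latex
\begin{aligned}
E[Y \mid Z=1] - E[Y \mid Z=0] &= \beta_z + \beta_a\,E[A \mid Z=1], \\
\beta_z = 0 \;\Longrightarrow\; \beta_a &= \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[A \mid Z=1]}.
\end{aligned}
```

Without the restriction, the first line is a single equation in the two unknowns β_z and β_a, which is the underidentification referred to above.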

Before moving on, however, we should acknowledge explicitly that, unlike a more straightforward measure of compliance with treatment allocation (the number of sessions attended, for example), the therapeutic alliance is not a ratio-level measurement. A scale value of zero is arbitrary; it does not indicate zero alliance. Recognising this, it might make our parameters more interpretable if we were first to rescale our alliance measures. Scores on the anglicised and simplified version of the California Therapeutic Alliance Scales (CALPAS),69 as used in Dunn and Bentall,14 for example, range from a minimum of 0 to a maximum of 7. Dunn and Bentall transformed these scores so that they ranged from a minimum of –7 to a maximum of 0. Accordingly, the parameter βz is now a measure of the average therapeutic effect in those participants with an optimal alliance with their therapist. The value and interpretation of the second parameter, βa, are unaffected by this simple change of scale (a shift of location only).
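The effect of the location shift can be checked directly from the linear model in equation (24). In the sketch below the starred symbols are our own notation for the rescaled score and intercept, and the covariate conditioning is dropped for brevity:

```latex
A^{*} = A - 7, \qquad
\Delta \mid (A = a) = \beta_z + \beta_a a + \varepsilon
  = (\beta_z + 7\beta_a) + \beta_a (a - 7) + \varepsilon
  = \beta_z^{*} + \beta_a a^{*} + \varepsilon .
```

The slope β_a is unchanged, while the new intercept β_z* = β_z + 7β_a is the expected treatment effect at a* = 0, that is, at the maximum score of 7 on the original scale.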

Instrumental variable methods

We have stated that we need (good) predictors of the therapeutic alliance in order to make progress (i.e. to attain identifiability). We illustrate this using a simple form of G-estimation, as described by Fischer-Lapp and Goetghebeur.50 First we regress the alliance (A) on baseline covariates, X, some of which are known (or hoped) to be predictors of the alliance. This regression model is then used to predict the alliance for everyone in the trial (both treated and the control patients). Second, we regress outcome in the treated group on the baseline covariates, X, and the outcome under treatment [Y(1,a)] is predicted from this model, again for everyone in the trial. Similarly, we regress outcome in the control group on the same baseline covariates, X, and the outcome under control conditions [Y(0)] is again predicted for everyone in the trial. The individual treatment effects, Δ, are then calculated from the difference in the predicted treatment outcomes and predicted control outcomes and, finally, the predicted Δ is regressed on the predicted alliance. The slope provides us with an estimate of βa and the intercept is an estimate of βz. In this last regression we are estimating the effect of the randomised therapy [i.e. Y(1,a) – Y(0)] conditional on the level of the therapeutic alliance predicted by the baseline covariates. There are no baseline covariates in this last model; we are assuming that all of the moderating influences of the baseline covariates are acting through their influence on the therapeutic alliance. Valid SEs, confidence intervals (CIs) and associated p-values can be obtained by bootstrapping the whole multistage procedure.
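The multistage procedure described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the function names (`ols`, `g_estimate`, `bootstrap_se`), the simulated trial, the hidden factor `u` and the true parameter values are all hypothetical, and ordinary least squares is assumed at every stage.

```python
import numpy as np

def ols(design, response):
    """Least-squares coefficients for response ~ design (design includes the intercept)."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

def g_estimate(x, z, a, y):
    """Return (beta_z, beta_a). x: (n, p) covariates; z: (n,) arm (0/1);
    a: (n,) alliance, used only where z == 1; y: (n,) outcome."""
    design = np.column_stack([np.ones(len(z)), x])
    treated = z == 1
    # Stage 1: alliance on covariates in the treated arm, predicted for everyone.
    a_hat = design @ ols(design[treated], a[treated])
    # Stage 2: outcome on covariates within each arm, each predicted for everyone.
    y1_hat = design @ ols(design[treated], y[treated])
    y0_hat = design @ ols(design[~treated], y[~treated])
    # Stage 3: predicted individual effects regressed on predicted alliance.
    delta_hat = y1_hat - y0_hat
    beta_z, beta_a = ols(np.column_stack([np.ones(len(z)), a_hat]), delta_hat)
    return beta_z, beta_a

def bootstrap_se(x, z, a, y, reps=200, seed=1):
    """Bootstrap SEs: resample participants and repeat every stage on each resample."""
    rng = np.random.default_rng(seed)
    n = len(z)
    draws = []
    for _ in range(reps):
        idx = rng.integers(0, n, size=n)
        draws.append(g_estimate(x[idx], z[idx], a[idx], y[idx]))
    return np.std(np.array(draws), axis=0, ddof=1)

# Simulated trial (all values hypothetical). u is a hidden confounder of A and Y(0).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=(n, 1))
u = rng.normal(size=n)
z = rng.integers(0, 2, size=n)
a_potential = -3.0 + 1.0 * x[:, 0] + 0.8 * u + rng.normal(scale=0.5, size=n)
a = np.where(z == 1, a_potential, np.nan)          # alliance unobserved in controls
y0 = 60.0 - 3.0 * x[:, 0] - 2.0 * u + rng.normal(size=n)
true_beta_z, true_beta_a = -8.0, -2.0
delta = true_beta_z + true_beta_a * a_potential + rng.normal(size=n)
y = y0 + z * delta

beta_z_hat, beta_a_hat = g_estimate(x, z, a, y)
se = bootstrap_se(x, z, a, y)
print(f"beta_z: {beta_z_hat:.2f} (SE {se[0]:.2f}), beta_a: {beta_a_hat:.2f} (SE {se[1]:.2f})")
```

The simulation builds in the key assumption of the method: the covariate predicts alliance and has a main effect on outcome, but modifies the treatment effect only through alliance.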

An alternative approach is to use IV methods and, in particular, the familiar 2SLS procedure. First, we assign an arbitrary value to the level of the missing therapeutic alliance in the control group. Here we assign a value of zero to everyone in that group. The first-stage regression then involves the prediction of the therapeutic alliance from randomisation (i.e. treatment group), baseline covariates, X, and all interactions between randomisation and each of the baseline covariates. As the measured level of alliance has been fixed at a constant value (zero) for everyone in the control group, there will be no effect of any of the covariates on alliance in this group. However, the effects of the covariates on alliance will be free to be estimated in the treated participants (exactly as at the corresponding stage of the G-estimation algorithm). The second-stage regression then simply involves the prediction of outcome by randomised treatment, the predicted level of alliance and the baseline covariates (there are no treatment by baseline interactions). The treatment by baseline covariate interactions in the first-stage model are the IVs (assumed to have an effect on alliance but no effect on outcome that is not explained by their effect on the alliance). Theoretical details are provided by Dunn and Bentall.14 If there are no missing data, except missing process measures in the control group (i.e. every case is complete), this 2SLS procedure will give identical estimates to those provided by the above G-estimation algorithm.14
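A hand-rolled version of the two stages may make the roles of the variables clearer. The following is a sketch only: `two_stage_least_squares` is a hypothetical helper, the data are simulated with assumed parameter values, and the point estimates come from two ordinary least-squares fits (the naive second-stage SEs would not be valid, so none are computed).

```python
import numpy as np

def ols(design, response):
    """Least-squares coefficients for response ~ design (design includes the intercept)."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

def two_stage_least_squares(x, z, a, y):
    """Return (beta_z, beta_a). x: (n, p) covariates; z: (n,) arm (0/1);
    a: (n,) alliance (may be NaN in controls); y: (n,) outcome."""
    ones = np.ones(len(z))
    a_filled = np.where(z == 1, a, 0.0)        # arbitrary value (zero) for controls
    xz = x * z[:, None]                        # covariate-by-treatment products: the instruments
    # First stage: alliance on treatment, covariates and their interactions.
    first = np.column_stack([ones, z, x, xz])
    a_hat = first @ ols(first, a_filled)
    # Second stage: outcome on treatment, predicted alliance and covariates
    # (no treatment-by-covariate interactions).
    second = np.column_stack([ones, z, a_hat, x])
    coef = ols(second, y)
    return coef[1], coef[2]

# Simulated trial with two baseline covariates (all values hypothetical).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=(n, 2))
u = rng.normal(size=n)                         # hidden confounder of A and Y(0)
z = rng.integers(0, 2, size=n)
a_potential = -3.0 + 1.0 * x[:, 0] + 0.5 * x[:, 1] + 0.8 * u + rng.normal(scale=0.5, size=n)
a = np.where(z == 1, a_potential, np.nan)
y0 = 60.0 - 3.0 * x[:, 0] + 1.0 * x[:, 1] - 2.0 * u + rng.normal(size=n)
true_beta_z, true_beta_a = -8.0, -2.0
y = y0 + z * (true_beta_z + true_beta_a * a_potential + rng.normal(size=n))

beta_z_hat, beta_a_hat = two_stage_least_squares(x, z, a, y)
print(f"beta_z: {beta_z_hat:.2f}, beta_a: {beta_a_hat:.2f}")
```

Because alliance is fixed at zero in the control arm, the first-stage fitted values are zero for controls and equal to the treated-arm regression of alliance on the covariates for treated participants.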

Although conceptually very different, the practicalities of use of 2SLS methods are the same for the investigation of therapeutic processes (the effects of post-randomisation effect modifiers) as those described in the previous chapter to investigate treatment effect mediation. In some cases this conceptual distinction is just a reflection of different ways of thinking about the problem. On the one hand, dose of therapy (number of sessions attended), for example, can be thought of as a mediator of the offer of treatment (and, with an assumed exclusion restriction, no therapeutic effect when no sessions attended and complete mediation of the offer of treatment on outcome). On the other hand, dose of therapy can be considered as a post-randomisation modifier of the ITT effect: the ITT effect increasing with increasing dose. Comparison of 2SLS approaches and principal stratification for the estimation of the CACE also illustrates this point.

Binary process measures: principal stratification

It is possible, and quite straightforward, to use 2SLS estimation methods when the process measure is binary (high vs. low alliance; problem formulation vs. no formulation; homework vs. no homework). The validity of the method is not dependent on the process measure being normally distributed, for example, or actually a quantitative variable. The alternative IV estimator described in Chapter 2, based on the use of the compliance score as an instrument, would be equally applicable. Here, however, we approach the problem through latent class models: principal stratification. We are concerned with the natural extension of the use of latent classes in CACE estimation (compliers vs. non-compliers) through simply relaxing the exclusion restriction on the stratum equivalent to the non-compliers (e.g. low therapeutic alliance). In summary, we assume that we have two partially observed strata (low and high alliance); we can observe alliance status in the treated group and assume that there are, on average, the same proportions of the two classes in the control arm. We are concerned with the estimation of two ITT effects: that in the low-alliance stratum and that in the high-alliance stratum. The overall ITT effect is a weighted average of these two stratum-specific ITT effects. Staying with high versus low therapeutic alliance, we have:

$$\mathrm{ITT}_{\mathrm{overall}} = P_H\,\mathrm{ITT}_{\mathrm{high}} + (1 - P_H)\,\mathrm{ITT}_{\mathrm{low}}, \tag{29}$$

where PH is the proportion of the high-alliance stratum. It should be immediately clear from this equation that, again, this simple model is underidentified. We cannot estimate ITThigh and ITTlow from a knowledge (estimate) of PH and ITToverall. In CACE estimation we assume that one of the stratum-specific ITT effects is zero. This is not justified here. Instead, we attain identifiability by using baseline covariates that are (good) predictors of principal stratum membership (i.e. predictors of observed stratum membership in treated participants). This is entirely analogous to the search for effective covariate by treatment interactions to use as instruments in 2SLS; we additionally assume that there are no covariate-by-treatment interactions in the model predicting outcome within each of the two strata.70,71
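The underidentification in equation (29) can be made concrete with a few lines of arithmetic. The numbers below are hypothetical and the helper `itt_low_given` is ours; the sketch simply shows that many pairs of stratum-specific effects reproduce the same overall ITT effect, and that the CACE-style exclusion restriction (ITT_low = 0) picks out one of them.

```python
def itt_low_given(itt_overall, p_high, itt_high):
    """Solve equation (29) for ITT_low given a hypothesised ITT_high."""
    return (itt_overall - p_high * itt_high) / (1.0 - p_high)

# Hypothetical values: overall ITT effect and proportion in the high-alliance stratum.
itt_overall, p_high = -6.0, 0.6

candidates = []
for itt_high in (-10.0, -8.0, -6.0):
    itt_low = itt_low_given(itt_overall, p_high, itt_high)
    candidates.append((itt_high, itt_low))
    print(f"ITT_high = {itt_high:6.2f}, ITT_low = {itt_low:6.2f}")

# With the exclusion restriction ITT_low = 0, the high-stratum effect is identified.
itt_high_under_exclusion = itt_overall / p_high
print(f"ITT_high under the exclusion restriction: {itt_high_under_exclusion:.2f}")
```

Every pair printed by the loop is compatible with the same two observable quantities, which is why additional information (here, covariates that predict stratum membership) is needed.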

In principle, principal stratification can easily be extended to cope with categorical process measures with three or more unordered categories. Dunn et al.,13 for example, considered non-compliers (those who never turned up for CBT), participants who attended (or would have attended) their CBT sessions but did not receive the CBT as intended (being more akin to supportive listening) and participants who received (or would have received) CBT as intended. Here it was thought to be legitimate to introduce the exclusion restriction for the non-compliers but the model was still underidentified. Once again, identifiability was achieved by being able to predict stratum membership using baseline covariates. These authors also allowed for missing process measurements in the treated (CBT) arm by extending the latent class modelling to allow for the process measure to be latent in both arms of the trial. We will not pursue these technical details but stay with the simpler binary process measure, assumed to be available for everyone in the treated group.

Missing outcome data

All psychotherapy trials have some missing outcome data, some quite a lot. In the Outcome of Depression International Network (ODIN) trial of psychotherapy for depression, for example, only 74% of the randomised participants provided outcome measures at 6-month follow-up. Even more striking was the dependence of the follow-up rates on the compliance (process measure) status of the participants: 73% for the control participants, 92% for the compliers in the treatment arm and only 55% for the non-compliers in the treatment arm (the non-compliers representing 46% of those allocated to receive treatment).71,72 It is highly likely that levels of missing outcome data will depend on other process measures such as the strength of the observed therapeutic alliance in the treated group and/or the potential therapeutic alliance in the controls (i.e. that alliance that would be observed had the control participant received treatment). Principal stratification using latent class models fitted by maximum likelihood will automatically allow for missing data patterns that are determined by observed alliance in the treated arm. Simultaneously fitting the two stages of the conventional IV model using maximum likelihood will also allow for this contingency; both are allowing for missing outcomes to be missing at random (MAR, in the terminology of Little and Rubin73). In the context of principal stratification it is also straightforward to allow for a non-ignorable missing data mechanism, missingness being determined by principal stratum membership, called ‘latent Ignorability’ by Frangakis and Rubin.74 This will be illustrated in our case study in the next section.
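As a quick arithmetic check on the ODIN follow-up rates quoted above, and assuming that compliers make up the remaining 54% of the treatment arm (an inference from the 46% of non-compliers, not a figure given in the text), the implied follow-up rate in the treatment arm is:

```latex
\underbrace{0.54 \times 0.92}_{\text{compliers}} + \underbrace{0.46 \times 0.55}_{\text{non-compliers}}
  = 0.4968 + 0.2530 = 0.7498 \approx 75\%
```

This is close to the 73% observed in the control arm and consistent with the overall figure of 74%, so the striking differences in follow-up appear between compliance strata rather than between randomised arms.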

What about the conventional use of 2SLS? If a 2SLS command is used, the analysis is based on only those participants with complete data (the so-called complete-case analysis). To allow for loss to follow-up that is dependent on the observed process measure (compliance with treatment allocation, strength of the therapeutic alliance and so on) we need to carry out the two stages of the two-stage estimation separately. The first (prediction of the process measure) uses everyone randomised and involves saving the predicted values of the process variable. The second involves only those with non-missing outcomes but models these using the predicted process measures obtained from the complete first stage. Bootstrapping of the whole two-stage procedure provides valid SEs, CIs and corresponding p-values. An alternative (and more or less equivalent) approach would be to stick with the complete-case procedure but supplement the command by declaring inverse probability weights determined by modelling loss to follow-up using baseline covariates and the observed process measures.
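The inverse probability weighting alternative mentioned above can be sketched as follows. Several simplifications are assumed here and are not taken from the text: loss to follow-up is modelled by observed response rates within three cells (control, treated with low alliance, treated with high alliance) rather than by a regression on baseline covariates, the cut-off defining high alliance is arbitrary, the response rates merely echo the ODIN percentages, and `wls` and `ipw_two_stage` are hypothetical helpers.

```python
import numpy as np

def wls(design, response, weights):
    """Weighted least squares, implemented by scaling rows by sqrt(weights)."""
    root = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root[:, None], response * root, rcond=None)
    return coef

def ipw_two_stage(x, z, a, y, observed, cell):
    """Complete-case two-stage estimate of (beta_z, beta_a), weighted by the inverse
    of the response rate in each cell. observed: True where the outcome was recorded;
    cell: integer label per participant (0 control, 1 treated low, 2 treated high)."""
    rate = np.array([observed[cell == k].mean() for k in range(cell.max() + 1)])
    w = 1.0 / rate[cell]
    keep = observed
    xk, zk, yk, wk = x[keep], z[keep], y[keep], w[keep]
    ak = np.where(zk == 1, a[keep], 0.0)            # alliance set to zero in controls
    ones = np.ones(keep.sum())
    first = np.column_stack([ones, zk, xk, xk * zk[:, None]])
    a_hat = first @ wls(first, ak, wk)
    second = np.column_stack([ones, zk, a_hat, xk])
    coef = wls(second, yk, wk)
    return coef[1], coef[2]

# Simulated trial (all values hypothetical).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=(n, 1))
u = rng.normal(size=n)                              # hidden confounder of A and Y(0)
z = rng.integers(0, 2, size=n)
a_potential = -3.0 + x[:, 0] + 0.8 * u + rng.normal(scale=0.5, size=n)
y0 = 60.0 - 3.0 * x[:, 0] - 2.0 * u + rng.normal(size=n)
true_beta_z, true_beta_a = -8.0, -2.0
y_full = y0 + z * (true_beta_z + true_beta_a * a_potential + rng.normal(size=n))

# Loss to follow-up depends on arm and, in the treated arm, on observed alliance.
high = a_potential > -3.0
cell = z * (1 + high.astype(int))                   # 0 control, 1 treated low, 2 treated high
response_rate = np.array([0.73, 0.55, 0.92])
observed = rng.random(n) < response_rate[cell]

a = np.where(z == 1, a_potential, np.nan)
y = np.where(observed, y_full, np.nan)
beta_z_hat, beta_a_hat = ipw_two_stage(x, z, a, y, observed, cell)
print(f"beta_z: {beta_z_hat:.2f}, beta_a: {beta_a_hat:.2f}")
```

As in the unweighted case, SEs for such a procedure would need to come from bootstrapping the whole sequence, including the estimation of the weights.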

What about missing process measurements? In many trials (to date) collection of data on process measurements has not been seen as a vitally important aspect of their implementation. Often, collection of process data has been part of a supplementary ‘add-on’ project. So, in practice, there is quite a lot of missing process information. How should this affect our approach to analysis? One solution is to simply ignore the participants for whom the process measurements are missing, that is, drop them from the data file. Dunn and Bentall14 took this approach in order to simplify the analysis strategy (the aim of the paper being to explain and illustrate methodological developments and not to make any substantive claims concerning the role of the process measure, the therapeutic alliance in this case). Clearly, a more principled approach needs to be taken in an analysis undertaken to yield valid substantive conclusions. A detailed examination of approaches that might be taken (including multiple imputation) is currently being carried out by Lucy Goldsmith, a PhD student at the University of Manchester. The results of these investigations will be published elsewhere. Here we illustrate a simple solution, but not necessarily the optimal one, based on principal stratification. Principal strata are partially observed latent classes; in our example above, class membership is known for the treated group (e.g. high or low alliance) but latent in the control group. Following Dunn et al.,13 we include those from the treated group with missing alliance data but acknowledge that their stratum status is hidden, just as for the control group (assuming that conditional on the baseline covariates, the distribution of alliance status is not dependent on whether or not it is observed). We illustrate the details in the following section.

Case study

Here, we consider the Study of Cognitive Realignment Therapy in Early Schizophrenia (SoCRATES) trial, which was designed to evaluate the effects of CBT and supportive counselling (SC) on the outcomes of an early episode of schizophrenia. Participants were allocated to three conditions: CBT in addition to TAU, SC in addition to TAU, or TAU alone. Recruitment and randomisation were within three catchment areas (treatment centres): Liverpool (centre 1), Manchester (centre 2) and Nottinghamshire (centre 3). In summary, 101 participants were allocated to CBT + TAU, 106 to SC + TAU and 102 to TAU alone. Of these, 225 participants (73% of the 309 randomised) were interviewed at 18 months’ follow-up: 75 in the CBT + TAU arm, 79 in the SC + TAU arm and 71 in the TAU alone arm. The remaining 84 participants died during the follow-up period (n = 7), withdrew consent (n = 4) or were lost (n = 73). Further details can be found elsewhere.75,76

The post-randomisation variable that has a potential explanatory role in the analysis of treatment effect heterogeneity is the measure of the quality or strength of the therapeutic alliance at the fourth session of therapy. Therapeutic alliance is a general term for a variety of therapist–client interactional and relational factors which operate in the delivery of treatment. It was measured at the fourth session of therapy because it was early in the time course of the intervention, but not too early to assess the development of the relationship between therapist and patient. (The alliance was also assessed at the tenth session, but we will not pursue this added complication here.) The strength of the therapeutic alliance was measured in SoCRATES using two different methods, but here we report the results from an anglicised and simplified version of the short 12-item patient-completed version of the CALPAS.69 Total CALPAS scores (ranging from zero, indicating low alliance, to 7, indicating high alliance) have been used in our previous analyses,14,31,32 but here we follow Emsley et al.30 by creating a binary alliance indicator (one if CALPAS score greater than or equal to 5, otherwise zero) and illustrate an analysis based on principal stratification.

The primary outcome measure in the trial was the Positive and Negative Syndrome Scale77 (PANSS). The PANSS was administered at baseline, once a week over the first 6 weeks and then at 3 months, 9 months and 18 months. For the present purposes, only the initial (baseline) and 18-month PANSS total scores are considered. The initial PANSS score is considered as a baseline covariate in all analyses reported here. Other baseline covariates used in the analyses reported here are centre membership (binary dummy variables, C1 and C2), the logarithm of the duration of untreated psychosis (logDUP) and years of education.

Further details and the trial outcomes have been reported elsewhere.75,76 Briefly, from an ITT analysis, there was no evidence of an effect on speed of recovery over the first 6 weeks of treatment. However, at the 18-month follow-up, both psychological treatment groups had a superior outcome in terms of symptoms (as measured using the PANSS) compared with the control group, although there was no effect on relapse rates. There were no differences in the effects of CBT compared with SC, but there was a strong centre effect, with outcomes for the psychological therapies at one of the centres (Liverpool) being significantly better than at the remaining two.

For illustrative purposes, we here ignore the distinction between CBT and SC. Note that, as indicated above, not everyone in the treated groups provided data on the strength of their therapeutic alliance. In fact, 45% of the participants who were expected to provide CALPAS measures at the fourth session of therapy failed to do so. Table 4 provides a detailed summary of the results of the trial that are relevant to the present discussion. Results are shown separately for each of the three centres (Liverpool, Manchester and Nottinghamshire) and within each of these centres according to their treatment status: control group, treated group with observed low-alliance, treated group with observed high alliance and treated group with an unknown (missing) alliance. The proportion of participants with missing alliance measures varies with centre, as does the proportion of those observed with a high alliance. Ignoring missing data, Liverpool has the highest proportion of participants with a high alliance (74%), consistent with treatment being more effective in this centre.

TABLE 4. Summary statistics from the SoCRATES trial

Here we illustrate the use of principal stratification to answer the question ‘is the treatment effect in the high-alliance class better than that for those in the low-alliance class?’. We estimate the SEs of our treatment effect estimates using asymptotic likelihood-based methods and through the use of the simple bootstrap78 (1000 replications). We assume that either outcome data are MAR or, alternatively, missing outcomes are latently ignorable (LI). Finally, we use two options for dealing with treatment participants with missing alliance data; by either restricting the analysis to those with non-missing alliance or including everyone in the analysis by coding missing class membership as unknown (as is the case for the control group, that is conditional on the relevant baseline covariates we are assuming that the distribution of strata is the same in the treated with observed alliance as in the treated with missing alliance, and, in turn, the same as that in the control group). Under each of these options, we examine whether or not it is reasonable to assume that the treatment effect in the low-alliance group is zero (i.e. introduce the exclusion restriction as the result of empirical findings rather than as an a priori assumption).

In all of these analyses we use Mplus version 7.79 The input file for the case when missing outcomes are assumed to be latently ignorable is given in Appendix 2 (we do not expect readers who are unfamiliar with Mplus to be able to follow the content of this file but include it as an exact record of what Mplus was instructed to carry out). We will now explain what the analysis entails. We specify that we have two latent classes (principal strata: high and low alliance) and that their distribution is to be estimated through a finite mixture model. Class membership is predicted by a logistic regression with baseline total PANSS score, years of education, logarithm of duration of untreated illness and centre membership (two binary dummy variables, c1 and c2) as covariates. These covariates were selected informally using the analyst’s judgement. A detailed examination of approaches that might be taken (including familiar forward or backward selection methods and the use of the various penalised alternatives) is currently being carried out by Clare Flach, a PhD student at the University of Manchester. The results of these investigations will be published elsewhere. To carry out this finite mixture modelling, observed data on the therapeutic alliance (the so-called training set) are coded by two binary dummy variables, a1 and a2. Low alliance is indicated when a1 = 1 and a2 = 0; high alliance when a1 = 0 and a2 = 1. If alliance is unknown (as in the control or treated participants with missing alliance ratings) then Mplus expects the coding a1 = 1 and a2 = 1. Simultaneously, the effect of randomised treatment allocation (i.e. the ITT effect) is estimated within each of the two classes using an analysis of covariance model in which the effects of the covariates (the same as those included in the model to predict stratum membership) are constrained to be equal for the two strata (i.e. ensuring no covariate by alliance interaction). 
If the input program makes no reference to the missing outcome data [i.e. a variable here called ‘resp’ (short for ‘response’); with resp = 1 if outcome is observed, 0 otherwise] then we are assuming that outcome is MAR, that is conditional on covariates, the probability of being missing may differ between the high- and low-alliance groups in the treatment arm, but is assumed to be homogeneous in the control group. Under the LI assumption we introduce logistic regression models to predict response by randomised treatment allocation and the other covariates within each of the two strata separately (again constraining the covariate effects other than treatment to be the same within the two strata). So, here, the probability of having a missing outcome is dependent upon latent class (principal stratum) membership rather than just observed treatment status as in MAR; hence the term ‘latent ignorability’.
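The coding of the training set described above is easy to get wrong, so a small helper may be useful when preparing the data file. The sketch below is in Python rather than Mplus syntax; the function names are hypothetical, and the cut-off of 5 is the one used for the binary alliance indicator in this case study.

```python
import math

def alliance_class(calpas, cutoff=5.0):
    """Return 1 for high alliance (score >= cutoff), 0 for low, None if the score is missing."""
    if calpas is None or math.isnan(calpas):
        return None
    return 1 if calpas >= cutoff else 0

def training_dummies(treated, calpas):
    """Return the (a1, a2) training-set dummies described in the text:
    low alliance = (1, 0), high alliance = (0, 1), unknown class = (1, 1).
    Class is unknown for all controls and for treated participants with no rating."""
    cls = alliance_class(calpas) if treated else None
    if cls is None:
        return (1, 1)
    return (0, 1) if cls == 1 else (1, 0)

# Hypothetical participants: (treated?, CALPAS score at session 4)
participants = [(True, 6.2), (True, 3.8), (True, None), (False, None)]
coded = [training_dummies(t, s) for t, s in participants]
print(coded)
```

Controls are coded as unknown whatever value is passed for their score, reflecting the fact that alliance does not exist in the absence of therapy.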

We carry out a sequence of model fits. First we use the data from the control group and only those of the treated participants who have a recorded alliance measure. Any of the included participants may have missing 18-month outcome (PANSS) data. We estimate the within-stratum treatment effects together with their SEs in Mplus by maximum likelihood using an expectation-maximisation (EM) algorithm. We then repeat the same analysis but estimate the SEs using the bootstrap. Next, we carry out a third run having constrained the treatment effect in the low-alliance stratum to be zero (again using the bootstrap to estimate the SE of the treatment effect in the high-alliance stratum). All three runs are carried out twice: once assuming that outcomes are MAR and again assuming that they are LI. The six sets of results are presented in the ‘Excluding participants with missing alliance values’ section of Table 5. We now repeat the whole exercise including the treatment group participants with missing alliance data. The second six sets of results are presented in the ‘Using all data: including treated participants with missing alliance values’ section of Table 5.
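The idea behind a bootstrapped SE can be sketched as follows. This is a minimal illustration only, not the Mplus procedure used here: the estimator is a simple difference in arm means rather than a within-stratum effect from a mixture model, resampling within each arm is an assumption, and the outcome scores are made up. In the actual analysis the full model would be refitted to every bootstrap sample.

```python
import random
import statistics
from typing import List


def treatment_effect(treated: List[float], control: List[float]) -> float:
    """Difference in mean outcome (treated minus control)."""
    return statistics.mean(treated) - statistics.mean(control)


def bootstrap_se(treated: List[float], control: List[float],
                 n_boot: int = 2000, seed: int = 12345) -> float:
    """Non-parametric bootstrap SE of the treatment effect.

    Participants are resampled with replacement within each arm
    (an assumption made for this sketch), the effect is re-estimated
    in each replicate, and the SE is the standard deviation of the
    replicate estimates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        t_star = rng.choices(treated, k=len(treated))
        c_star = rng.choices(control, k=len(control))
        estimates.append(treatment_effect(t_star, c_star))
    return statistics.stdev(estimates)


# Made-up outcome scores, for illustration only.
treated = [52.0, 61.0, 48.0, 70.0, 55.0, 66.0, 59.0, 63.0]
control = [68.0, 72.0, 60.0, 75.0, 64.0, 80.0, 71.0, 69.0]

effect = treatment_effect(treated, control)
se = bootstrap_se(treated, control)
```

Because the bootstrap SE is computed from the empirical spread of re-estimated effects, it does not rely on the likelihood-based approximation to the sampling distribution.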

TABLE 5. Principal stratification in SoCRATES: treatment effect modification by therapeutic alliance (effect estimates and their SEs). Estimated ITT effects on 18-month PANSS total scores.

What can we conclude from these results? We conclude that:

  1. All treatment effect estimates are imprecise.
  2. We gain a little more precision by including treatment participants with missing alliance status.
  3. It makes little difference whether we are assuming that missing outcome data are MAR or LI.
  4. The conventional likelihood-based SE estimates appear to be too optimistic. It is much safer to use non-parametric bootstrapping.
  5. There is a beneficial effect of treatment in the high-alliance stratum but not in the low-alliance stratum. In fact, there is a suggestion that treatment might be detrimental in the latter (but the effect is not statistically significant).
  6. We have not yet established that the treatment effects differ in the two strata, that is that there is evidence of effect modification. This we now proceed to do.

We now look at contrasts between the estimated treatment effects for the two principal strata. Considering the results after excluding trial participants with missing alliance data, the estimated difference between the treatment effect estimates for the two strata (high alliance minus low alliance) is –15.97 – 7.58 = –23.55 with a bootstrapped SE of 13.25 (p = 0.076) when assuming that missing outcomes are MAR. The difference is –17.52 – 5.64 = –23.15 with a bootstrapped SE of 13.97 (p = 0.097) when assuming that missing outcomes are LI. If we analyse the full data set (including treatment participants with missing alliance measures), the corresponding estimates after assuming either MAR or LI are, respectively, –16.58 – 9.27 = –25.85 with a bootstrapped SE of 9.71 (p = 0.008) and –17.18 – 8.75 = –25.93 with a bootstrapped SE of 11.69 (p = 0.027). The gain in precision attained by including all of the participants now appears to be considerable. We cautiously conclude that the therapeutic alliance does modify treatment efficacy.
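The contrasts above can be checked with a short calculation. The sketch below assumes that each p-value comes from a two-sided Wald test based on the normal distribution (the text does not state the test used); the function name is hypothetical. Under that assumption the calculation gives values close to the reported p-values.

```python
import math
from typing import Tuple


def wald_contrast(effect_high: float, effect_low: float,
                  se_diff: float) -> Tuple[float, float, float]:
    """Contrast two stratum-specific treatment effects.

    Returns the difference (high minus low), the Wald z statistic and a
    two-sided p-value from the standard normal distribution. se_diff is
    the SE of the difference itself (here, the bootstrapped SE).
    """
    diff = effect_high - effect_low
    z = diff / se_diff
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, z, p


# Estimates and bootstrapped SEs as quoted in the text.
excluded_mar = wald_contrast(-15.97, 7.58, 13.25)
excluded_li = wald_contrast(-17.52, 5.64, 13.97)
all_data_mar = wald_contrast(-16.58, 9.27, 9.71)
all_data_li = wald_contrast(-17.18, 8.75, 11.69)
```

Note that the SE of the difference has to be estimated directly (for example, from the bootstrap replicates of the difference), because the two stratum-specific estimates come from the same fitted model and are not independent.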

Reflections

It appears that what we need is a baseline variable that is a powerful predictor of the therapeutic process indicator in the treatment group (adherence to prescribed number of sessions of therapy, strength of the therapeutic alliance, fidelity to a CBT treatment manual, etc.). This is to predict the level of the process indicator for those in the control group; without one or more good predictors we cannot do this accurately and would effectively be predicting random values.

How do we design a trial in which we can use this predictor? Design issues are dealt with mainly in Chapter 5 but here we briefly consider two possible options. In a different context, Follmann68 was concerned with what he termed ‘augmented’ designs to assess immune response in vaccine trials (the immune response being observed only in the participants receiving the vaccine). He suggested two designs.

Follmann’s first design involved vaccinating, prior to randomisation, all of the participants recruited to the trial with an irrelevant vaccine (e.g. against rabies) and measuring the immune response to this vaccine. This response is assumed to be highly correlated with the subsequent immune response to HIV vaccination in those participants who then go on in the actual trial to receive the HIV vaccine. The implication is also that the response to the rabies vaccine is a strong predictor of the missing HIV response in the control participants (i.e. the immune response that would have been produced in the control participants had they, contrary to fact, been allocated to receive the HIV vaccine). In the context of our psychotherapy trials, an equivalent two-stage design might involve measurement of the strength of the therapeutic alliance (or quality of the more general working relationship) with, for example, the participant’s case manager prior to randomisation in the trial itself. With luck, the case manager alliance by treatment interaction would provide a powerful IV in the adjustment for hidden confounding of the effect of the intervention’s therapeutic alliance in the trial itself.

Follmann’s second design involved the use of what psychologists refer to as a waiting-list control group. The randomised HIV vaccine trial is carried out as a standard parallel-group study (the immune response being measured in the vaccinated participants, and outcome assessed in everyone). At the end of a specified follow-up period, the control participants are given the vaccine against HIV and their immune response is measured. Further follow-up is unnecessary. The immune response in the control participants at follow-up is used to predict the immune response that would have been observed had they been vaccinated at the beginning of the trial. In principle, this waiting-list controlled design might be feasible in a psychotherapy trial, but only if we can be confident that the majority of the control participants would not recover in the absence of therapy. This is quite a stringent condition that would not hold in many cases.

In Chapter 2, we have indicated that measurement error in the intermediate variable might be of more significance than hidden confounding. Clearly, process measures such as the strength of the therapeutic alliance will be subject to considerable error (no one would ever claim that they are infallible). Although IV methods adequately adjust for random measurement error, as well as for hidden confounding,14,59,70 the use of multiple indicators of the therapeutic alliance, for example, will help to improve the precision of the required causal parameter estimates58,60,80 (see the example using the PACT data at the end of the previous chapter). Similar use of multiple indicators would help to refine the definition of stratum membership for principal stratification; clearly, as it is used above, there will be misclassification errors and these will dilute (attenuate) differences between the ITT effects within strata.

Copyright © Queen’s Printer and Controller of HMSO 2015. This work was produced by Dunn et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK326950
