Included under terms of UK Non-commercial Government License.
Dunn G, Emsley R, Liu H, et al. Evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health: a methodological research programme. Southampton (UK): NIHR Journals Library; 2015 Nov. (Health Technology Assessment, No. 19.93.)
Evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health: a methodological research programme.
This report describes the development, evaluation and dissemination of statistical and econometric methods for the design of explanatory trials of psychological treatments and the explanatory analysis of the clinical end points arising from these trials. We have been concerned with making valid causal inferences about the mediational mechanisms of treatment-induced change in these clinical outcomes. In Chapter 1, we identified four questions about complex interventions/treatments. We present these questions again, and relate these to the methods we have discussed in this report.
Does it work?
In Chapter 1, we described the fundamental concepts of causal inference and how this provides estimators of treatment efficacy (the ATE) which underpin randomised trials. In the presence of non-compliance, or departures from randomised treatments, we identified an alternative estimator, the CACE, and described the necessary assumptions to identify the CACE.
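As a minimal sketch of the distinction between the ITT effect and the CACE, the simulation below assumes one-sided non-compliance (participants allocated to control cannot receive therapy), the exclusion restriction, and a hypothetical 60% compliance rate. All names and numerical values are illustrative assumptions, not taken from the report. Under these assumptions the CACE can be estimated by the ratio of the ITT effect on outcome to the ITT effect on treatment received.

```python
import numpy as np


def simulate_trial(n=20_000, seed=0):
    """Simulate a two-arm trial with one-sided non-compliance (illustrative values)."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)      # randomised allocation (0 = control, 1 = therapy)
    complier = rng.random(n) < 0.6      # latent compliance type, assumed 60% compliers
    d = z * complier                    # therapy actually received
    # Compliers have a better prognosis regardless of arm, which biases
    # an as-treated comparison. The true effect of therapy in compliers is 2.0.
    y = 1.0 * complier + 2.0 * d + rng.normal(size=n)
    return z, d, y


def itt_effect(z, v):
    """Difference in means of v between randomised arms."""
    return v[z == 1].mean() - v[z == 0].mean()


def cace_wald(z, d, y):
    """Ratio estimator: ITT effect on outcome over ITT effect on receipt of therapy."""
    return itt_effect(z, y) / itt_effect(z, d)


def as_treated(d, y):
    """Naive comparison by treatment received (ignores randomisation)."""
    return y[d == 1].mean() - y[d == 0].mean()


z, d, y = simulate_trial()
print("ITT effect:", round(itt_effect(z, y), 2))
print("CACE (ratio estimator):", round(cace_wald(z, d, y), 2))
print("As-treated comparison:", round(as_treated(d, y), 2))
```

In this set-up the ITT effect is diluted by non-compliance, the as-treated comparison is inflated by the prognostic difference between compliers and non-compliers, and the ratio estimator targets the effect in compliers.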
How does it work?
In Chapter 2, we discussed the statistical evaluation of treatment effect mechanisms through mediation analysis in some detail, starting with long-established strategies from the psychological literature, with the possibility of using prognostic markers for confounder adjustment. We introduced definitions of direct and indirect effects based on potential outcomes (counterfactuals), together with some appropriate methods for their estimation, and then introduced IV methods to allow for the possibility of hidden confounding between mediator and final outcome. In Chapter 4 we extended the ideas to cover trials involving longitudinal data structures (repeated measures of the putative mediators as well as of clinical outcomes).
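For reference, one common way of writing the potential-outcome definitions of direct and indirect effects is sketched below. The notation is an assumption introduced here for illustration and may differ from that used in Chapter 2: M(z) is the mediator that would be observed under allocation z, and Y(z, m) is the outcome that would be observed under allocation z with the mediator set to m.

```latex
\begin{align}
\text{Total effect:} \quad & E\left[\,Y(1, M(1)) - Y(0, M(0))\,\right] \\
\text{Natural direct effect:} \quad & E\left[\,Y(1, M(0)) - Y(0, M(0))\,\right] \\
\text{Natural indirect effect:} \quad & E\left[\,Y(1, M(1)) - Y(1, M(0))\,\right]
\end{align}
```

With these definitions the total effect is the sum of the natural direct and natural indirect effects, because the term E[Y(1, M(0))] cancels.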
What factors make it work better?
In Chapter 3, we outlined the usual naive approach to evaluating the modifying effects of process measures (correlating their values with clinical outcomes in the treated group with no reference to the control group) and then described modern methods developed from the use of IVs and principal stratification. These methods evaluate the modifying effect of the process variable on the treatment effect, rather than the prognostic effect on the outcome as the naive methods estimate. In Chapter 4 we extended the ideas to cover trials involving longitudinal data structures (repeated measures of the process measures as well as of clinical outcomes).
In Chapter 5, we considered the challenge of trial design in the context of the use of IV methods and principal stratification to answer the questions about treatment effect mechanisms and process measures. These designs included using predictors of outcome as instruments, using moderators of treatment effects to generate instruments, using simple multiarm trials and using data from parallel trials with single or parallel mediators.
Who does it work for?
During the programme of research, we focused increasingly on the question of who treatments work best for and the idea of targeting the right treatment to the right patient at the right time. This concept has been elucidated throughout the report. For example, the use of moderators as instruments essentially involves identifying subgroups of individuals for whom the treatment is thought to be most efficacious, exploiting treatment effect heterogeneity in the data. We considered the role of targeted therapies, multiarm trials and the use of parallel trials to help elucidate the evaluation of mediators working in parallel. We gave particular attention to the role of stratification (based on treatment effect moderators or predictive markers) in the evaluation of treatment effect mechanisms motivating the development of personalised therapies. In Chapter 5, we introduced our new proposed BS-EME trial design as one contribution to the stratified medicine literature, although uniquely with a focus on testing the underpinning mechanism of the stratification.
Examples
We have primarily used psychological or psychosocial treatment trials as our motivating examples throughout the report. We make no apologies for focusing on mental health, because the challenges provided in this area are considerable and in many cases led directly to the methodological research presented. This is especially true for issues of obtaining reliable measurements of mediators (and outcomes), of confounding between the mediator and outcome, and of measurement error in the mediator.
However, it is worth noting that the problems identified with measurement (e.g. of process variables or of potential mediators) are not exclusive to mental health and are both shared and common to other areas of health research. We would argue that most, if not all, self-reported measures of health-related variables (such as pain intensity, diet, sleep, fatigue, etc.), are similarly problematic and even biological markers or clinically observed variables such as blood pressure and cholesterol are not immune to measurement problems. We do not imply that all the problems and challenges identified with trials of complex interventions are associated only with psychological interventions: rather, these are generic problems, and mental health is one discipline that has paid considerable attention to these issues.
Concluding tips for Efficacy and Mechanism Evaluation triallists
In order to demonstrate both efficacy and mechanism, you need to:
1. demonstrate a treatment effect on the primary (clinical) outcome
2. demonstrate a treatment effect on the putative mediator (mechanism).
These two steps are necessary but not sufficient to demonstrate a causal pathway from treatment to mediator to outcome. You might proceed to:
3. Evaluate the correlation between mediator and outcome (possibly conditioning on treatment arm). But beware, the correlation can arise from (1) the effect of mediator on outcome; (2) the effect of outcome on the mediator (perhaps unlikely if the treatment is primarily targeted on the mediator); or (3) a common cause other than treatment (confounding). These sources are not, of course, mutually exclusive. We may, for example, have both (1) and (3). In this case, our aim is to evaluate (1) in the presence of (3).
The common causes of the mediator and outcome may be characteristics of the trial participant prior to treatment (i.e. potential covariates or prognostic markers that could, in principle, be measured prior to randomisation). There could also be common causes (such as comorbidity, life events, etc.) that arise after the onset of treatment. The latter are much more difficult to handle. Instead of simply correlating mediator and outcome, it is better to use a regression model to predict outcome from both the level of the mediator and treatment arm (as in B&K16). This would be preferable to a correlational analysis if the mediator is binary rather than quantitative. The natural extension to this would then be:
4. Regress outcome on mediator and treatment, allowing for all measured baseline covariates that may possibly be of prognostic value (do not worry about the statistical significance of their effects; include them regardless).
You will very rarely, if ever, be in a position to confidently claim that you have allowed for all common causes. Some may be impossible to measure and others you may not even have thought of. Many of the covariates in your regression model will be subject to measurement error. Your model will be an improvement on a simple correlation or linear regression of outcome on mediator and treatment (without covariate adjustment) but it will not lead to a complete elimination of biases.
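The point that covariate adjustment reduces but does not eliminate bias can be illustrated with a small simulation. This is a sketch under assumed, illustrative values: a single measured baseline covariate `x`, a single hidden common cause `u`, linear effects, and a true mediator effect on outcome of 0.5. None of these names or values come from the report.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def fit_mediator_models(n=50_000, seed=1):
    """Steps 1 to 4 on simulated data with one measured and one hidden confounder."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)      # randomised treatment arm
    x = rng.normal(size=n)              # measured baseline covariate
    u = rng.normal(size=n)              # hidden common cause (never used in the fits)
    m = 1.0 * z + 0.8 * x + 0.8 * u + rng.normal(size=n)
    y = 0.5 * m + 0.3 * z + 0.8 * x + 0.8 * u + rng.normal(size=n)
    return {
        "effect_on_outcome": ols(y, z)[1],            # step 1: total effect, 0.5 * 1.0 + 0.3
        "effect_on_mediator": ols(m, z)[1],           # step 2
        "mediator_unadjusted": ols(y, m, z)[1],       # regression on mediator and arm only
        "mediator_adjusted": ols(y, m, z, x)[1],      # step 4: add the measured covariate
    }


for name, value in fit_mediator_models().items():
    print(f"{name}: {value:.2f}")
```

In this set-up the adjusted coefficient for the mediator lies closer to the true value of 0.5 than the unadjusted one, but it remains biased upwards because `u` is not in the model.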
Now is the time to try allowing for unmeasured common causes (hidden confounders) through:
5. An IV model (e.g. using 2SLS). The instrument is assumed to be strongly related to the mediator but statistically independent of outcome conditional on both the mediator and the common causes. Allowing for the measured confounders (baseline covariates) in both stages of the two-stage IV procedure will improve the precision of the causal effect estimates. But beware, convincing instruments are difficult to find.
The key to finding useful and convincing instruments appears to be treatment effect heterogeneity. Here we have access to treatment effect moderators, the effects of which can be observed in terms of their influence on treatment effects on both the proposed mediator and the final outcome (if these effects are not observed in our trial then this approach is not going to be fruitful). But we need, in addition to this, an assumption that the moderation of the treatment effect on outcome is wholly explained by the moderation of the treatment effect on the mediator (treatment effect mechanism). This depends on convincing prior biological or psychological theory (and possibly evidence from earlier experimental investigations) concerning the targeted nature of the intervention and convincing theory justifying the role of the moderator (predictive marker) in the construction of a valid instrument. Careful design is essential. Here we pursue a line of thought that is different to, but fully consistent with, the development of the argument above.
The role of efficacy and mechanism evaluation in the development of personalised therapies (stratified medicine)
We conclude with a series of simple statements102 aimed at encouraging triallists to seriously consider the role of markers that enable the simultaneous evaluation of the utility of a putative predictive biomarker and the treatment effect mechanisms motivating its use.
- Personalised therapy (stratified medicine) and treatment effect mechanisms evaluation are inextricably linked.
- Stratification without corresponding mechanisms evaluation lacks credibility.
- In the almost certain presence of mediator–outcome confounding, mechanisms evaluation is dependent on stratification for its validity.
- Both stratification and treatment effect mediation can be evaluated using a biomarker-stratified trial design together with detailed baseline measurement of all known prognostic biomarkers and other prognostic covariates.
- Direct and indirect (mediated) effects should be estimated through the use of IV methods (the IV being the predictive marker by treatment interaction) together with adjustments for all known prognostic markers (confounders), the latter adjustments contributing to increased precision (as in a conventional analysis of treatment effects) rather than bias reduction.
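The last statement can be sketched in code. The example below simulates a hypothetical marker-stratified trial and applies a hand-rolled 2SLS in which the predictive marker by treatment interaction (`z * w`) serves as the instrument for the mediator. Everything here is an illustrative assumption rather than the report's own analysis: the variable names, the effect sizes, the linear models, and in particular the exclusion restriction that the marker moderates the treatment effect on outcome only through the mediator.

```python
import numpy as np


def two_stage_least_squares(y, endog, exog, instruments):
    """2SLS point estimates: [intercept, effect of endog, effects of exog columns].

    Note: standard errors from the second-stage fit would be wrong, so none
    are computed here; dedicated IV software should be used for inference.
    """
    const = np.ones(len(y))
    # Stage 1: predict the mediator from all exogenous variables plus the instrument.
    first_stage = np.column_stack([const, exog, instruments])
    gamma, *_ = np.linalg.lstsq(first_stage, endog, rcond=None)
    endog_hat = first_stage @ gamma
    # Stage 2: replace the mediator by its stage 1 prediction.
    second_stage = np.column_stack([const, endog_hat, exog])
    beta, *_ = np.linalg.lstsq(second_stage, y, rcond=None)
    return beta


def simulate_stratified_trial(n=100_000, seed=2):
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)   # randomised arm
    w = rng.integers(0, 2, size=n).astype(float)   # binary predictive marker
    x = rng.normal(size=n)                         # measured prognostic covariate
    u = rng.normal(size=n)                         # hidden mediator-outcome confounder
    # The marker moderates the treatment effect on the mediator ...
    m = 0.2 * z + 1.0 * z * w + 0.5 * w + 0.5 * x + 0.8 * u + rng.normal(size=n)
    # ... but has no z*w term in the outcome model (assumed exclusion restriction).
    y = 0.5 * m + 0.3 * z + 0.4 * w + 0.5 * x + 0.8 * u + rng.normal(size=n)
    return z, w, x, m, y


z, w, x, m, y = simulate_stratified_trial()
exog = np.column_stack([z, w, x])
iv_fit = two_stage_least_squares(y, m, exog, z * w)
ols_fit, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(y)), m, exog]), y, rcond=None)

print("OLS mediator effect (biased by u):", round(ols_fit[1], 2))
print("2SLS mediator effect (true 0.5):", round(iv_fit[1], 2))
print("2SLS direct effect of treatment (true 0.3):", round(iv_fit[2], 2))
```

The covariate `x` enters both stages, as the text recommends. Because it is independent of the instrument in this simulation, its role is to improve precision rather than to remove bias.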
Role of therapeutic process evaluation
In many ways, the evaluation of the therapeutic process faces the same conceptual and technical problems as the evaluation of mediation. Both are concerned with the investigation of therapeutic mechanisms. Both mediators and indicators of the therapeutic process are subject to measurement errors and it is very likely that their effects are subject to confounding. Both might be the focus of personalised therapies. The difference is that the process variables are not measured (not defined) in the absence of therapy. One solution, in addition to use of the more familiar IV methods, is to introduce the use of principal stratification. Do not just look at associations between process and outcome in the treated participants: this is flawed logic.
Recommendations for research
We end with some recommendations about future research in this area, building on the methods described in this report. It is important to note that this is a growing field which is developing rapidly, particularly in the context of personalised therapies and causal mediation analysis, and these recommendations do not include every aspect of these fields.
Linking efficacy and mechanism evaluation explicitly
In Chapter 1, we focused on treatment efficacy before moving on to consider treatment effect mechanisms and process variables separately. However, in practice, these are strongly linked. For example, if a client does not attend any sessions of therapy, how would we expect his or her mediator to change as a result of the random allocation alone? If the client does not attend therapy, how can we measure a therapeutic alliance? Conversely, perhaps the therapeutic alliance is low and this leads to poor attendance at therapy? It seems logical that, if our aim is to conduct a thorough explanatory analysis of these trials, we need to consider models with non-compliance and mediation jointly. We plan to explore this by considering compliance and the mechanism as causally ordered mediators in the framework of Daniel et al.105
Design of trials for efficacy and mechanism evaluation and implications for sample size
Are the sample sizes derived from powering a trial for the primary ITT analysis sufficient to permit the use of the methods described in this report? We have not explicitly researched the implications for sample size and so cannot give a definitive answer. For example, we have shown that IV procedures decrease the precision of the estimates relative to standard regression approaches. This is the usual trade-off between bias and precision. Further research could investigate the power to detect mediation effects of a specific magnitude using IV procedures.
In our BS-EME simulations, we noted the role of prognostic markers in increasing the precision of the estimates (i.e. lowering the SEs). It is known from the ITT analysis of randomised trials that including pre-specified prognostic variables in the outcome regression model improves the precision of the treatment effect estimate. What is interesting in the context of mediation is that this gain is with respect to all the parameters of interest. This implies that, given a fixed sample size, the reduction in statistical power for mediation effects from using an IV approach will be lessened by inclusion of prognostic covariates.
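The trade-off described here can be explored with a small Monte Carlo experiment. The sketch below is not a reproduction of the BS-EME simulations. It uses assumed, illustrative effect sizes, a strongly prognostic covariate `x`, and a marker by treatment interaction as the instrument. It compares the empirical spread of three estimators of the mediator effect across repeated simulated trials.

```python
import numpy as np


def lstsq_coefs(design, y):
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def tsls_mediator_effect(y, m, exog, instrument):
    """2SLS estimate of the mediator effect (point estimate only)."""
    const = np.ones(len(y))
    first_stage = np.column_stack([const, exog, instrument])
    m_hat = first_stage @ lstsq_coefs(first_stage, m)
    return lstsq_coefs(np.column_stack([const, m_hat, exog]), y)[1]


def one_trial(rng, n):
    z = rng.integers(0, 2, size=n).astype(float)   # randomised arm
    w = rng.integers(0, 2, size=n).astype(float)   # predictive marker
    x = rng.normal(size=n)                         # strongly prognostic covariate
    u = rng.normal(size=n)                         # hidden confounder
    m = 0.2 * z + 1.0 * z * w + 0.5 * w + 0.5 * x + 0.5 * u + rng.normal(size=n)
    y = 0.5 * m + 0.3 * z + 0.4 * w + 1.5 * x + 0.5 * u + rng.normal(size=n)
    const = np.ones(n)
    ols = lstsq_coefs(np.column_stack([const, m, z, w, x]), y)[1]
    iv_without_x = tsls_mediator_effect(y, m, np.column_stack([z, w]), z * w)
    iv_with_x = tsls_mediator_effect(y, m, np.column_stack([z, w, x]), z * w)
    return ols, iv_without_x, iv_with_x


def monte_carlo(reps=200, n=2000, seed=3):
    """Empirical SDs of (OLS, IV without x, IV with x) over repeated trials."""
    rng = np.random.default_rng(seed)
    estimates = np.array([one_trial(rng, n) for _ in range(reps)])
    return estimates.std(axis=0, ddof=1)


sd_ols, sd_iv_without_x, sd_iv_with_x = monte_carlo()
print("Empirical SD, OLS (biased):", round(sd_ols, 3))
print("Empirical SD, IV without covariate:", round(sd_iv_without_x, 3))
print("Empirical SD, IV with covariate:", round(sd_iv_with_x, 3))
```

Under these assumptions the IV estimates vary more than the (biased) OLS estimates, and adjusting for the prognostic covariate recovers part of the lost precision. Changing `n` or the strength of the interaction in the mediator model gives a rough feel for how sample size and instrument strength affect power.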
Similar issues arise in the analysis of process variables using principal stratification. While the key identifying aspect is the availability of strong baseline predictors of class membership in the treatment group, the sample size must be large enough to accommodate this modelling approach.
We identified a number of approaches to improving the design of EME trials, but the next stage is the adoption of these in applications. Designing the perfect EME trial is difficult, if not impossible, as investigators will always be in a position of having to make unverifiable assumptions. However, while there will always be a possibility of unmeasured confounding, recording as many confounders as possible and allowing for them in the subsequent statistical analyses will always help. Considering the recent work by Imai et al.103,104 on the design of experiments for mediation analysis, being able to apply their designs to controlled clinical trials would be a major step forward.
Measurement of mediators (reliability and measurement error)
As we have highlighted, measurement is the key issue in many areas of clinical research, not just psychology and mental health. Likewise, obtaining reliable, reproducible and valid measures without systematic measurement error can be a challenge, but one which any mechanisms evaluation ultimately relies on. As we described in Chapter 2, Pickles et al.80 proposed a solution to account for measurement error in the mediator by making use of the repeated measures of the mediator in the UK PACT. Valeri et al.106 have proposed an alternative approach when the mediator is continuous for any outcome under a generalised linear model. More research is needed into the impact of measurement error in the mediator and solutions for this problem.
Other forms of outcome variable
Many of the methods we have introduced in this report are based on linear models. On one hand, this is natural, as continuous measurement scales are common in mental health and psychology, and this is the discipline our examples are drawn from. On the other hand, the generalisability of the methods to other clinical areas may therefore be limited. For example, there is little literature on mediation analysis with survival outcomes107 and particularly on IV estimation for survival outcomes. Future research should seek to generalise the methods to alternative forms of outcomes.
Sensitivity analysis
VanderWeele108 and VanderWeele and Arah109 have proposed general bias formulas for sensitivity analysis of unmeasured confounding for direct and indirect effects. Given that we have identified unmeasured confounding as a key limitation in both the more traditional approaches to mechanisms evaluation and some of the more advanced approaches, the application of these and other proposed sensitivity analysis techniques to our clinical examples and motivating questions should be investigated.
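To give a flavour of what such a sensitivity analysis involves, the sketch below uses the standard omitted-variable result for linear regression. It is a simplified linear analogue, not the bias formulas of the cited authors. The two sensitivity parameters are assumptions supplied by the analyst: `gamma`, the effect of an unmeasured confounder U on outcome, and `lam`, the regression slope of U on the mediator given treatment arm. All function names are hypothetical.

```python
import numpy as np


def coefs(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def corrected_mediator_effect(naive, gamma, lam):
    """Naive mediator coefficient minus the omitted-variable bias gamma * lam."""
    return naive - gamma * lam


def sensitivity_grid(naive, gammas, lams):
    """Corrected estimates over a grid of assumed (gamma, lam) values."""
    return np.array([[corrected_mediator_effect(naive, g, l) for l in lams]
                     for g in gammas])


# Simulated data in which U is known, purely to check the algebra.
rng = np.random.default_rng(4)
n = 5000
z = rng.integers(0, 2, size=n).astype(float)
u = rng.normal(size=n)
m = 1.0 * z + 0.8 * u + rng.normal(size=n)
y = 0.5 * m + 0.3 * z + 0.8 * u + rng.normal(size=n)

naive = coefs(y, m, z)[1]          # U omitted, as it would be in practice
full = coefs(y, m, z, u)           # only possible here because U is simulated
lam_true = coefs(u, m, z)[1]       # slope of U on the mediator given arm

print("Naive estimate:", round(naive, 3))
print("Corrected with the true gamma and lam:",
      round(corrected_mediator_effect(naive, full[3], lam_true), 3))
print(sensitivity_grid(naive, gammas=[0.0, 0.4, 0.8], lams=[0.0, 0.2, 0.4]).round(3))
```

In a real trial neither parameter is known, so the grid is read as a statement of how strong hidden confounding would have to be for the estimated mediator effect to shrink to a given value.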