Chapter 3 Clinical effectiveness results

Participant flow

The flow of participants is illustrated in CONSORT flow diagrams in Figures 3–5.

FIGURE 3. Consolidated Standards of Reporting Trials diagram: recruitment to the GYY trial in the pilot phase waves.

FIGURE 5. Consolidated Standards of Reporting Trials diagram: randomisation and follow-up in the GYY trial. (a) Withdrawals and deaths over time are cumulative.

FIGURE 4. Consolidated Standards of Reporting Trials diagram: recruitment to the GYY trial in the main phase waves.

Participants were recruited from 15 GP practices across 6 CRNs: Yorkshire and Humber (2 sites in Harrogate and 1 in Hull); North West Coast (2 sites in Wirral); Kent, Surrey and Sussex (1 site in Kent); Health and Care Research Wales (1 site in Newport); West of England (4 sites in Bristol) and Thames Valley and South Midlands (2 sites in Oxford, 1 in Wantage and 1 in Banbury).

These 15 practices had a total estimated list size of 320,512 patients, of whom 13,070 (4.1%) were sent an invitation pack between July 2019 and August 2021. Some participants invited in the first wave of recruitment were potentially eligible but were not randomised in that wave because sufficient numbers had already been reached to fill the GYY courses; these participants were reinvited and rescreened in the second wave of recruitment (n = 285). A response to the invitation pack was received from 1,297 participants (9.9% of 13,070). Of those who responded, about a quarter declined participation (n = 308, 23.7%) (see Table 3) or withdrew their interest after initially providing consent (n = 22, 1.7%), and 252 (19.4%) were ineligible (see Table 4). A further 261 (20.1%) were not randomised for other reasons, most commonly that sufficient participants had been recruited to fill the GYY courses (n = 243).
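The recruitment figures above can be cross-checked with a short tally. The following is a minimal sketch in Python, not part of the trial analysis; the variable and function names are illustrative assumptions, and only the counts are taken from the text.

```python
# Counts as reported in the paragraph above; names are illustrative only.
invited = 13_070
responded = 1_297
not_randomised = {
    "declined": 308,
    "withdrew interest after consent": 22,
    "ineligible": 252,
    "other reasons": 261,
}


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Everyone who responded and was not excluded for one of the reasons above.
randomised = responded - sum(not_randomised.values())

print(f"response rate: {pct(responded, invited)}%")
for reason, n in not_randomised.items():
    print(f"{reason}: {n} ({pct(n, responded)}%)")
print(f"randomised: {randomised}")
```

The remainder after subtracting the four categories from the 1,297 responders is 454, which agrees with the number randomised reported later in this section.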

TABLE 3

Reason for declining participation in the GYY trial

TABLE 4

Reason for ineligibility in the GYY trial

In total, between 18 October 2019 and 5 October 2021, 454 eligible and consenting participants were randomised: 240 to the intervention and 214 to usual care. Participants were randomised across 19 sites (mean 23.9 per site, SD 4.9, median 24, range 16–35). Seven sites delivered face-to-face GYY courses, and 12 were online. Twelve participants were randomised to intervention for every online course and either 12 or 15 (median 15) for every face-to-face course (see Report Supplementary Material 2).

Baseline characteristics of randomised participants

The mean age of randomised participants was 73.5 years (range 65–99); 60.6% were female, and participants had a median of three chronic conditions (see Table 5). The most commonly reported conditions were cardiovascular diseases (n = 307 participants, 67.6%), defined as at least one of coronary heart disease, hypertension, heart failure or peripheral arterial disease (of which hypertension was the most prevalent), and arthritis (osteoarthritis or rheumatoid arthritis, n = 242, 53.3%) (see Table 6). The intervention and usual care groups were reasonably comparable in terms of baseline characteristics, except that there was a slightly higher proportion of females in the intervention group (64.2%) than in the usual care group (56.5%).

TABLE 5

Baseline characteristics of randomised participants by group

TABLE 6

Self-reported health conditions at baseline by randomised group

Participants were asked at baseline about their expectations and preferences in relation to the health care offered in the GYY trial (see Table 7). Half of the respondents (n = 235, 52.3%) thought that usual care would be fairly or very effective at improving their quality of life, and a slightly higher proportion (n = 277, 61.0%) thought the GYY programme would be fairly or very effective at improving their quality of life. Given the choice, three-quarters (n = 339, 74.7%) said they would prefer to be allocated to the intervention group rather than usual care alone. Most of the rest had no preference (n = 103, 22.7%), and only a small number preferred usual care (n = 12, 2.6%).

TABLE 7

Participant-reported expectations and preferences concerning the health care being offered in the GYY trial, collected at baseline, by randomised group

The baseline values of the primary and secondary outcome measures are summarised in Table 8 and are reasonably well-balanced between groups.

TABLE 8

Values of outcome measures assessed at baseline by randomised group

Withdrawals and follow-up

Participant follow-up was completed in October 2022.

In total, we became aware of seven deaths [2 (0.8%) in the intervention group and 5 (2.3%) in the usual care group]. Six of these occurred within the 12 months from randomisation, and one just beyond this. For one death that occurred within 12 months, we only became aware of the event after their 12-month questionnaire was sent out.

A further 36 participants [14 (5.9%) intervention participants and 22 (10.5%) usual care participants] withdrew from follow-up data collection during the trial; 15 before month 3, 12 between months 3 and 6, 8 between months 6 and 12 and 1 just beyond 12 months (participants contacted the research team upon receipt of their 12-month postal questionnaire to say they were unable to complete the questionnaire due to ill health).

The overall follow-up rates for the 454 randomised participants were 91.0% at month 3, 87.4% at month 6 and 85.9% at month 12 (see Table 9). At all time points, the return rate was slightly higher (by approximately 5 percentage points) in the intervention group than in the usual care group. Median time to completion was 7 days from the due date at both months 3 and 6 and 10 days at month 12 and was similar for the two groups at each time point.

TABLE 9

Return rates for post-randomisation follow-up questionnaires

At the 6-month time point, 70 (17.6%) of the questionnaires were completed over the phone with a researcher rather than on paper and returned by post. This was when COVID-19 restrictions prevented researchers from being in the office to facilitate the mailing and return of postal questionnaires.

Internal pilot phase

The progression criteria for the internal pilot phase were assessed after the last participant recruited in the pilot sites was followed up at 6 months and were graded against the pre-defined traffic-light-style thresholds:

  • Intervention provision

The pilot phase consisted of eight sites; these eight sites held the first intervention session of the GYY course between 13 and 20 days following participant randomisation. Therefore, the ‘green’ threshold was met for this criterion as at least three sites offered the first group yoga session within 3 weeks of participant recruitment.

  • Intervention acceptability

Of the 108 participants randomised to the intervention in the pilot phase, an average of 76 (70.4%) attended each GYY session. Therefore, the ‘amber’ threshold was met for this criterion, as between 65% and 80% of intervention participants were retained in the programme.

  • Recruitment

The eight sites each recruited between 16 and 28 participants. Therefore, the ‘green’ threshold was met for this criterion as at least three sites recruited ≥20 participants.

  • Six-month follow-up

Overall, 148 (85.1%) of the 174 participants recruited during the pilot phase provided valid EQ-5D-5L data at the 6-month follow-up. Therefore, the ‘green’ threshold was met for this criterion as completion rates exceeded 80%.

Although one criterion was graded amber, the rest were green, and so the TSC was satisfied to recommend that the trial continue without the need for any major changes in recruitment or retention processes.

Primary outcome (EQ-5D-5L utility index score) analysis

The EQ-5D-5L utility index score was assessed at baseline and at 3, 6 and 12 months post randomisation. The EQ-5D-5L utility index score is a value between 0 and 1, where a higher score indicates better health. The trial was powered to detect a difference of 0.06 (assuming a SD of 0.20).
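To illustrate what a target difference of 0.06 with an SD of 0.20 implies, the sketch below computes the standardised effect size and a textbook two-group sample size using the normal approximation. The two-sided 5% significance level and the 90% and 80% power values are assumptions made for illustration only (they are not stated in this section), and the calculation ignores attrition, clustering and repeated measures, so it should not be read as the trial's actual sample size calculation.

```python
import math
from statistics import NormalDist


def standardised_effect(difference, sd):
    """Target difference expressed in SD units."""
    return difference / sd


def n_per_group(difference, sd, alpha=0.05, power=0.90):
    """Approximate participants per arm for a two-sided comparison of means.

    Normal-approximation formula; alpha and power are illustrative
    assumptions, not values taken from the trial report.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    d = standardised_effect(difference, sd)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


effect = standardised_effect(0.06, 0.20)
print(f"standardised effect size: {effect:.2f}")
print("per-arm n at 90% power (assumed):", n_per_group(0.06, 0.20, power=0.90))
print("per-arm n at 80% power (assumed):", n_per_group(0.06, 0.20, power=0.80))
```

A difference of 0.06 against an SD of 0.20 corresponds to a standardised effect of 0.3, conventionally regarded as small to moderate.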

Raw scores

Summaries of the EQ-5D-5L utility index score by trial arm and time point are presented in Table 10. At each time point, mean scores are slightly higher in the intervention arm than in the usual care arm. Overall, at baseline, the mean score was 0.739 (SD 0.169) and decreased over time to 0.707 (SD 0.214) at 12 months.

TABLE 10

Summaries of raw EQ-5D-5L utility index score by trial arm and time point

The correlation between baseline EQ-5D-5L utility index score and scores at the follow-up time points is: 3 months 0.72 (95% CI 0.67 to 0.77), 6 months 0.63 (95% CI 0.57 to 0.69) and 12 months 0.59 (95% CI 0.52 to 0.65).

Baseline characteristics of participants included in primary analysis

The primary analysis included participants with a valid EQ-5D-5L utility index score at baseline and at least one post-randomisation time point (n = 422, 93.0%; intervention n = 227, 94.6%; usual care n = 195, 91.1%). The baseline characteristics of these participants are included in Tables 11–14; these are very similar to the randomised population, which indicates that there is little evidence that loss to follow-up has introduced attrition or selection bias.

TABLE 11

Baseline characteristics of randomised participants by group for those included in primary analysis

TABLE 14

Values of outcome measures assessed at baseline by randomised group as analysed

TABLE 12

Grouped conditions self-reported at baseline by randomised group as analysed

TABLE 13

Participant-reported expectations and preferences concerning the health care being offered in the GYY trial, collected at baseline, by randomised group as analysed

Primary end-point analysis

There was no evidence of a statistically or clinically significant difference in EQ-5D-5L utility index score between the intervention and usual care arms over 12 months, with an adjusted MD of 0.02 in favour of the intervention group (95% CI −0.006 to 0.045, p = 0.14). The predicted means and associated 95% CIs over time are presented in Table 15 and displayed in Figure 6, by group.

TABLE 15

Difference in adjusted mean EQ-5D-5L utility index score over time by randomised group from primary and SA models

FIGURE 6. Adjusted mean EQ-5D-5L utility index scores (with 95% CIs) for primary analysis over time by randomised group.

Different covariance structures were applied to the model, and the Akaike information criterion (AIC) values were compared. An unstructured pattern, which models all variances and covariances separately, was used in the final model, as this resulted in the lowest AIC.
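The selection rule described here can be sketched as follows. The log-likelihood values are made-up placeholders, not results from the trial, and the candidate structures other than the unstructured pattern are assumptions chosen for illustration; the report does not list which alternatives were tried. Only the covariance parameters are counted, because the fixed effects are the same in every candidate and so do not change the ranking.

```python
def n_covariance_params(structure, n_times):
    """Free covariance parameters for a few common residual structures."""
    if structure == "compound symmetry":
        return 2  # one variance and one common covariance
    if structure == "AR(1)":
        return 2  # one variance and one autocorrelation
    if structure == "unstructured":
        return n_times * (n_times + 1) // 2  # every variance and covariance
    raise ValueError(f"unknown structure: {structure}")


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


N_TIMES = 3  # follow-up at 3, 6 and 12 months

# Hypothetical log-likelihoods for illustration only (NOT trial results).
hypothetical_loglik = {
    "compound symmetry": 410.0,
    "AR(1)": 415.5,
    "unstructured": 423.0,
}

aics = {
    name: aic(loglik, n_covariance_params(name, N_TIMES))
    for name, loglik in hypothetical_loglik.items()
}
best = min(aics, key=aics.get)
print(aics, "->", best)
```

With three follow-up time points the unstructured pattern estimates six covariance parameters, so it is preferred only when the improvement in log-likelihood outweighs the penalty for the extra parameters.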

Model fit diagnostics indicated that the standardised residuals demonstrated only a minor deviation from normality and were uniform against fitted values; therefore untransformed values were used in analyses.

Model coefficients for the covariates with 95% CIs are provided as software output in Appendix 2 to aid understanding of the fitted model, along with summaries of the EQ-5D-5L index value by trial site and time point to assess variation between sites (see Table 45).

Sensitivity analyses

Adjusting for other covariates (sensitivity 1)

Results were very similar when the primary analysis was repeated with age, gender and adapted Bayliss score additionally adjusted for as fixed effects (see Table 15).

Clustering by yoga teacher (sensitivity 2)

Nineteen yoga courses were delivered within the trial by 12 yoga teachers (1 teacher delivered 3 courses, 5 teachers delivered 2 courses each and 6 teachers delivered 1 course each). Analyses to account for possible clustering by yoga teacher were undertaken by including the intended yoga teacher as a random effect instead of site in the primary analysis model; results were virtually unchanged (see Table 15).

Compliance with random allocation and treatment received

One participant in the usual care group was invited to attend classes in error; they attended eight sessions, including five of the first six sessions.

A summary of attendance at weekly GYY sessions for intervention participants is presented in Table 16. Among the intervention group, 222 (92.5%) participants attended at least 1 GYY class, while 53 (22.1%) attended all 12 (see Figure 7). The mean number of sessions attended among all randomised intervention participants was 8.8 (SD 3.7, median 10) and 9.6 (SD 2.8, median 11) among those who attended at least one. Eighty per cent (n = 192) of participants attended at least six sessions, including at least three of the first six (see Table 17).

TABLE 16

Summary of GYY class attendance by week and recruitment wave

FIGURE 7. Number of sessions attended by GYY intervention group participants.

TABLE 17

Definitions of adherence to GYY intervention by recruitment wave

On average, the first class in a course took place 18.2 days (SD 2.7, median 19, range 13–21) after the participant was randomised, and classes were scheduled a median of 7 days apart (range 7–28; longer intervals tended to be due to the Christmas period) (see Report Supplementary Material 3).

Three CACE analyses for the primary outcome were undertaken to explore the impact of non-compliance on treatment effect estimates, with compliance defined as:

  • Attendance at one yoga session or more (n = 222 intervention participants, 92.5%; n = 1 usual care participant, 0.5%). The CACE estimate of the treatment effect is a difference of 0.025 at 12 months in favour of the intervention group (95% CI −0.002 to 0.052, p = 0.07). This difference is larger than the ITT estimate (see the note following this list), but neither the treatment effect nor the upper 95% CI limit exceeds the clinically meaningful difference of 0.06.
  • Attendance of at least three of the first six sessions and at least three other sessions (n = 192 intervention participants, 80.0%; n = 1 usual care participant, 0.5%). The CACE estimate of the treatment effect is a difference of 0.029 at 12 months in favour of the intervention group (95% CI −0.002 to 0.059, p = 0.06). This difference is larger than the ITT estimate (see the note following this list), but neither the treatment effect nor the upper 95% CI limit exceeds the clinically meaningful difference of 0.06.
  • Number of sessions attended in its continuous form (intervention: mean 8.8, SD 3.7; one usual care participant attended eight sessions). The CACE estimate was a difference of 0.003 per session (95% CI −0.000 to 0.005, p = 0.07), indicating a very small additional benefit of the intervention for each session attended.

Note: The CACE analysis is not directly comparable with the primary ITT analysis, as the CACE analysis cannot take account of the repeated measures for the EQ-5D-5L utility index score at 3, 6 and 12 months; it simply considers the difference at 12 months. Therefore, we conducted a linear regression with 12-month EQ-5D-5L utility index score as the outcome variable, adjusting for baseline score and gender, with robust standard errors to account for clustering within site.
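One way to see why the CACE estimates exceed the ITT estimate, and why the stricter compliance definition gives the larger value, is the simple ratio (Wald) form of the estimator, in which the ITT difference is divided by the between-group difference in the proportion complying. The sketch below is an assumption about the general form of such an estimator, not a description of the model fitted in the trial; it uses only the compliance counts reported in the list above, and the function names are illustrative.

```python
def compliance_gap(compliers_int, n_int, compliers_ctl, n_ctl):
    """Difference between arms in the proportion meeting the compliance definition."""
    return compliers_int / n_int - compliers_ctl / n_ctl


def wald_cace(itt_difference, gap):
    """Ratio-form CACE: the ITT difference scaled up by the compliance gap."""
    return itt_difference / gap


# Compliance counts as reported above (240 intervention, 214 usual care).
gap_any_session = compliance_gap(222, 240, 1, 214)
gap_strict = compliance_gap(192, 240, 1, 214)

# Factor by which a 12-month ITT difference would be inflated under each definition.
inflation_any = 1 / gap_any_session
inflation_strict = 1 / gap_strict
print(f"any session: gap {gap_any_session:.3f}, inflation {inflation_any:.3f}")
print(f"stricter definition: gap {gap_strict:.3f}, inflation {inflation_strict:.3f}")
```

Because fewer participants meet the stricter definition, the compliance gap is smaller and the scaling factor larger, which is consistent in direction with the reported estimates of 0.025 and 0.029.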

Yoga practices

Participant self-reported data on attendance at yoga classes and home yoga practice at 3-, 6- and 12-month follow-ups are presented in Tables 18–20, respectively.

TABLE 18

Self-reported yoga practice at 3-month follow-up

TABLE 20

Self-reported yoga practice at 12-month follow-up

The self-reported data relating to attendance at GYY classes at the 3-month follow-up match well with the attendance register data (see Table 18). All 214 participants in the intervention group who self-reported as having attended a GYY session were recorded as having attended at least one session (8 participants were recorded on the attendance registers but did not return a 3-month questionnaire). The one usual care participant who self-reported as attending was the person we expected. Only a small number of participants in both groups reported attending other (non-trial) yoga sessions. Most of the intervention group reported practising yoga at home, which could include practice undertaken as part of the GYY programme (n = 185, 82.6%), compared with only a small number of usual care participants (n = 6, 3.2%). Where home practice was undertaken, participants in the intervention group did twice as many home yoga sessions as usual care participants (median 4 vs. 2), though these sessions tended to last for a similar length of time (median 15 minutes).

At 6 months, 72 (33.3%) intervention participants reported that they had attended a GYY session in the previous 3 months, all of whom were confirmed to have attended at least one session according to the class registers (see Table 19). It is likely the other attendees had completed their sessions more than 3 months prior to completing this follow-up questionnaire. Three usual care participants reported having attended a GYY session, though none of these were present on the class registers. GYY classes are available to the public through the BWY, so it is possible these participants had sought out and attended a GYY session not delivered as part of the trial, and therefore this would not be captured as part of our evaluation. The proportion of participants reporting home yoga practices decreased relative to month 3 (in accordance with the cessation of the GYY course) in the intervention group (as did the median number of home yoga sessions from 4 to 3) but increased very slightly in the usual care group.

TABLE 19

Self-reported yoga practice at 6-month follow-up

At 12 months, only a fifth of intervention participants reported having attended GYY classes in the previous 6 months (n = 41, 19.3%), as did two usual care participants (again, neither of whom appeared on a class register) (see Table 20). As at 6 months, the proportion reporting home yoga practice decreased in the intervention group relative to the previous follow-up time point (to just less than half) but increased in the usual care group (9.6%). Participants in the intervention group reported doing a median of three home yoga sessions a week lasting a median of 15 minutes, while for the usual care group, this was a median of two sessions a week for 10 minutes.

Intervention fidelity

All yoga teachers submitted a course plan to the yoga consultant for pre-approval ahead of their first class. Each teacher received timely feedback on their plan. The feedback was mostly positive, and all plans met the assessment criteria and were therefore deemed appropriate for delivery.

Each yoga teacher underwent an observation of one of their trial classes by one of the yoga consultants. A fidelity check assessment form was completed for each observation. All yoga teachers passed all aspects of the fidelity check assessment criteria.

Subgroup analyses

Intended mode of delivery of Gentle Years Yoga

More participants were randomised in a site intended for online GYY delivery (61.9% across 12 sites; intervention group n = 144, 60.0%; usual care group n = 137, 64.0%) than face to face (38.1% across 7 sites; intervention group n = 96, 40.0%; usual care group n = 77, 36.0%). A subgroup analysis was conducted in which the primary analysis was repeated, including, as a fixed effect, an indicator for this factor plus an interaction with trial arm. There was no evidence of an interaction between trial arm and intended mode of delivery (interaction effect 0.007, 95% CI −0.042 to 0.057, p = 0.77).

Secondary analysis

EuroQol-5 Dimensions, five-level version utility index scores at the secondary time points

Adjusted EQ-5D-5L utility index score means and group differences from the primary analysis model are presented in Table 22 and displayed in Figure 6. There was no evidence of a statistically significant difference at 3, 6 or 12 months, and none of the CIs for the differences contained the clinically meaningful difference of 0.06.

TABLE 22

Difference in adjusted means over time by randomised group for secondary outcomes

EuroQol-5 Dimensions, five-level version visual analogue scale

Raw EQ-5D-5L VAS scores are summarised in Table 21. Adjusted means and group differences are presented in Table 22. The analysis included data from 423 participants (intervention n = 227, 94.6%; usual care n = 196, 91.6%). There was no evidence of a statistically significant difference at any time point.

TABLE 21

Summary of raw scores for secondary outcomes

Generalised Anxiety Disorder-7

Raw GAD-7 scores are summarised in Table 21. Adjusted means and group differences are presented in Table 22. The analysis included data from 420 participants (intervention n = 227, 94.6%; usual care n = 193, 90.2%). There was no evidence of a statistically significant difference at any time point.

Patient Health Questionnaire-8

Raw PHQ-8 scores are summarised in Table 21. Adjusted means and group differences are presented in Table 22. The analysis included data from 419 participants (intervention n = 227, 94.6%; usual care n = 192, 89.7%). There was no evidence of a statistically significant difference at any time point.

University of California, Los Angeles-3 loneliness

Raw UCLA-3 scores are summarised in Table 21. Adjusted means and group differences are presented in Table 22. The analysis included data from 419 participants (intervention n = 227, 94.6%; usual care n = 192, 89.7%). There was no evidence of a statistically significant difference at any time point.

English Longitudinal Study of Ageing single-item direct loneliness question

Raw ELSA single-item direct loneliness question scores are summarised in Table 21. Adjusted means and group differences are presented in Table 22. The analysis included data from 421 participants (intervention n = 227, 94.6%; usual care n = 194, 90.7%). There was no evidence of a statistically significant difference at any time point.

Patient-Reported Outcomes Measurement Information System-29

Raw PROMIS-29 scores are summarised in Table 23. Adjusted means and group differences are presented in Table 24. These analyses included data from between 419 and 421 participants. There was evidence of a statistically significant difference in the T-score for the pain interference subscale of the PROMIS-29 at 3 months (−1.44, 95% CI −2.63 to −0.26; p = 0.02) and over the 12 months (−1.14, 95% CI −2.24 to −0.04; p = 0.04) and in the global (pain intensity) PROMIS-29 item at 12 months (−0.45, 95% CI −0.83 to −0.08; p = 0.02) and over the 12 months (−0.32, 95% CI −0.61 to −0.04; p = 0.03). Differences favoured the intervention. Otherwise, no statistically significant differences were observed.

TABLE 23

Summary of raw scores for PROMIS-29 secondary outcomes

TABLE 24

Difference in adjusted means over time by randomised group for PROMIS-29 outcomes

Falls

A total of 421 participants responded to the question asking whether they had had a fall in the previous 3 or 6 months on at least one of the post-randomisation questionnaires, of whom 112 (26.6%) said they had [60/227 (26.4%) in the intervention group and 52/194 (26.8%) in usual care].

A mean of 0.82 (SD 2.0, median 0, range 0–21) falls per person was reported over an average of 10.5 (SD 3.6, median 12) months [intervention 0.91 (SD 2.1, median 0, range 0–21) falls over 10.8 (SD 3.2, median 12) months; usual care 0.71 (SD 1.9, median 0, range 0–15) falls over 10.2 (SD 3.9, median 12) months]. There was no evidence of a statistically significant difference in the rate of falls reported over the 12 months of follow-up (incidence rate ratio 1.38, 95% CI 0.95 to 2.01, p = 0.09).
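For orientation, a crude (unadjusted) rate ratio can be derived from the summary statistics above by dividing mean falls per person by mean months of observation in each group. The sketch below is illustrative only: it is not the model used to obtain the reported incidence rate ratio of 1.38, which is an adjusted estimate, so the two values are not expected to coincide. The function and variable names are assumptions.

```python
def falls_rate(mean_falls, mean_months):
    """Crude falls per person-month from group-level summary statistics."""
    return mean_falls / mean_months


# Summary statistics as reported in the paragraph above.
rate_intervention = falls_rate(0.91, 10.8)
rate_usual_care = falls_rate(0.71, 10.2)

# Crude rate ratio; no adjustment for covariates or clustering.
crude_irr = rate_intervention / rate_usual_care

print(f"intervention: {12 * rate_intervention:.2f} falls per person-year")
print(f"usual care:   {12 * rate_usual_care:.2f} falls per person-year")
print(f"crude rate ratio: {crude_irr:.2f}")
```

The crude ratio points in the same direction as the reported model-based estimate (more falls reported in the intervention group), and the wide confidence interval around the adjusted ratio means neither figure provides evidence of a true difference.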

Adverse events

There were no reported serious and related AEs.

There were seven reported non-SAEs that were deemed to be at least possibly related to the intervention and that were expected (see Table 25). These were reported for seven participants, all in the intervention group. The events all related to the onset or aggravation of pain during or after the yoga sessions (back pain n = 3, shoulder n = 1, knee n = 1, knee and shoulder n = 1, thigh n = 1), though none required medical attention beyond taking painkillers. Four of the events were recorded as resolved in their initial report. Of the three that were ongoing, two were subsequently followed up, and the events were deemed to be resolved without the need for further medical intervention. Three of the seven participants subsequently withdrew from the intervention due to the pain, including the two participants for whom the event was deemed definitely related.

TABLE 25

Summary of non-SAEs