Independent economic assessment: conventional transarterial therapy-ineligible population

Matthew Walton; Ros Wade; Lindsay Claxton; Sahar Sharif-Hurst; Melissa Harden; Jai Patel; Ian Rowe; Robert Hodgson; Alison Eastwood

Walton M, Wade R, Claxton L, et al. Selective internal radiation therapies for unresectable early-, intermediate- or advanced-stage hepatocellular carcinoma: systematic review, network meta-analysis and economic evaluation. Southampton (UK): NIHR Journals Library; 2020 Sep. (Health Technology Assessment, No. 24.48.)

Chapter 7 Independent economic assessment: conventional transarterial therapy-ineligible population

A summary of the key features of the AG economic analysis for the CTT-ineligible population is presented in Table 26. The population covered by the AG base-case analysis is Child–Pugh class A patients, who are ineligible or who have failed CTT. Scenario analysis considers two further subgroups: (1) patients who have a low tumour burden and are ALBI 1 and (2) patients with MVI.

TABLE 26

Summary of key features of the AG base-case model

It should be noted that these analyses are limited in that they do not include all patients who are ineligible to receive or have failed CTT, as they do not cover Child–Pugh class B patients ineligible for CTT. These patients would be ineligible to receive systemic therapy, as they are not covered by the relevant NICE recommendations, and in practice would therefore receive BSC. The clinical evidence available comparing SIRT with BSC in an advanced-HCC population is, however, very limited, and as such it is not possible to extend the economic analysis to cover this population.

The interventions considered in the AG analysis were the three SIRTs (QuiremSpheres, SIR-Spheres and TheraSphere) and the comparators were the systemic therapies sorafenib and lenvatinib. Regorafenib was not included as a comparator in the AG’s analysis as the NICE recommendation and SmPC for regorafenib in HCC permits use only in patients who have previously failed sorafenib therapy. Patients in the AG model are, however, permitted to move on to regorafenib following discontinuation of sorafenib.

In all analyses, cost-effectiveness is evaluated in terms of the incremental cost per QALY gained over a lifetime time horizon from an NHS and PSS perspective. In line with the NICE reference case,¹⁴² costs and health benefits were discounted at a rate of 3.5% per annum. Costs in the model were based on the 2017/18 price year.

Model structure

The structure of the AG model is presented in Figure 14. The AG model consists of a three-state partitioned survival model and decision tree for those intended to receive SIRT. Also presented is the structure of the downstaging scenario (see dashed lines), for which the outcomes of patients successfully downstaged to receive curative therapy are modelled separately. In the AG model, those allocated to receive SIRT enter a decision tree representing the work-up procedure. A proportion of these patients go on to receive SIRT following work-up, whereas others are not considered suitable for SIRT or otherwise withdraw consent, so can go on to receive either BSC or a systemic therapy. In the AG base case, patients then move into the main partitioned survival model.

FIGURE 14

Overview of the CTT-ineligible population AG model structure (with dashed curative therapy scenario). (a) Work-up outcome decision tree; and (b) post-SIRT Markov model.

The proportion of patients who receive SIRT following work-up in the AG base case is based on the SARAH trial,¹⁹ from which efficacy outcomes for these patients are drawn. Of the 226 patients who underwent work-up, 42 (18.6%) did not receive SIRT. Two further scenarios are presented in Scenario analyses, which explore the effect of using the lower and upper bounds of work-up ‘failure’ identified in the literature (5%¹⁴³ to 28.6%²¹).

The model uses a lifetime (10-year) time horizon (< 0.1% of patients alive at 10 years in the most optimistic scenario), and takes an NHS and PSS perspective. Costs and health outcomes are discounted at a rate of 3.5% per annum, with cost-effectiveness expressed in terms of the incremental cost per QALY gained and incremental net monetary benefit (NMB). Costs were valued at 2017/18 prices.
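As a rough illustration of the discounting and NMB calculations described above, the following Python sketch applies a 3.5% annual discount rate and computes an incremental NMB. The function names, the willingness-to-pay threshold and the example inputs are assumptions made for illustration only; they are not taken from the AG model.

```python
def discount_factor(years: float, annual_rate: float = 0.035) -> float:
    """Present-value weight for an outcome accrued `years` after model entry."""
    return 1.0 / (1.0 + annual_rate) ** years


def incremental_nmb(delta_qalys: float, delta_costs: float, threshold: float) -> float:
    """Incremental net monetary benefit at a given cost-per-QALY threshold."""
    return delta_qalys * threshold - delta_costs


if __name__ == "__main__":
    # Hypothetical inputs, not AG results.
    print(discount_factor(10))
    print(incremental_nmb(delta_qalys=0.1, delta_costs=2000.0, threshold=30000.0))
```

A positive incremental NMB indicates that, at the chosen threshold, the value of the QALYs gained exceeds the additional cost.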

As shown in Figure 14, the structure of the partitioned survival model is broadly similar to that adopted within both the BTG and Sirtex models (see Chapter 5, Review of economic evidence submitted by companies), consisting of three health states: (1) progression free, (2) post progression and (3) dead. For any time, t, the probability that a patient is alive and progression free is given by the cumulative survival probability for PFS, whereas the probability that a patient is alive is given by the cumulative survival probability for OS. The probability that a patient is in the post-progression state at any time, t, is given by the difference between the cumulative survival probabilities for PFS and OS. Health and cost outcomes from the partitioned survival models for each intervention were multiplied by the proportion of patients who received each within the particular treatment arm as per the decision tree.
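The state-occupancy logic of a three-state partitioned survival model can be sketched as follows. This is a minimal illustration assuming Weibull survival functions with hypothetical shape and scale parameters; the parameter values and function names are not those of the AG model.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def state_occupancy(t: float, pfs_params: tuple, os_params: tuple) -> dict:
    """Proportion of the cohort in each health state at time t."""
    s_os = weibull_survival(t, *os_params)
    # Cap PFS at OS so that post-progression occupancy is never negative.
    s_pfs = min(weibull_survival(t, *pfs_params), s_os)
    return {
        "progression_free": s_pfs,
        "post_progression": s_os - s_pfs,
        "dead": 1.0 - s_os,
    }


if __name__ == "__main__":
    # Hypothetical (shape, scale) pairs with time in months.
    for month in (0, 6, 12, 24):
        print(month, state_occupancy(month, (1.2, 6.0), (1.3, 14.0)))
```

Because the three proportions always sum to 1, per-cycle costs and QALYs can be accumulated by weighting state-specific values by these proportions.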

As with the Sirtex model, HRQoL is defined according to the presence or absence of disease progression as well as treatment received. The model includes costs associated with SIRT procedures (work-up costs, acquisition costs and procedure costs), drug acquisition, health-state costs (consultant-led outpatient visits, nurse-led outpatient visits, electrocardiography, blood tests and CT scans), costs associated with managing grade 3 or 4 AEs, BSC-related costs (consultant-led outpatient visits, CT scans, MRI scans, specialist palliative care visits and palliative radiotherapy) and end-of-life care costs.

Model input parameters

A summary of the data sources used to populate the AG’s base-case model is presented in Table 27. These are discussed in greater depth over the following sections.

TABLE 27

Summary of sources of input parameters in the AG base-case economic model

Treatment effectiveness

The base-case analysis used data from the SARAH,¹⁹ SIRveNIB²¹ and REFLECT trials.⁸¹ Scenario analyses also drew on a number of observational comparisons of SIR-Spheres and TheraSphere (see Chapter 4, Network 3: adults with unresectable hepatocellular carcinoma who are ineligible for conventional transarterial therapies, for details).

The comparison of SIR-Spheres with sorafenib was based on pooled data from the SARAH and SIRveNIB trials. Modelled data from SARAH were supplied by Sirtex for both PFS and OS, and data for SIRveNIB were extracted from the published literature.

The source of modelled survival data from the SARAH and SIRveNIB trials differed according to therapy received. For patients receiving sorafenib, OS and PFS outcomes were based on the ITT populations (sorafenib, n = 400), whereas OS and PFS outcomes for patients receiving SIR-Spheres are modelled based on the per-protocol population of each trial (SIR-Spheres, n = 304). This is done to account for the proportion of patients who fail the SIRT work-up procedure, and subsequently do not undergo the main SIRT procedure. The outcomes of patients who fail the work-up procedure are modelled independently, and are based on near-complete KM data from the SARAH trial (work-up failures, n = 42). The proportion of patients failing the work-up procedure is based on the SARAH trial. The DSA included a range of estimates for work-up failure, based on the number of work-up failures reported in SARAH and SIRveNIB and other estimates provided by Sirtex. To avoid the double-counting of patients who are downstaged to receive curative therapies, the data included from SARAH, for both SIR-Spheres and sorafenib, are censored for downstaging. There was no downstaging reported in the SIRveNIB trial publication²¹ and no patients received subsequent therapies that could be considered ‘curative’, so it was assumed that no patients were downstaged to receive curative therapies in these data.

The comparative effectiveness of lenvatinib was drawn from the NMA presented in Chapter 4, Results. The HR for lenvatinib versus sorafenib was applied to the Weibull curve fitted to the sorafenib data drawn from the SARAH and SIRveNIB trials. Proportional hazards is, therefore, assumed between sorafenib and lenvatinib.
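Under proportional hazards, applying a HR to a baseline survival curve amounts to raising the baseline survival probability to the power of the HR. The sketch below illustrates this for a Weibull baseline; the shape, scale and HR values are hypothetical and are not the estimates used by the AG.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def apply_hazard_ratio(baseline_survival: float, hazard_ratio: float) -> float:
    """Comparator survival under proportional hazards: S1(t) = S0(t) ** HR."""
    return baseline_survival ** hazard_ratio


def rescaled_weibull_scale(scale: float, shape: float, hazard_ratio: float) -> float:
    """Scale of the Weibull curve that is equivalent to applying the HR."""
    return scale * hazard_ratio ** (-1.0 / shape)


if __name__ == "__main__":
    shape, scale, hr = 1.3, 14.0, 0.9  # hypothetical values
    for month in (6, 12, 24):
        baseline = weibull_survival(month, shape, scale)
        print(month, baseline, apply_hazard_ratio(baseline, hr))
```

The Weibull function is closed under this operation (the result is another Weibull curve with the same shape), which is why it accommodates a HR in a way that accelerated failure time models such as the log-normal do not.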

In the AG’s base-case analysis, equivalence is assumed between the SIRTs owing to a lack of randomised evidence on the relative effectiveness of each SIRT. An exploratory scenario analysis is also presented in which the effectiveness of TheraSphere was based on two non-randomised comparative studies³⁹,⁴⁰ (SIR-Spheres, n = 34; TheraSphere, n = 78), with a HR versus SIR-Spheres drawn from the NMA. In this scenario, the HR is applied to the modelled parametric functions fitted to the pooled SIR-Spheres data and, therefore, proportional hazards is assumed for this comparison (see Extrapolation of progression-free survival and overall survival evidence for consideration of the plausibility of this assumption).

In addition to the base-case analysis in which the modelled population was based on pooled analysis of the SARAH and SIRveNIB trials, additional scenario analysis was implemented in a number of alternative populations. To account for uncertainties in the relevance of the Asia-Pacific population to UK practice, a scenario was implemented using data only from the SARAH trial. Two further subgroup analyses based on the SARAH trial were also considered: the restricted low-tumour burden and ALBI 1 subgroup (SIR-Spheres, n = 28; sorafenib, n = 44), and patients with MVI (SIR-Spheres, n = 64; sorafenib, n = 81). In both subgroup analyses, the comparison between SIR-Spheres and sorafenib is made using data drawn from the relevant subgroup of the SARAH trial only. Appropriate IPD were requested by the AG for these subgroups of the SIRveNIB trial but Sirtex had only limited access to the IPD from the SIRveNIB trial and did not have subgroup data from all enrolling centres. Subgroup data were not available to support the comparative effectiveness of lenvatinib and TheraSphere. This scenario, therefore, uses only data for SIR-Spheres and sorafenib, assuming equivalent efficacy across SIRTs and between lenvatinib and sorafenib.

Extrapolation of overall survival and progression-free survival evidence

For each data set, model selection was conducted in line with the process described in the NICE DSU Technical Support Document 14.¹¹⁷ To assess the appropriateness of alternative parametric models, log-cumulative hazard plots were produced to illustrate and assess the hazards observed in the trial. Curve fitting was conducted using the ‘survival’ and ‘flexsurv’ packages in R (The R Foundation for Statistical Computing, Vienna, Austria). Exponential, Weibull, Gompertz, log-normal, log-logistic, gamma and generalised gamma models were considered.

The AIC and BIC fit statistics were examined to assess the comparative internal validity of competing models. The final choice of models for the economic analysis was made on the basis of fit to the observed data as well as consideration of the clinical plausibility of candidate models.
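For reference, the two fit statistics can be computed from a fitted model's maximised log-likelihood as in the sketch below. The log-likelihoods, parameter counts and sample size in the example are hypothetical and do not correspond to the values in Appendix 16; whether the sample size in the BIC is taken as the number of patients or the number of events is a modelling choice not specified here.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


if __name__ == "__main__":
    # Hypothetical fits: model name -> (log-likelihood, number of parameters).
    fits = {"exponential": (-520.4, 1), "weibull": (-515.2, 2), "gen_gamma": (-512.9, 3)}
    n_obs = 300  # hypothetical sample size
    for name, (loglik, k) in sorted(fits.items(), key=lambda kv: aic(*kv[1])):
        print(name, round(aic(loglik, k), 1), round(bic(loglik, k, n_obs), 1))
```

Lower values indicate better fit after penalising for the number of parameters; the BIC penalises additional parameters more heavily than the AIC once the sample size exceeds about eight observations.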

Overall survival

The analysis of OS for the base-case analysis was based on time-to-event data from the SARAH trial supplied by Sirtex, and KM curves from the SIRveNIB trial.²¹ Pooled KM curves for the base-case population are presented in Appendix 16, Figures 26 and 27. Survival estimates can be found in Appendix 16, Table 71.

Standard parametric survival functions were fitted to the survival data available for each of the considered populations, and log-cumulative hazard plots were generated to assess any changes in hazards over time (see Appendix 16, Figure 28). Plots of each of the fitted parametric models with the observed KM OS curves are presented in Figures 15 (SIR-Spheres) and 16 (sorafenib). Model fit statistics are summarised in Appendix 16, Table 72, which showed that the generalised gamma model had the best fit, with the log-normal and log-logistic curves also having similar statistical fit, thereby providing little justification to discriminate between these models on the basis of fit statistics. The generalised gamma, log-normal and log-logistic models are, however, all accelerated failure time models and, as such, a HR cannot be applied to estimate outcomes for lenvatinib patients, and would likewise not permit scenarios in which differential outcomes are assumed for TheraSphere, which would similarly require the application of a HR. To accommodate the use of HRs, the AG base-case analysis, therefore, selected the Weibull function, which has the best statistical fit from the remaining curves, and was considered the most clinically plausible. The AG considered this reasonable given the limited data to accommodate accelerated failure time functions and the small variation in predicted incremental survival across all six functions, but acknowledges this as a limitation of the presented base-case analysis. Scenario analysis is, therefore, presented, in which the generalised gamma, log-normal and log-logistic functions are used to model OS. In these scenarios, equivalence is assumed between sorafenib and lenvatinib.

FIGURE 15

Extrapolation of OS: SIR-Spheres.

FIGURE 16

Extrapolation of OS: sorafenib.

For scenarios run on the SARAH trial¹⁹ subpopulations described previously, the Weibull function was retained to model OS outcomes. Fit statistics for the SARAH trial whole population, low tumour burden/ALBI 1 subgroup and no-MVI subgroup are reported in Appendix 16, Table 74. Plots of each of the fitted parametric models with the observed KM OS curves are presented in Appendix 16, Figures 30 and 31 (SIR-Spheres) and Figures 32 and 33 (sorafenib). In all three scenarios, the Weibull function had a good statistical and visual fit to the observed data.

Progression-free survival

The analysis of PFS for the base-case analysis was based on supplied time-to-event data from the SARAH trial¹⁹ and KM curves from the SIRveNIB trial.²¹

Similar to the approach previously described for OS, standard parametric survival functions were fitted to the survival data available for each of the considered populations (Figures 17 and 18), and log-cumulative hazard plots generated to consider the change in hazards over time (see Appendix 16, Figure 29). Plots of each of the fitted parametric models with the observed KM PFS curves are presented in Appendix 16, Figures 34 and 35 (SIR-Spheres) and Figures 36 and 37 (sorafenib). Similar to OS, model fit statistics for the generalised gamma, log-normal and log-logistic functions were superior to other functions (see Appendix 16, Table 73). These functions were, however, rejected to accommodate the application of a HR for lenvatinib and the implementation of scenarios assuming differential effectiveness for TheraSphere. The Weibull function was, therefore, selected in the AG base-case analysis as this had the best statistical and visual fit to the observed data and was considered clinically plausible.

FIGURE 17

Extrapolation of PFS: SIR-Spheres.

FIGURE 18

Extrapolation of PFS: sorafenib.

Overall survival for patients downstaged to curative therapy

The base-case analysis does not allow for downstaging to curative therapies, owing to uncertainties over whether or not this is realistic in a population of patients with advanced disease. A number of scenarios are presented in which downstaging is allowed for. The proportion of patients downstaged is based on the values reported in the SARAH trial¹⁹ and varied depending on the efficacy subgroup used (see Appendix 16, Table 69). Outcomes for patients downstaged to curative therapy were based on a US prospective cohort study,¹¹² which recruited 267 patients with HCC, including 191 with intermediate and advanced disease. This study compared outcomes for patients who had received palliative care with those who received potentially curative therapies (liver transplantation, surgical resection or tumour ablation). Using Cox multivariate proportional hazards, the HR for OS with potentially curative treatments versus non-curative treatment was 0.29 (95% CI 0.18 to 0.47). This HR was applied to the pooled sorafenib ITT arms of the SARAH and SIRveNIB trials in all scenarios. This was carried out to prevent the outcomes of downstaged patients varying depending on the patient population selected or by treatment arm; advice from clinical advisors to the AG suggested that outcomes post-curative therapy would be similar regardless of patient characteristics or treatment received to achieve downstaging. The sorafenib ITT arm was used as this was considered to best match care received in the analysed patient cohort, and is most representative of the current standard of care in UK practice.

Adverse event rates

The probability of experiencing grade 3 or 4 AEs for SIR-Spheres and sorafenib was taken directly from the per-protocol population of the SARAH trial.¹⁹ Based on clinical advice received by the AG, AE rates for TheraSphere and QuiremSpheres were assumed to be the same as for SIR-Spheres. AE rates for lenvatinib were drawn from the REFLECT trial.⁸¹ See Appendix 16, Table 70, for rates applied.

Health-related quality of life

Literature review and mapping of health-related quality-of-life estimates

A targeted review of published studies reporting utility estimates for patients with HCC or cirrhosis was undertaken to supplement data extracted from studies on SIRT and its comparators. Details of the search strategy used are described in Appendix 3. The objective of these searches was to identify health state utilities of patient populations that may not have been captured in studies included in the main systematic reviews. The required utilities included:

decompensated cirrhosis (any cause)
post-CTT disutility
post-resection disutility
pre- and post-transplant utilities.

The identified studies recorded HRQoL using a number of tools, namely Short Form questionnaire-36 items (SF-36) and EORTC QLQ-C30. NICE prefers the use of generic preference-based measures (i.e. EQ-5D) for the calculation of health state utilities. Therefore, mapping algorithms typically based on multinomial regression model coefficients can be used to transform disease-specific measures of health status into an EQ-5D-based utility score. Domain scores for relevant populations were mapped onto EQ-5D using the two-part beta model as developed by Woodcock and Doble¹⁴⁴ for EORTC QLQ-C30 scores, and a model developed by Rowen et al.¹⁴⁵ was used to transform SF-36 outcomes.

Modelled health state utilities

The AG’s base-case model for CTT-ineligible patients applies different health state utilities based on the type of therapy received to reflect any differences in their respective AE burdens. Because utilities were drawn from patients in the SARAH trial, disutilities associated with type and length of any AEs were assumed to have been captured, and thus were not considered separately. In the absence of any evidence suggestive of a difference in HRQoL between the three SIRTs, the AG has assumed that patients experience the same quality of life regardless of whether they received SIR-Spheres, TheraSphere or QuiremSpheres. Likewise, the HRQoL estimates associated with the systemic therapies, namely sorafenib and lenvatinib, are assumed to be the same as one another, but marginally lower than those applied to SIRT, as observed in the SARAH trial¹⁹ (see Table 28). An additional scenario in which health state utilities from the lenvatinib technology appraisal are applied is presented in Scenario analyses.

TABLE 28

Health state utilities included in the AG CTT-ineligible model

Age-related disutilities

Age-adjusted UK population norms from Szende et al.¹⁴⁶ were applied to the utility values included in the model. Age-related decrements were estimated in the form of a multiplier, with decrements applied relative to the populations on entering the model. This allows for the trial-derived utilities applied in the model to account for age-related decline in HRQoL as the population ages over time.

Selective internal radiation therapy health state utilities

The health state utilities associated with SIRT in the CTT-ineligible model were based on the per-protocol subgroup of the SARAH trial as calculated by Sirtex in its evidence submission (see Chapter 5, Evidence used to inform the company’s model, for details). EORTC QLQ-C30 summary scores were mapped to EQ-5D using the algorithm developed by Longworth et al.,¹¹³ and utilities were calculated based on UK general population weights.

The per-protocol utilities were considered to better reflect the HRQoL associated with SIRT than those derived from the ITT population, as 22.4% of patients randomised to SIRT did not receive SIRT in the SARAH trial. These patients may have received other systemic therapies or BSC, or were otherwise too unwell to receive SIRT; thus, the ITT utility values may not have represented those of a SIRT-treated population. There were no further utility decrements applied to these utilities as these are likely to have been captured in the SARAH trial results. The health state utilities applied in the model are presented in Table 28.

Systemic therapy health state utilities

Health state utilities applied to modelled patients receiving the systemic therapies sorafenib and lenvatinib were taken from the per-protocol subgroup of sorafenib patients in the SARAH trial.¹⁹ The difference in utility between SIRT and sorafenib in this subgroup was 0.011, which the AG considered to account sufficiently for the ostensibly greater burden of AEs associated with these drugs. Utilities applied to patients who received work-up but ultimately did not receive SIRT were weighted by the proportion on systemic therapy versus BSC (61.9% and 38.1%, respectively). This assumes that patients not on systemic therapy had a utility equivalent to those on SIRT, which may overestimate the HRQoL of BSC patients, as a proportion were likely to have been too unwell to receive systemic therapy.
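The weighting applied to work-up failures can be illustrated with a short sketch. The 61.9%/38.1% split and the 0.011 utility difference are taken from the text above; the SIRT utility value used here is a hypothetical placeholder, as the values in Table 28 are not reproduced in this section.

```python
SHARE_SYSTEMIC = 0.619     # work-up failures who received systemic therapy
SHARE_BSC = 0.381          # work-up failures who received BSC
UTILITY_DIFFERENCE = 0.011 # SIRT minus sorafenib (per-protocol, SARAH)


def workup_failure_utility(sirt_utility: float) -> float:
    """Weighted utility for patients who had work-up but did not receive SIRT.

    Assumes, as described in the text, that patients not on systemic therapy
    have the same utility as SIRT patients.
    """
    systemic_utility = sirt_utility - UTILITY_DIFFERENCE
    return SHARE_SYSTEMIC * systemic_utility + SHARE_BSC * sirt_utility


if __name__ == "__main__":
    hypothetical_sirt_utility = 0.75  # placeholder, not the Table 28 value
    print(workup_failure_utility(hypothetical_sirt_utility))
```

Under this assumption the weighted utility sits only 0.619 × 0.011 (about 0.007) below the SIRT utility, which shows how little the BSC group contributes to any difference.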

Post-transplant health state utilities

The AG scenarios 6 and 10 include the possibility for downstaging; therefore, post-transplant utilities were considered for use in the model. Pre-transplant health state utilities are assumed to be equal to those experienced in pre-progression for SIRT, systemic therapies and BSC. Post-transplant health state utilities are assumed to be equal to those experienced on SIRT, regardless of which treatment a patient received before downstaging to transplant. However, it is likely that patients who received a transplant may have a better HRQoL than the per-protocol population of the SARAH trial.

Despite multiple studies showing that recipients of liver transplant enjoy increased HRQoL post transplant in comparison with pre transplant,¹¹³,¹⁴⁷–¹⁴⁹ a lack of generalisability between these studies and the population included in the model renders the absolute utility values reported in the literature too uncertain for inclusion. Studies also show that HRQoL remains lower for liver transplant recipients than for healthy patient controls.¹⁵⁰–¹⁵² However, as with the pre- and post-transplant utilities, there is insufficient evidence to suggest that these studies are generalisable to the modelled population. Given the lack of evidence to definitively suggest that utility values in the post-transplant HCC population are lower than in the general population, the AG believes that the utility values observed in the general population represent the upper bound of the utility expected in the post-transplant population.

Sources of resource utilisation and cost data

A targeted review of published studies reporting resource use and cost data for patients with HCC or cirrhosis was undertaken. Details of the search strategy used are described in Appendix 4. This review, however, identified little in the way of published literature. Resource use and cost inputs used in the AG’s economic model were, therefore, derived primarily from targeted literature searches, previous NICE technology appraisals and the estimates presented in the companies’ evidence submissions for the present appraisal. Overall costs are determined by treatment costs (acquisition, procedures and monitoring), changes in health service utilisation driven by disease status (i.e. progression free, progressed disease and death) and AE management. The assumptions applied to each category are discussed in the following sections. Note that confidential Patient Access Scheme (PAS) discounts are available but not included here for QuiremScout, sorafenib, lenvatinib and regorafenib. Please refer to Appendix 17 for results including all PAS discounts. A summary of the AG model cost inputs is presented in Summary of Assessment Group base-case analysis inputs and assumptions.

Treatment costs and resource use

Work-up costs and number of procedures

Patients allocated to receive SIRT must first undergo a work-up procedure to assess their suitability for treatment with SIRT, and to plan the procedure through angiographic evaluation and occlusion of any vessels that could carry microspheres away from the liver to the gut. Although work-up is a one-off procedure, those patients who required a second SIRT procedure owing to an unsuccessful or incomplete first procedure are likely to need a second work-up.

In the SARAH trial,¹⁹ 17 of the 184 patients who received SIRT required re-treatment owing to an unsuccessful or incomplete first procedure (nine received a second work-up but were not re-treated). Therefore, patients who received any of the SIRTs incurred the cost of 1.09 work-up procedures to account for re-treatment. As the model independently considered the costs and outcomes for patients who underwent work-up but ultimately did not receive SIRT, these individuals were assumed to receive 1.0 work-up procedures. The AG’s base case assumed that 18.6% of patients who underwent work-up did not go on to receive SIRT in line with the SARAH trial¹⁹ data. However, in recognition of the uncertainty around this value, a number of alternative scenarios are presented in Sensitivity analyses results.
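The work-up figures quoted above can be checked with simple arithmetic. The sketch below assumes that the 1.09 work-up procedures correspond to one initial work-up plus one additional work-up for each of the 17 re-treated patients; that reading is an interpretation of the text rather than a documented formula from the AG model.

```python
WORKUP_PATIENTS = 226   # patients who underwent work-up in SARAH
WORKUP_NO_SIRT = 42     # of whom did not go on to receive SIRT
SIRT_RECIPIENTS = 184   # patients who received SIRT
RETREATED = 17          # of whom required re-treatment


def workup_failure_rate() -> float:
    """Proportion of worked-up patients who did not receive SIRT."""
    return WORKUP_NO_SIRT / WORKUP_PATIENTS


def mean_workups_per_sirt_patient() -> float:
    """One work-up per patient plus a second for each re-treated patient (assumed)."""
    return 1.0 + RETREATED / SIRT_RECIPIENTS


if __name__ == "__main__":
    print(f"Work-up failure rate: {workup_failure_rate():.1%}")
    print(f"Mean work-ups per SIRT patient: {mean_workups_per_sirt_patient():.2f}")
```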

Work-up costs used in the AG base case were based on the values BTG elicited from The Christie NHS Foundation Trust (see Appendix 15, Table 60). The largest expenditures were staff costs and SPECT/CT. The total cost of a single work-up procedure for SIR-Spheres and TheraSphere used in the AG model was £860.32, and the work-up cost of £5178.32 for QuiremSpheres comprised the list price of QuiremScout and the BTG-elicited value excluding the £74 cost of the technetium-99m MAA agent. This does not include the PAS discount available for QuiremScout.

Selective internal radiation therapy treatment costs and number of procedures

Patients in the AG model received an average of 1.21 SIRT procedures. This is based on the assumption that patients requiring bilobar treatment will require two separate SIRT procedures, separated by a few weeks (as per the SARAH protocol¹⁵³), and that patients will be re-treated owing to an incomplete or unsuccessful first treatment. The clinical advisors to the AG stated that it would be very unlikely that both lobes would be treated in the same treatment session in UK practice owing to an increased risk of REILD. SIRT patients in the SARAH study¹⁹ had 1.28 separate SIRT treatments on average [222 treatments, 173 patients (one or two treatments only)]. This broadly reflects the results of the Sirtex resource use survey (1.2 treatments per patient). This value excludes the 11 patients who had three separate SIRT treatments, and includes only one procedure for the nine patients who received a second treatment owing to disease progression, as it was unclear whether or not this would be permitted in UK practice.

The acquisition cost of a single SIRT treatment was taken from each company submission: SIR-Spheres, £8000; TheraSphere, £8000; and QuiremSpheres, £9896.

The cost of the SIRT procedure applied in the AG model was taken from National Schedule of Reference Costs 2017–2018¹⁰⁷ (YR57Z). The average cost of ‘Percutaneous, Chemoembolisation, or Radioembolisation, of Lesion of Liver’ was £2790. This cost was incurred for each separate SIRT administration for patients receiving TheraSphere and QuiremSpheres in the AG model. The Sirtex company submission¹⁰² stated that SIR-Spheres administration procedures use intermittent contrast-medium injection to assess the distribution of the microspheres under radiography over the course of approximately 1 hour. The AG, therefore, included an additional cost of £209 for the SIR-Spheres administration procedure (RD32Z – Contrast Fluoroscopy Procedures with duration of more than 40 minutes), for a total of £2999.
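The following sketch combines the acquisition and procedure costs reported above into a cost per administration for each SIRT. Multiplying by the mean of 1.21 procedures, and charging the acquisition cost at every administration, are simplifying assumptions made for this illustration; the AG model may combine these inputs differently.

```python
PROCEDURE_COST = 2790.00             # YR57Z reference cost
CONTRAST_FLUOROSCOPY_COST = 209.00   # RD32Z, added for SIR-Spheres only
ACQUISITION_COST = {
    "SIR-Spheres": 8000.00,
    "TheraSphere": 8000.00,
    "QuiremSpheres": 9896.00,
}
MEAN_PROCEDURES = 1.21               # average SIRT procedures per patient


def cost_per_administration(device: str) -> float:
    """Acquisition plus procedure cost for one SIRT administration."""
    extra = CONTRAST_FLUOROSCOPY_COST if device == "SIR-Spheres" else 0.0
    return ACQUISITION_COST[device] + PROCEDURE_COST + extra


def expected_sirt_cost(device: str, mean_procedures: float = MEAN_PROCEDURES) -> float:
    """Expected per-patient cost, assuming every cost recurs per procedure."""
    return mean_procedures * cost_per_administration(device)


if __name__ == "__main__":
    for device in ACQUISITION_COST:
        print(device, cost_per_administration(device), round(expected_sirt_cost(device), 2))
```

Work-up costs are deliberately excluded here because the model applies them separately, including to patients who never proceed to SIRT.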

Costs of systemic therapies

The pack costs for sorafenib (£3576.56), lenvatinib (£1437.00) and regorafenib (£3744.00) were taken from the BNF.¹¹⁵ The confidential PAS discounts available for sorafenib, lenvatinib and regorafenib are not included in this report. For results of the AG’s economic analysis that include these discounts, please refer to Appendix 17.

The daily dose of sorafenib used in the AG base case was based on the SARAH trial¹⁹ (648.5 mg), and the mean time on treatment was calculated by applying an exponential function to the median time on treatment reported in the SARAH trial¹⁹ (exponential mean 122.95 days).
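For an exponential distribution the mean equals the median divided by ln 2, which is presumably the conversion used here. The sketch below shows the relationship in both directions; the median printed is back-calculated from the reported mean of 122.95 days and is not quoted directly from the SARAH trial.

```python
import math


def exponential_mean_from_median(median: float) -> float:
    """Mean of an exponential distribution with the given median."""
    return median / math.log(2)


def exponential_median_from_mean(mean: float) -> float:
    """Median of an exponential distribution with the given mean."""
    return mean * math.log(2)


if __name__ == "__main__":
    reported_mean_days = 122.95
    implied_median_days = exponential_median_from_mean(reported_mean_days)
    print(f"Implied median time on treatment: {implied_median_days:.1f} days")
    print(f"Round trip to mean: {exponential_mean_from_median(implied_median_days):.2f} days")
```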

The base-case daily dose of lenvatinib was 10.2 mg per day, based on the Western subgroup of the REFLECT trial⁸¹ for lenvatinib. This value was considered by the technology appraisal committee in TA551¹² to better represent the average weight-based dose used in UK practice. The AG considered the time on treatment reported in the REFLECT trial⁸¹ for lenvatinib to be excessively long compared with SARAH,¹⁹ and reflective of differences in the baseline characteristics of the populations recruited to these trials. To avoid inflating the relative cost of lenvatinib, the AG applied the reported HR of PFS between lenvatinib and sorafenib in REFLECT to the SARAH time on treatment to produce an estimate of 124.07 days on treatment.

Wastage was accounted for in the AG model using the simple assumption that, once a new pack had been started, any tablets remaining at treatment discontinuation could not be used to treat other patients. This may be a conservative assumption, as it was reported in TA555¹³ that many centres have measures in place to reduce wastage of expensive cancer treatments, such as issuing only a 1-month supply of tablets at a time (approximately one pack of sorafenib). Nevertheless, as it generally cannot be predetermined when therapy will be discontinued owing to AEs, death or non-compliance, it can be reasonably assumed that some wastage will occur.

Cost of subsequent treatment

The interventions used following first-line treatment in the SARAH trial¹⁹ were not representative of current UK practice; however, as the efficacy data used in the model are derived from these patients, the trial values are most appropriate. Therefore, the proportion of patients who received subsequent systemic therapy (98% sorafenib) following SIRT in the SARAH trial¹⁹ (28.8%) was used to estimate the size of this population in the AG model. The AG was advised that current NICE recommendations mean that lenvatinib is rarely used in practice, as this would preclude second-line use of regorafenib. Therefore, 95% of patients continuing to subsequent systemic therapies following SIRT treatment are assumed to receive sorafenib, and 5% are assumed to receive lenvatinib.

As a number of chemotherapeutic/systemic agents administered to patients following sorafenib in the SARAH trial¹⁹ have now been displaced in practice by regorafenib, or are otherwise no longer in use, the AG model assumes that the proportion of those who received systemic therapies after sorafenib in the trial (12.04%) would receive regorafenib in UK practice. A small proportion (3.47%; i.e. 12.04% of 28.8%) of SIRT patients also receive regorafenib following second-line sorafenib treatment. Duration of therapy and dose intensity of each of the three systemic agents modelled is assumed to be the same as first-line treatment, whereas regorafenib is assumed to have the same time on treatment as sorafenib (122.95 days), with a mean daily dose of 160 mg (RESORCE trial).¹⁰¹
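The subsequent-therapy proportions described above can be reproduced with simple arithmetic. The sketch below uses only the percentages quoted in the text; the variable names are illustrative.

```python
# Proportions quoted in the text (SARAH trial data and AG assumptions).
p_subsequent = 0.288     # SIRT patients receiving subsequent systemic therapy
share_sorafenib = 0.95   # assumed split between the two agents
share_lenvatinib = 0.05
p_regorafenib_given_subsequent = 0.1204  # proportion moving on to regorafenib

p_sorafenib = p_subsequent * share_sorafenib
p_lenvatinib = p_subsequent * share_lenvatinib
p_regorafenib = p_subsequent * p_regorafenib_given_subsequent

for name, p in [
    ("sorafenib", p_sorafenib),
    ("lenvatinib", p_lenvatinib),
    ("regorafenib", p_regorafenib),
]:
    print(f"{name}: {100 * p:.2f}% of SIRT patients")
```

The regorafenib line reproduces the 3.47% quoted above (12.04% of 28.8%).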

Disease management costs

There are a number of issues with the health state unit costs used in previous technology appraisals in this indication, which precluded their use in the AG base case. The primary concern with these costs is that the original resource use surveys given to clinicians were based on the ongoing costs associated with sorafenib treatment. The resource use implications of systemic therapies, with regard to monitoring and diagnostic testing, may be very different from those of SIRT as a one-off procedure; therefore, these values may overestimate the disease management costs associated with the PFS health state for SIRT patients. Furthermore, the committee-preferred resource use data used in TA551¹²⁴ were collated from two resource use surveys conducted 10 years apart, generating very different estimates that may reflect differences in practice, costs and experience. As targeted therapies such as sorafenib were not yet in use at the time of the first survey, it is unlikely that these values are sufficiently representative of current practice.

In the light of these limitations, the AG used the results of a resource use survey conducted by Sirtex, which elicited information from 11 clinicians on the frequency and type of medical staff contact, monitoring and follow-up, hospitalisation frequency and length, and any use of PSS. Resource use pre progression, post progression and on progression were reported separately. Unit costs for each resource use item were derived from National Schedule of Reference Costs 2017–2018¹⁰⁷ and Personal Social Services Research Unit (PSSRU).¹⁰⁶ Differential costs were applied for systemic therapy patients during pre-progression, reflecting higher levels of ongoing diagnostic testing and additional follow-up contact.

The per-cycle post-progression costs applied in the AG model are significantly lower than those used in TA551¹²⁴ (£229.69 vs. £1268.16). This was driven primarily by greatly reduced use of hospital- and social care-based palliative care on progression since the original resource use survey. The health state costs used in the AG model are presented in Table 29.

TABLE 29

Assessment Group model health state costs

A scenario that instead uses the committee-preferred costs from the lenvatinib appraisal is presented in Sensitivity analyses results.

Adverse event costs

Costs associated with the management of AEs were derived from previous NICE TAs of HCC,¹¹–¹³ using the latest National Schedule of Reference Costs 2017–2018¹⁰⁷ values or costs inflated to the 2018 cost year, where applicable. The AG base case used AE incidence rates from the SIR-Spheres arm of the SARAH trial¹⁹ for the three SIRTs, and from the sorafenib arm of this trial for sorafenib. AE rates for lenvatinib were taken from the REFLECT trial.⁸¹ For patients who received work-up but did not progress onto SIRT, the proportion of patients who received sorafenib incurred sorafenib AE management costs.

A full list of AE costs used in the AG model is presented in Appendix 16, Table 75.

Summary of Assessment Group base-case analysis inputs and assumptions

A summary of the resource use assumptions and costs applied in the AG base-case analysis is presented in Table 30.

TABLE 30

Summary of resource use and cost inputs in the AG model

Analytic methods

Base-case analysis

The AG produced fully incremental ICERs for each strategy included in the model; however, this approach generated a number of ICERs expressed in terms of dominance owing to the close similarity of health outcomes predicted for the SIRTs.

The AG, therefore, considered a net benefit framework to be the most appropriate approach to present the relative cost-effectiveness of the three SIRTs with existing practice. This method is often preferred when there are a number of technologies under comparison, particularly when incremental costs and benefits are very similar. Technologies with identical health outcomes and marginal differences in costs are often labelled as ‘dominant/dominated’ using incremental cost-effectiveness analysis with conventional decision rules. Considering net health benefit instead permits a more informative comparison of the effect of alternative strategies.
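To show how 'dominated' labels arise under conventional decision rules, the following is a minimal sketch of a fully incremental analysis. The function name and all cost and QALY values are hypothetical and are not taken from the AG model.

```python
def fully_incremental(strategies):
    """Map each strategy to 'reference', 'dominated', 'extendedly dominated'
    or its ICER against the next-best non-dominated option.

    strategies: dict of name -> (total cost, total QALYs).
    """
    # Order by cost; for equal costs, the more effective option comes first.
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    results = {}
    frontier = []  # (name, cost, qalys), with QALYs strictly increasing
    for name, (cost, qalys) in ordered:
        if frontier and qalys <= frontier[-1][2]:
            results[name] = "dominated"  # costs at least as much, no more QALYs
        else:
            frontier.append((name, cost, qalys))

    def icer(lower, upper):
        return (upper[1] - lower[1]) / (upper[2] - lower[2])

    # Extended dominance: drop any option whose ICER exceeds that of the
    # next, more effective option, then re-check the remaining frontier.
    i = 1
    while i < len(frontier) - 1:
        if icer(frontier[i - 1], frontier[i]) > icer(frontier[i], frontier[i + 1]):
            results[frontier[i][0]] = "extendedly dominated"
            del frontier[i]
            i = max(i - 1, 1)
        else:
            i += 1

    results[frontier[0][0]] = "reference"
    for lower, upper in zip(frontier, frontier[1:]):
        results[upper[0]] = icer(lower, upper)
    return results

# Hypothetical totals (cost in pounds, QALYs) for four strategies.
example = {"A": (10000, 0.80), "B": (10500, 0.75),
           "C": (12000, 0.90), "D": (11800, 0.82)}
for name, outcome in fully_incremental(example).items():
    print(name, outcome)
```

When several options differ only marginally in QALYs, most of them receive a dominance label rather than an ICER, which is the presentational problem that motivates the net benefit framework.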

Net monetary benefit (NMB) is calculated using a rearrangement of the ICER formula, and values the incremental health gain relative to the comparator at a willingness-to-pay (WTP) threshold. The NMB formula thereby assigns a value to the additional QALYs generated by an intervention, and considers the opportunity cost associated with generating these health benefits. The formula used to define incremental NMB is λ × ΔE – ΔC, where the difference in health effects (ΔE) is multiplied by the selected WTP threshold (λ) (i.e. £30,000 in the results presented below), and the difference in costs (ΔC) is then subtracted. Using this approach, if an intervention has an incremental NMB of > 0, then it would be considered more cost-effective than the baseline option, in this case the least costly option. NMB results (including PAS discounts) at a £20,000 and £30,000 threshold are also presented in Appendix 17.
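The NMB calculation can be sketched as follows. The incremental QALY and cost values are hypothetical and chosen only to show how the sign of the incremental NMB relates to the ICER and the WTP threshold.

```python
def incremental_nmb(delta_qalys: float, delta_cost: float, wtp: float) -> float:
    """Incremental net monetary benefit: wtp * delta_qalys - delta_cost."""
    return wtp * delta_qalys - delta_cost

# Hypothetical increments versus the baseline option (not AG model outputs).
delta_qalys = 0.10
delta_cost = 2000.0
icer = delta_cost / delta_qalys  # cost per QALY gained

for wtp in (20000, 30000):
    nmb = incremental_nmb(delta_qalys, delta_cost, wtp)
    print(f"WTP {wtp}: incremental NMB = {nmb:.0f}")
```

In this illustration the ICER equals the lower threshold, so the incremental NMB is zero at £20,000 and positive at £30,000; when ΔE is positive, an incremental NMB of > 0 is equivalent to an ICER below λ.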

The AG model accounted for uncertainty using probabilistic and deterministic sensitivity analyses. PSA was undertaken using simple Monte Carlo sampling methods, using 20,000 samples for the AG base case and 5000 samples in the primary scenario analyses. The distribution used to reflect uncertainty around each parameter was selected according to its statistical suitability. To account for uncertainty around the parametric survival models fitted to OS and PFS, outcomes were sampled via Cholesky decomposition using the variance–covariance matrices produced during survival modelling. When a HR was used to estimate PFS and OS outcomes, alternate values were drawn in each model iteration from the NMA output from WinBUGS (CODA) to model uncertainty in the predicted treatment effects.
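Sampling correlated survival model parameters via Cholesky decomposition can be sketched as below. The parameter estimates, the variance–covariance matrix and the log-scale parameterisation are all assumptions made for illustration; they are not the values or the parameterisation used in the AG model.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular L such that L multiplied by its transpose gives matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def sample_correlated(mean, lower, rng):
    """One multivariate normal draw: mean + L z, with z independent N(0, 1)."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

# Hypothetical estimates for (log shape, log scale) of a Weibull model.
mean = [0.20, 2.50]
cov = [[0.010, -0.004],
       [-0.004, 0.020]]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
lower = cholesky(cov)
draws = [sample_correlated(mean, lower, rng) for _ in range(5000)]
print(draws[0])
```

Each draw preserves the correlation between the two parameters, which independent sampling of each parameter would ignore.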

Model validation

The AG adopted a number of approaches to ensure the credibility and validity of the model. These included scrutiny of the implemented model coding and formulae by two modellers, black-box testing in which the predictive validity of parameter inputs (e.g. that increasing effectiveness of the treatment improves cost-effectiveness) was assessed, checking the accuracy of all model inputs against the original sources and consultation with clinical experts on key assumptions (see Acknowledgements).

Results of the independent economic assessment

Base-case results

The deterministic and probabilistic fully incremental results of the AG’s base-case analysis (excluding confidential PAS discounts for QuiremSpheres, sorafenib, lenvatinib and regorafenib) are presented in Table 31. The probabilistic results were based on 20,000 model iterations.

TABLE 31

Fully incremental results of the AG’s base-case analysis

The AG’s base case was based on the following assumptions and data sources:

SIR-Spheres efficacy based on a pooled survival analysis of SARAH¹⁹ and SIRveNIB²¹ data (per-protocol population)
QuiremSpheres and TheraSphere efficacy equal to SIR-Spheres
for patients who received work-up but no SIRT, OS and PFS based on SARAH¹⁹ KM
sorafenib efficacy based on a pooled survival analysis of SARAH¹⁹ and SIRveNIB²¹ data (ITT population)
lenvatinib HR derived from the AG’s NMA (ITT population)
OS and PFS extrapolated using a Weibull model
decision tree transition probabilities estimated using data from the SARAH¹⁹ trial
no downstaging to curative therapy permitted
bilobar treatments performed in two separate procedures
work-up costs from The Christie NHS Foundation Trust elicitation (as per the BTG economic analysis)
health state utilities from the SARAH¹⁹ per-protocol subgroup, based on therapeutic class (SIRT and systemic therapy).
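Two of the assumptions listed above, Weibull extrapolation and the application of a HR to a reference curve, can be sketched as follows. The shape, scale and HR values are hypothetical, and applying the HR as a power of the reference survival function assumes proportional hazards, which may differ from the AG implementation.

```python
import math

def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def apply_hazard_ratio(reference_survival: float, hr: float) -> float:
    """Proportional hazards: S_treated(t) = S_reference(t) ** HR."""
    return reference_survival ** hr

# Hypothetical reference curve (time in months) and hypothetical HR.
shape, scale = 1.3, 14.0
hr = 0.9

for month in (6, 12, 24, 60):
    s_ref = weibull_survival(month, shape, scale)
    s_trt = apply_hazard_ratio(s_ref, hr)
    print(f"month {month}: reference {s_ref:.3f}, comparator {s_trt:.3f}")
```

A HR below 1 raises the survival curve at every time point, and a HR of 1 leaves the reference curve unchanged.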

Based on the probabilistic version of the AG model, the three SIRTs are each expected to generate fewer QALYs than sorafenib or lenvatinib, but were associated with higher costs. SIRT generated 0.765 QALYs; this was 0.076 QALYs fewer than generated by sorafenib and 0.060 fewer than by lenvatinib. TheraSphere and SIR-Spheres had very similar total costs, and QuiremSpheres was the most costly owing to the additional costs associated with procurement of QuiremScout.

Figure 19 presents CEACs for the fully incremental results of the AG model. Lenvatinib has the highest likelihood of being cost-effective across any WTP threshold of < £100,000. Assuming a WTP threshold of £30,000 per QALY gained, TheraSphere had an incremental NMB of –£2154, and this was –£2323 for SIR-Spheres. The NMB for QuiremSpheres versus lenvatinib was –£8741. All three SIRTs were dominated by lenvatinib. Disaggregated deterministic results show that just under half of the QALY gain in both groups is accrued in the post-progression health state.

FIGURE 19

Cost-effectiveness acceptability curve for the AG probabilistic base-case analysis.
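A CEAC of this kind can be derived from PSA output by recording, at each WTP threshold, the proportion of iterations in which each strategy has the highest NMB. The sketch below uses simulated, hypothetical PSA samples for two unnamed strategies rather than AG model output.

```python
import random

def ceac(psa_samples, thresholds):
    """Probability that each strategy has the highest NMB at each threshold.

    psa_samples: list of dicts, name -> (cost, qalys), one dict per iteration.
    Returns {threshold: {name: probability}}.
    """
    curve = {}
    for wtp in thresholds:
        wins = {name: 0 for name in psa_samples[0]}
        for sample in psa_samples:
            best = max(sample, key=lambda n: wtp * sample[n][1] - sample[n][0])
            wins[best] += 1
        curve[wtp] = {n: w / len(psa_samples) for n, w in wins.items()}
    return curve

# Hypothetical PSA output: strategy B costs more and yields more QALYs.
rng = random.Random(0)
samples = [
    {
        "A": (rng.gauss(10000, 500), rng.gauss(0.80, 0.05)),
        "B": (rng.gauss(12000, 500), rng.gauss(0.85, 0.05)),
    }
    for _ in range(2000)
]

curve = ceac(samples, [0, 20000, 30000, 100000])
for wtp, probs in curve.items():
    print(wtp, probs)
```

In this illustration the cheaper strategy is favoured at low thresholds, and the more effective strategy gains probability as the threshold rises.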

For results including the confidential PAS discounts for sorafenib, lenvatinib, regorafenib and QuiremSpheres, see Appendix 17.

Sensitivity analyses results

Scenario analyses

Scenario 1: efficacy data from SARAH only

The first scenario analysis explores the effect of using only data from the European SARAH trial¹⁹ to inform efficacy estimates for SIRT and sorafenib, on the basis that these might better represent the patient population and clinical practice in the UK. Deterministic and probabilistic results are presented in Table 32. The probabilistic results are based on 5000 model iterations. As with the AG base case, each SIRT is associated with almost the same number of life-years and QALYs; however, this scenario predicts lower OS (and thus life-years/QALYs) than in the base case, which makes SIR-Spheres marginally cheaper than lenvatinib.

TABLE 32

Assessment Group scenario 1 results: efficacy data from SARAH only

Scenario 2: low tumour burden/albumin–bilirubin 1 subgroup (SARAH)

This scenario explores the use of the company’s preferred post hoc grouping of patients from the SARAH trial¹⁹ as the source of efficacy data for SIRT and sorafenib. Further changes from the AG base case are the use of the higher low tumour burden/ALBI 1 subgroup utilities from the SARAH trial,¹⁹ and the significantly lower proportion of patients who receive work-up but not SIRT (8.1% vs. 18.6%). Note that although Sirtex used a proportion of 2.9% for work-up failures in this population, it was unclear how this figure was reached. Increasing the number of work-up failures, however, increases the cost-effectiveness of SIRT.

This scenario predicts the cost-effectiveness of an optimised decision in which only patients who have a tumour burden of ≤ 25% and a preserved liver function would be eligible to receive SIRT. As there is no equivalent evidence available for lenvatinib, this scenario assumes that the HR between sorafenib and lenvatinib remains the same as in the base-case population.

Table 33 shows that although the systemic therapies were less costly than SIRT in this scenario, SIR-Spheres generated an additional 0.139 QALYs compared with lenvatinib and 0.117 compared with sorafenib in the probabilistic model. This resulted in fully incremental ICERs of £20,926 per QALY gained for TheraSphere compared with lenvatinib, and £119,562 for SIR-Spheres compared with TheraSphere. However, the two technologies were distinguished only by the additional fluoroscopy cost associated with the SIR-Spheres procedure, resulting in very similar NMB at a £30,000 threshold. This is notably the only scenario in which TheraSphere and SIR-Spheres have a positive incremental NMB versus lenvatinib at a WTP threshold of £30,000 (excluding scenario 4). This is illustrated by the CEAC in Figure 20, which shows lenvatinib to have the highest likelihood of being cost-effective up to a WTP threshold of approximately £27,000, at which point it is surpassed by TheraSphere and by SIR-Spheres at a WTP threshold of ≥ £32,000.

TABLE 33

Assessment Group scenario 2 results: low tumour burden/ALBI 1 subgroup

FIGURE 20

Cost-effectiveness acceptability curve for AG scenario 2: low tumour burden/ALBI 1 subgroup.

Results including the confidential PAS discounts for sorafenib, lenvatinib, regorafenib and QuiremSpheres can be found in Appendix 17.

Scenario 3: no macroscopic vascular invasion (SARAH)

This scenario limits the patient population to only those who had no MVI, referred to elsewhere as PVI, at baseline. These patients may be expected to benefit more from SIRT owing to a more favourable positioning and spread of their tumour, and were thus defined as a subgroup of interest in NICE’s scope. As there is no equivalent evidence for lenvatinib, this scenario assumes that the HR between sorafenib and lenvatinib remains the same as in the base-case population.

The probabilistic analysis in Table 34 found all three SIRTs to be dominated by lenvatinib, with a significantly lower NMB than either systemic therapy. Notably, the gap in QALYs produced by SIRT versus sorafenib widened in this analysis versus the base case, implying a reduced benefit of SIRT in this population.

TABLE 34

Assessment Group scenario 3 results: no MVI

Scenario 4: TheraSphere hazard ratio from the Biederman et al. and Van Der Gucht et al. network meta-analysis scenario

The results presented in Table 35 use the HR derived from the AG’s NMA scenario, which included the low-quality retrospective studies by Biederman et al.³⁹ and Van Der Gucht et al.⁴⁰ The patient population in Biederman et al.³⁹ was particularly mismatched with the others included in this analysis, as it included only patients with MVI, which appeared to have a substantial impact on the treatment effect associated with TheraSphere.

TABLE 35

Assessment Group scenario 4 results: TheraSphere HR from Biederman et al. and Van Der Gucht et al. NMA scenario

A HR of 0.46 versus SIR-Spheres was applied for both OS and PFS outcomes for TheraSphere. Based on the probabilistic analysis (5000 iterations), TheraSphere is expected to generate an additional 0.507 QALYs compared with lenvatinib, at an additional cost of £4068, producing an ICER of £8017 per QALY gained, and a NMB of £11,413. TheraSphere was associated with higher costs than SIR-Spheres owing to the increased disease management costs associated with lower mortality, but it also produced an additional 0.566 QALYs, yielding an ICER of £6060 per QALY gained.

Further scenario analyses

Table 36 presents a number of other scenarios on the AG base case that explore the impact of alternative assumptions, including sources of utilities, downstaging to curative therapy, resource use and survival models.

TABLE 36

Further scenario analyses (AG scenarios 5–17)

Scenarios 6 and 10 include the possibility for downstaging; in these scenarios, the distribution of the three liver-targeted treatments was derived from the SARAH trial.¹⁹ Patients who received TACE or radiation therapy were excluded as these would not be permitted options in this population in UK practice. Liver transplant was received by 1.09% of SIRT patients and 0.46% of sorafenib patients; 1.63% of SIRT patients and 0% of sorafenib patients underwent liver resection, and 3.26% of SIRT patients and 0.92% of sorafenib patients received ablation therapy.

Only the deterministic results are produced for these analyses.

Table 37 presents the results of the base-case and selected scenario analyses in terms of their effect on the NMB ranking of the five technologies at list price. This shows lenvatinib to be consistently ranked first in terms of incremental NMB, except in those scenarios that use assumptions more favourable to SIRT. As SIRT produces QALYs at a cost per QALY above the WTP threshold, increasing the proportion of patients who fail work-up and do not go on to receive SIRT (scenario 17) increases its cost-effectiveness, as overall costs are reduced and the more cost-effective QALYs produced on BSC and sorafenib are up-weighted.

TABLE 37

Incremental NMB rankings

Deterministic sensitivity analysis

Results of the DSAs are presented in Figures 21–25 for the AG base-case scenario and the four scenarios presented in Scenario analyses. The tornado diagrams present the 10 most influential parameters in each analysis. SIR-Spheres was compared with sorafenib because sorafenib was considered the most relevant comparator and had direct evidence compared with SIR-Spheres.
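The ranking behind a tornado diagram can be sketched as a one-way sensitivity analysis in which each parameter is varied between a lower and an upper value while all others are held at their base-case values. The toy model, parameter names and ranges below are hypothetical stand-ins for the full economic model.

```python
def one_way_sensitivity(model, base_params, ranges, top_n=10):
    """Return the top_n parameters ranked by the spread in NMB they produce.

    ranges: dict of parameter name -> (low value, high value).
    Each row is (name, NMB at low value, NMB at high value, spread).
    """
    rows = []
    for name, (low, high) in ranges.items():
        nmb_low = model({**base_params, name: low})
        nmb_high = model({**base_params, name: high})
        rows.append((name, nmb_low, nmb_high, abs(nmb_high - nmb_low)))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows[:top_n]

def toy_nmb(params):
    """Hypothetical stand-in model: incremental NMB from three inputs."""
    return params["wtp"] * params["delta_qalys"] - params["delta_cost"]

base = {"wtp": 30000, "delta_qalys": 0.02, "delta_cost": 900.0}
ranges = {"delta_qalys": (-0.02, 0.06), "delta_cost": (600.0, 1200.0)}

for name, low, high, spread in one_way_sensitivity(toy_nmb, base, ranges):
    print(f"{name}: NMB from {low:.0f} to {high:.0f} (spread {spread:.0f})")
```

Parameters with the widest spread are plotted at the top of the diagram; a bar that crosses zero indicates a parameter range over which the cost-effectiveness conclusion changes.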

FIGURE 21

Tornado diagram: SIR-Spheres vs. sorafenib; base-case analysis (SARAH and SIRveNIB).

FIGURE 25

Tornado diagram: TheraSphere vs. sorafenib; TheraSphere HR from Van Der Gucht et al. and Biederman et al. NMA (scenario 4).

The AG base-case analysis (see Figure 21) was robust to a range of parameters, with the most influential parameters providing a range of NMBs between approximately –£1600 and £1000, against a base-case NMB of –£315. The most influential parameters were the health state utilities, the number of SIRT procedures and the proportion of patients receiving SIRT after work-up. In these scenarios, SIR-Spheres became cost-effective compared with sorafenib for some of the range of values of the parameter (i.e. SIR-Spheres had a positive incremental NMB). However, when the confidential PAS for sorafenib was applied, this was no longer the case.

In scenario 1, with efficacy data based only on SARAH¹⁹, varying the parameters in the DSA had a larger impact on NMB than in the base-case analysis, although the variation remains small (see Figure 22). Similarly to the base-case analysis, the results were most sensitive to health state utilities and SIRT procedures; however, in this analysis, OS for sorafenib and SIR-Spheres was also an influential parameter. There were no scenarios in which SIR-Spheres was estimated to be cost-effective compared with sorafenib.

FIGURE 22

Tornado diagram: SIR-Spheres vs. sorafenib; using SARAH efficacy data (scenario 1).

The most influential parameter in the low tumour burden/ALBI 1 subgroup was OS for both SIR-Spheres and sorafenib (see Figure 23). SIR-Spheres remained cost-effective compared with sorafenib over the range of parameters; however, when the confidential PAS for sorafenib was applied, this was no longer the case.

FIGURE 23

Tornado diagram: SIR-Spheres vs. sorafenib; low tumour burden/ALBI 1 subgroup (scenario 2).

In the ‘no-MVI’ subgroup, the most influential parameters were the health state utilities and OS for sorafenib and SIR-Spheres (see Figure 24). There were no scenarios in which SIR-Spheres was estimated to be cost-effective compared with sorafenib.

FIGURE 24

Tornado diagram: SIR-Spheres vs. sorafenib; no MVI subgroup (scenario 3).

In Figure 25, TheraSphere was compared with sorafenib. In this scenario, the results of the analysis were robust to the range of parameters, and found TheraSphere to be cost-effective across all scenarios.

Discussion of the independent economic assessment

In the light of the AG’s concerns regarding the relevance of the economic analyses identified in the review of cost-effectiveness studies and highlighted limitations in the economic evaluations developed by BTG and Sirtex, the AG developed a de novo health economic model. The AG model evaluated the three SIRTs and current UK practice for the treatment of advanced HCC in Child–Pugh class A patients ineligible to receive (or who had previously failed) CTT. Results were generated as fully incremental ICERs and in terms of incremental NMB, which allows for easier comparison of ‘dominated’ results with small differences in cost and efficacy. The AG model used a three-state partitioned survival model approach with a decision tree, which determined the proportion of patients who did not continue on to receive SIRT following the work-up procedure. The model utilises all currently available RCT evidence to generate estimates of clinical effectiveness, using data directly drawn from the SARAH¹⁹ and SIRveNIB²¹ trials, and HRs generated in the AG’s NMA.
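The combination of a decision tree with a three-state partitioned survival model can be sketched as follows. State occupancy is read directly from the PFS and OS curves, and the decision tree weights the outcomes of patients who do and do not go on to receive SIRT. The survival values are hypothetical; the 18.6% work-up failure proportion is the base-case figure quoted in this chapter, and the function names are illustrative.

```python
def state_occupancy(pfs: float, overall: float) -> dict:
    """Three-state partitioned survival at one time point."""
    pfs = min(pfs, overall)  # progression-free survival cannot exceed OS
    return {
        "progression_free": pfs,
        "progressed": overall - pfs,
        "dead": 1.0 - overall,
    }

def weight_by_decision_tree(p_receive_sirt: float, treated: dict, untreated: dict) -> dict:
    """Mix state occupancy across the two decision tree branches."""
    return {
        state: p_receive_sirt * treated[state]
        + (1.0 - p_receive_sirt) * untreated[state]
        for state in treated
    }

p_receive_sirt = 1.0 - 0.186  # base-case proportion continuing to SIRT

# Hypothetical PFS and OS values at a single time point for each branch.
treated = state_occupancy(pfs=0.40, overall=0.70)
untreated = state_occupancy(pfs=0.30, overall=0.60)

cohort = weight_by_decision_tree(p_receive_sirt, treated, untreated)
print(cohort)
```

Repeating this at every model cycle, and attaching costs and utilities to each state, gives the cohort-level totals used in the cost-effectiveness calculations.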

Based on the AG’s probabilistic base-case analysis at list price, none of the three SIRTs is expected to be cost-effective at any WTP threshold, being more costly and less effective than lenvatinib. When the modelled population was limited to only those with a low tumour burden and preserved liver function, the ICERs for TheraSphere and SIR-Spheres were £17,165 and £18,783 per QALY gained versus the most cost-effective systemic therapy. The most optimistic ICERs were generated in the scenario presented for the low tumour burden and preserved liver function in which downstaging to curative therapy was permitted. In this scenario, the ICERs for TheraSphere and SIR-Spheres decreased to £1440 and £2339, respectively. However, there was no scenario in which SIRT was predicted to be cost-effective at a WTP threshold of £30,000 when confidential PAS discounts were included (see Appendix 17). In all scenarios, QuiremSpheres was not cost-effective compared with other SIRTs owing to higher work-up and acquisition costs (see below for further discussion of QuiremSpheres in relation to the limitations of the model).

The AG scenario 4 (including the Biederman et al.³⁹ and Van Der Gucht et al.⁴⁰ studies) found TheraSphere to be cost-effective versus lenvatinib when the confidential PAS prices were used. However, the AG considers the data used to model comparative effectiveness to be of low quality and inconsistent with the wider body of evidence on the comparative effectiveness of SIR-Spheres and TheraSphere. The AG, therefore, does not consider this scenario to represent a realistic estimate of the relative benefits of TheraSphere.

The results of the AG’s base-case analysis are robust to a wide range of assumptions, reflecting the completeness and quality of the included studies, and the substantial differences seen in costs and QALYs between the SIRTs and current UK practice (including confidential PAS). The AG’s analyses predicted lenvatinib to rank first in terms of NMB in all scenarios (excluding scenario 4), whereas sorafenib was a cost-effective alternative, producing more QALYs at a higher cost. There are a number of differences between the AG model and those presented by the companies, which primarily concern the issues highlighted in the critique of these models in Chapter 5, Review of economic evidence submitted by companies. Strengths of the AG model include (1) all available high-quality RCT data were used to model the outcomes of the most relevant patient population to UK practice, (2) analyses included all appropriate comparators, (3) independent modelling of the costs and outcomes of patients who receive work-up but were ineligible to receive SIRT and (4) preserved randomisation and greater internal consistency with regard to the use of subsequent and curative therapies.

Insurmountable limitations in the evidence base meant that the AG was unable to address the question of the cost-effectiveness of SIRT in patients with early or intermediate HCC. The evidence for TheraSphere and QuiremSpheres in advanced HCC was extremely limited, and a lack of head-to-head evidence prevented a meaningful comparison of SIR-Spheres, TheraSphere and QuiremSpheres with one another in terms of clinical effectiveness. This essentially limits this particular comparison to that of a cost-minimisation, with a full comparison of the cost-effectiveness of SIRT versus sorafenib and lenvatinib. Although it is therefore not possible to discern which of the SIRTs offers the best value for money, the increased cost of the QuiremSpheres work-up procedure meant that it was consistently positioned last by some way in terms of NMB. The structure of the AG model and a lack of supporting evidence on the comparative effectiveness of QuiremSpheres, however, meant that there were no means by which the concept of ‘suboptimal SIRT’, as proposed by Terumo,¹⁰⁴ could realistically be explored. This includes the ostensibly greater selectivity of QuiremScout, and any quantifiable improvement in treatment effect resulting from optimisation of patient selection.

Copyright © Queen’s Printer and Controller of HMSO 2020. This work was produced by Walton et al. under the terms of a commissioning contract issued by the Secretary of State for Health and Social Care. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Bookshelf ID: NBK562642
