Systematic review of existing cost-effectiveness evidence
This chapter reviews in detail all published studies on the cost-effectiveness of LISA-TRACKER ELISA kits, TNF-α-Blocker ELISA kits and Promonitor ELISA kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies.
Aim
To review all cost-effectiveness studies, including any existing models, and to identify any suitable data (such as resource use, costs, utilities and transition probabilities) to inform our economic model evaluating the cost-effectiveness of LISA-TRACKER ELISA kits, TNF-α-Blocker ELISA kits and Promonitor ELISA kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies.
Methods
Search strategy
A comprehensive search of the literature for published economic evaluations (including any existing models), cost studies and QoL (utility) studies was performed. The systematic search included searching the following electronic databases during December 2014 (from 12 to 17 December 2014):
- MEDLINE (via Ovid) (1946 to Week 3 November 2014)
- MEDLINE In-Process Citations and Daily Update (via Ovid) (11 December 2014)
- EMBASE (via Ovid) (1947 to 15 December 2014)
- NHS Economic Evaluation Database (The Cochrane Library)
- Science Citation Index (Web of Knowledge) (1970–present)
- Cost-effectiveness Analysis Registry
- EconPapers (Research Papers in Economics)
- School of Health and Related Research Health Utilities Database.
The search included terms for CD, anti-TNF-α drugs and the different assay kits, combined with economic and QoL terms. The search was limited to studies published in the English language. The search strategy developed was based on the clinical effectiveness review, with input from a health economist. Details of the full search strategies are provided in Appendix 3.
Inclusion criteria
Only studies meeting the following inclusion criteria were included in the review:
- study type – fully published economic evaluations (including economic models)
- population – people with CD
- intervention – anti-TNF-α drugs (ADA and IFX) and antibody drug testing (LISA-TRACKER ELISA kits, TNF-α-Blocker ELISA kits and Promonitor ELISA kits) for any dosage or treatment regimen
- comparator – standard care treatment: anti-TNF-α drugs (ADA and IFX) for any dosage or treatment regimen
- outcomes – cost-effectiveness or cost–utility studies reporting outcomes as clinical effectiveness measures or utility measures [utility, EQ-5D, Short Form questionnaire-6 Dimensions score or quality-adjusted life-years (QALYs)].
Exclusion criteria
Studies meeting the following exclusion criteria were excluded from the review:
- non-English-language publications
- studies in the health areas where these anti-TNF-α drugs have also been used, such as UC, rheumatoid arthritis, psoriasis and tuberculosis.
Assessment of eligibility and data extraction
All retrieved records (citations and abstracts) were collected in a specialist database (EndNote) and duplicate records were identified and removed. Two reviewers independently reviewed titles and abstracts to identify potentially relevant papers for inclusion. Any discrepancies were resolved by discussion. See Appendix 13 for the table of full-text studies excluded with reasons.
Data extraction was carried out in two stages by one reviewer using standardised data extraction sheets (see Appendix 14) and was then checked by a second reviewer. Stage 1 considered all eligible studies (fully published economic evaluations including any economic models) and stage 2 considered studies assessed for usefulness for populating the economic model. Data extracted during stage 1 included the following:
- study details – author names, source of publication, language and publication type
- baseline characteristics – population, intervention, comparators, outcomes and type of economic evaluation
- methods – target population and subgroups, setting and location, study perspective, time horizon, discount rate, measurement of effectiveness, measurement and valuation preference-based outcomes, resource use and costs, currency, price date and conversion, model type, assumptions and analytical methods
- results – study parameters, incremental costs and outcomes and characterising uncertainty
- discussion – study findings, limitations, generalisability and conclusions
- other – sources of funding, conflicts of interest and comments.
Quality assessment
The quality of full economic evaluation studies that were identified was assessed using the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist (see Appendix 15) by one reviewer and cross-checked by a second reviewer. The CHEERS checklist comprises six dimensions: title and abstract, introduction, methods, results, discussion and other. Under these dimensions, a series of questions check whether or not the criteria have been clearly reported. Any studies containing an economic model were further assessed using the framework for the quality assessment of decision-analytic modelling by Philips et al.149 (see Appendix 15). The Philips checklist contains two main dimensions: structure of the model and data used to parameterise the model. Under these dimensions several questions assess whether or not the criteria have been clearly reported.
Data synthesis
Information extracted from the included studies was summarised and tabulated. Findings from individual studies were compared narratively.
Results
In developing the economic model, we have consulted the previous technology appraisal guideline and HTA report by Dretzke et al.5 even though this work did not include any assay kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies. The aim of this diagnostic assessment review, as specified by NICE, was to build upon this previous work. The next section contains a summary of this previous HTA report and then the results of the cost-effectiveness review including quality assessment will be outlined.
Summary of the Health Technology Assessment report by Dretzke et al.5
The main aim of this HTA report was to assess the cost-effectiveness of anti-TNF-α drugs in the management of adult patients with moderate to severe CD in the UK NHS. The authors described induction therapy as the use of anti-TNF-α therapy with the aim of achieving remission (a repeated reinduction treatment was considered, rather than a one-off induction therapy) and maintenance therapy as the use of anti-TNF-α therapy to maintain remission in patients who have responded (and continue to respond) to anti-TNF-α therapy when in relapse. The authors defined response as remission within 8 weeks.
The authors developed a Markov model from a NHS and Personal Social Services (PSS) perspective to estimate the incremental cost per QALY gained for both ADA and IFX (anti-TNF-α therapy) compared with standard care. Mortality was not included in the model, as the authors found no difference in the mortality rates that were reported in the clinical trials reviewed and, therefore, felt that a lifetime horizon would not improve the precision of the cost-effectiveness estimate. Instead, the time horizon for the model was 1 year and the cycle duration was 4 weeks. The model for both induction and maintenance therapy started with a cohort of patients in the standard care refractory relapse health state. The model had four main health states and at any time, and on any given treatment, a patient was in remission, in relapse, undergoing surgery or in post-surgery remission.
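The cohort structure described above can be sketched as follows. This is a minimal illustration only: the four health states, the 4-week cycle, the 1-year horizon and the starting state follow the description in the text, whereas every transition probability is a made-up placeholder and not a value from Dretzke et al.5

```python
# Minimal Markov cohort sketch: four health states, 4-week cycles, 1-year horizon.
# ASSUMPTION: all transition probabilities below are illustrative placeholders.
STATES = ["remission", "relapse", "surgery", "post_surgery_remission"]

# Row = state at the start of a cycle, column = state at the end (order as STATES).
TRANSITIONS = [
    [0.90, 0.10, 0.00, 0.00],  # remission
    [0.20, 0.75, 0.05, 0.00],  # relapse
    [0.00, 0.00, 0.00, 1.00],  # surgery (one cycle, then post-surgery remission)
    [0.00, 0.08, 0.00, 0.92],  # post-surgery remission
]

N_CYCLES = 52 // 4             # 13 four-week cycles in a 1-year horizon
START = [0.0, 1.0, 0.0, 0.0]   # the whole cohort starts in relapse


def step(dist, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def run_cohort(start, matrix, n_cycles):
    """Return the cohort trace: one distribution per cycle, including cycle 0."""
    trace = [list(start)]
    for _ in range(n_cycles):
        trace.append(step(trace[-1], matrix))
    return trace


trace = run_cohort(START, TRANSITIONS, N_CYCLES)
for name, share in zip(STATES, trace[-1]):
    print(f"{name}: {share:.3f}")
```

Because mortality is excluded, no absorbing death state is needed and the cohort shares always sum to 1.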
Transition probabilities for the standard care health states were based on Silverstein et al.150 Transition probabilities for both the induction and maintenance model were assigned a treatment effect by using relapse to remission probabilities from RCT evidence; however, for the maintenance model there was a lower remission to relapse rate.
The majority of utility values for the model were based on the study by Gregor et al.,20 which used the time-trade off measure to estimate the health-related QoL in CD. A utility value for surgery was not available in the published literature; therefore, it was assumed that the average utility value for surgery would be equivalent to EQ-5D health state 22222, with a utility weight of 0.516.
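The utility weight of 0.516 for state 22222 can be reproduced as a quick check. The sketch below assumes the widely used UK time trade-off value set for the three-level EQ-5D (Dolan, 1997); the text does not name the tariff, so this is an assumption, although it does return the quoted value.

```python
# EQ-5D-3L index under the UK time trade-off value set.
# ASSUMPTION: the tariff is not named in the text; these decrements are the
# commonly cited UK coefficients and are used here for illustration.
DIMENSIONS = ["mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression"]
CONSTANT = 0.081  # applied when any dimension is above level 1
LEVEL2 = dict(zip(DIMENSIONS, [0.069, 0.104, 0.036, 0.123, 0.071]))
LEVEL3 = dict(zip(DIMENSIONS, [0.314, 0.214, 0.094, 0.386, 0.236]))
N3 = 0.269        # applied when any dimension is at level 3


def eq5d_3l_uk_utility(state):
    """Return the index value for a five-digit EQ-5D-3L state such as '22222'."""
    levels = [int(ch) for ch in state]
    if levels == [1, 1, 1, 1, 1]:
        return 1.0
    utility = 1.0 - CONSTANT
    for dim, level in zip(DIMENSIONS, levels):
        if level == 2:
            utility -= LEVEL2[dim]
        elif level == 3:
            utility -= LEVEL3[dim]
    if 3 in levels:
        utility -= N3
    return round(utility, 3)


print(eq5d_3l_uk_utility("22222"))
```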
The direct costs to the NHS were the sum of the anti-TNF-α costs and type-specific health-state costs. The costs of anti-TNF-α therapy, both induction and maintenance, were derived from the BNF (2007/8),151,152 and administration costs were also included for IFX. Type-specific health-state costs included costs for surgery, which were modelled as the cost of inpatient IBD interventions, and post-surgery remission costs, which were based on outpatient surgical gastrointestinal follow-up. Moderate and severe relapse costs were modelled as the cost of IBD outpatient major and intermediate interventions. Relapse costs were based on a gastrointestinal admission to hospital. Remission costs were modelled using literature. The majority of health-state unit costs were obtained from the NHS reference cost database (NHS Reference Costs 2005 to 2006153).
Incremental cost-effectiveness ratios (ICERs) and cost-effectiveness acceptability curves were presented. One-way sensitivity analyses and probabilistic sensitivity analyses (PSAs) using 10,000 simulations were conducted to characterise uncertainty in the model.
For induction therapy for severe CD, both ADA and IFX dominated standard care (i.e. they were cheaper and more effective). For maintenance therapy for severe CD, neither drug was cost-effective (well above NICE thresholds). For moderate CD, maintenance therapy with either drug and induction therapy with IFX were not cost-effective (well above NICE thresholds); however, ADA induction therapy dominated standard care.
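The terms used above (ICER and dominance) follow a simple decision rule, sketched here. The function name, labels and example figures are hypothetical and are not taken from the HTA report.

```python
def compare(cost_new, qaly_new, cost_old, qaly_old):
    """Classify a new strategy against a comparator.

    Returns (label, icer); icer is None unless there is a cost/effect trade-off.
    """
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    if d_cost == 0 and d_qaly == 0:
        return "equivalent", None
    if d_cost <= 0 and d_qaly >= 0:
        return "dominant", None    # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return "dominated", None   # no less costly and no more effective
    # Either more costly and more effective, or less costly and less effective.
    return "trade-off", d_cost / d_qaly


# Hypothetical figures: the new strategy is cheaper and more effective.
print(compare(900.0, 0.70, 1000.0, 0.65))
```

When a strategy is dominant or dominated, no ratio is reported because it would have no useful interpretation; the ratio is only compared with a willingness-to-pay threshold in the trade-off case.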
Sensitivity analysis showed that, in patients with severe disease, IFX induction treatment was cost-effective relative to maintenance treatment and standard care in > 99% of cases at all points up to £100,000 per QALY. Likewise, ADA induction treatment was found to be cost-effective relative to maintenance treatment and standard care for thresholds up to £100,000 per QALY.
The key limitations of this model were a short time frame (1-year time horizon); the exclusion of death from the model; no randomised controlled data available for maintenance therapy; and the use of Silverstein et al.150 data for transition probabilities, which inherently had their own problems (i.e. surgery rates were higher and relapse rates much lower than in routine practice).
Search results for objective D
The literature search identified 2466 records through electronic database searches and other sources. After removing duplicates, 1527 records were screened for inclusion. On the basis of a title and abstract sift only, 1518 records were excluded. The remaining nine records were subjected to full-text screening. A further five articles5,26,154–156 were excluded at the full-text stage, as these studies did not use assay kits to measure levels of TNF-α inhibitors and anti-drug antibodies. The literature search identified four studies73,123,124,157 of the cost-effectiveness of different assay kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies (Figure 27).
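The record counts reported above can be checked for internal consistency with a few subtractions. The variable names are ours; the figures are those given in the text.

```python
# Study-selection flow, using the counts reported in the text.
identified = 2466              # records from databases and other sources
screened = 1527                # records remaining after duplicate removal
excluded_title_abstract = 1518
excluded_full_text = 5

duplicates_removed = identified - screened
full_text_assessed = screened - excluded_title_abstract
included = full_text_assessed - excluded_full_text

print(duplicates_removed, full_text_assessed, included)  # 939 9 4
```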
Overview of included studies
The literature search identified four studies73,123,124,157 that met our inclusion criteria (studies looking at the cost-effectiveness of different assay kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies) and were reviewed. In the following sections we present an overview of the included studies by population (responders and those showing LOR) of interest.
Vande Casteele et al.73
Vande Casteele et al.73 aimed to determine whether or not concentration-based IFX dosing was more cost-effective than clinically based IFX dosing. These authors conducted a RCT and assigned people with moderate to severe CD or UC to receive concentration-based or clinically based IFX dosing. Included patients were those who were treated with maintenance IFX therapy for at least 14 weeks and who had a stable clinical response. These authors defined clinical response as being ‘symptom-free (full responder) or having clinical improvement with an obvious decrease of disease activity but with clinical symptoms still present (partial responder)’.73 Patients eligible for the study were dose optimised until IFX trough concentrations between 3 and 7 µg/ml were reached. At the assessment of each trough concentration using an in-house-developed ELISA, the dosing regimen was changed to reflect the proposed treatment algorithm, until patients had a trough concentration between 3 and 7 µg/ml. Briefly, depending on IFX trough concentration, patients received an increased dose of IFX, no dose adaptation or a decreased dose of IFX. The study was prospective and was undertaken at a tertiary referral centre in Belgium. The study was conducted from the perspective of the third-party payer and the time horizon was 1 year. The EQ-5D was used to calculate QALYs, and any differences in baseline utility scores were adjusted for by the use of a multiple regression approach. Resource use and costs were not reported in detail, apart from the drug costs per patient per year. All costs were expressed in euros in 2012 prices. The base-case results were expressed as an ICER based on the outcome of cost per QALY gained. Uncertainty in incremental QALYs and costs was determined by non-parametric bootstrapping consisting of 1000 iterations and plotted onto a cost-effectiveness plane.
The base-case results demonstrated that concentration-based dosing was slightly less effective (0.8227 vs. 0.8421) and less costly (€20,700 vs. €21,000) than clinically based dosing, but overall differences were small.
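As a worked illustration only, the rounded base-case figures quoted above imply the following increments for concentration-based versus clinically based dosing. The effectiveness figures are taken to be QALYs, which is an assumption based on the stated outcome measure, and the ratio is computed here from rounded values rather than being a figure reported in this review.

```latex
\begin{align*}
\Delta C &= 20{,}700 - 21{,}000 = -300 \ \text{euros} \\
\Delta E &= 0.8227 - 0.8421 = -0.0194 \ \text{QALYs} \\
\frac{\Delta C}{\Delta E} &= \frac{-300}{-0.0194} \approx 15{,}464 \ \text{euros per QALY}
\end{align*}
```

Because both increments are negative, the ratio describes euros saved per QALY forgone rather than cost per QALY gained, so it is not compared with a threshold in the usual way.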
Steenholdt et al.123
Steenholdt et al.123 assessed the cost-effectiveness of receiving treatment based on serum concentrations of IFX and IFX antibodies at the time of IFX treatment failure in accordance with the algorithm (for further details of the algorithm, see Chapter 3, Objective B: description of algorithms prescribing patient management following test outcomes for drug and/or anti-drug antibody levels) compared with receiving IFX at an increased dose frequency of 5 mg/kg every 4 weeks. The study included patients who experienced failure of IFX treatment while on maintenance treatment. Failure of IFX treatment was defined in the study as recurrence of active disease with a CDAI score of ≥ 220 and/or a minimum of one draining fistula. Serum IFX and IFX antibodies were analysed using RIA. Samples were stored and further analysed using ELISA and HMSA after study completion. The study was a single-blind RCT set in six Danish hospitals. Study perspective was not clearly stated. Cost-effectiveness was assessed at 12 weeks, with visits scheduled at 0, 4, 8 and 12 weeks. Clinical effectiveness was based on clinical response rates, that is, whether patients regained response or continued to lose response to IFX therapy. Resource use and costs were based on IFX doses and all inpatient and outpatient contacts in hospitals, which also included diagnostic and treatment procedures that were recorded in the National Patient Registry database. Costs were reported in Danish krone and converted to euros in 2012 prices. The base-case results were expressed as cost per ITT and PP population. Costs were compared using arithmetic means and were assessed by non-parametric bootstrapping. One-way sensitivity analyses of key primary and secondary end points were conducted. The base-case results showed that costs were significantly lower in the algorithm group than in the IFX intensification group in both the ITT and PP population.
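The non-parametric bootstrap used to compare mean costs between arms can be sketched as below. The per-patient costs are synthetic, and the number of iterations, the percentile interval and the function name are assumptions made for illustration; they are not details reported by Steenholdt et al.123

```python
import random
import statistics


def bootstrap_mean_difference(costs_a, costs_b, n_iterations=1000, seed=0):
    """Bootstrap the difference in mean cost (arm A minus arm B).

    Returns the mean of the bootstrap differences and a percentile 95% interval.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_iterations):
        # Resample patients with replacement within each arm.
        sample_a = rng.choices(costs_a, k=len(costs_a))
        sample_b = rng.choices(costs_b, k=len(costs_b))
        diffs.append(statistics.mean(sample_a) - statistics.mean(sample_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_iterations)]
    upper = diffs[int(0.975 * n_iterations) - 1]
    return statistics.mean(diffs), (lower, upper)


# ASSUMPTION: synthetic, right-skewed per-patient costs for two arms.
data_rng = random.Random(42)
algorithm_costs = [data_rng.gammavariate(2.0, 3000.0) for _ in range(30)]
intensification_costs = [data_rng.gammavariate(2.0, 4500.0) for _ in range(30)]

mean_diff, interval = bootstrap_mean_difference(algorithm_costs, intensification_costs)
print(round(mean_diff), [round(x) for x in interval])
```

Resampling avoids assuming that costs are normally distributed, which matters because cost data are typically right-skewed.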
Steenholdt et al.124
In follow-up to their study published in 2014,123 Steenholdt et al.124 extended the time horizon to 1 year to assess the long-term costs and clinical outcomes of treatment of CD in patients with LOR to IFX maintenance therapy using a proposed algorithm compared with intensified IFX treatment. Serum IFX and IFX antibodies were analysed using RIA, and were further analysed using ELISA and HMSA after study completion. IFX levels were classified as therapeutic or subtherapeutic (≥ 0.5 µg/ml and < 0.5 µg/ml, respectively); IFX antibodies were classified as detectable or undetectable. Costs were assessed at the 20-week scheduled trial visit and again at 1 year. Clinical outcomes were assessed after 20 weeks. Costs were reported in Danish krone and converted to US dollars in 2012 prices. The base-case results were expressed as cost per ITT population, cost PP population, cost PP population completion at end of trial week 12 and cost PP population completion at end of follow-up week 20. Sensitivity analyses on inclusion of estimated costs for administering biologic agents, use of actual IFX dosing and a reduction in the price of biologic agents of 3.5% and 7% were conducted to determine the robustness of the base-case results. At the 20-week follow-up, the costs were significantly lower in the algorithm group than in the IFX intensification group, and this differential was maintained throughout the 1-year study period. The base-case results, in terms of ITT for patients randomised to the algorithm group, showed costs of approximately US$11,900 for one patient at the 20-week follow-up, compared with US$22,100 at the 1-year follow-up. Among patients randomised to the IFX intensification group, the corresponding costs were US$17,200 and US$29,100, respectively. 
In terms of PP, among those randomised to the algorithm and the IFX groups, costs at the 20-week follow-up were approximately US$8700 and US$17,200, respectively, whereas at the 1-year follow-up the costs were approximately US$15,700 and US$29,100, respectively. The results from the sensitivity analyses were similar to the base-case results.
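The approximate per-patient costs quoted above imply the following differences between arms. The figures are those given in the text (rounded, in US dollars); the dictionary layout and function names are ours.

```python
# Approximate per-patient costs (US$) as quoted in the text.
COSTS_USD = {
    ("ITT", "20 weeks"): {"algorithm": 11900, "intensification": 17200},
    ("ITT", "1 year"): {"algorithm": 22100, "intensification": 29100},
    ("PP", "20 weeks"): {"algorithm": 8700, "intensification": 17200},
    ("PP", "1 year"): {"algorithm": 15700, "intensification": 29100},
}


def saving(arms):
    """Cost difference in favour of the algorithm arm."""
    return arms["intensification"] - arms["algorithm"]


for (population, timepoint), arms in COSTS_USD.items():
    share = 100.0 * saving(arms) / arms["intensification"]
    print(f"{population}, {timepoint}: US${saving(arms):,} lower ({share:.0f}%)")
```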
Velayos et al.157
Velayos et al.157 used a decision-analytical model to compare the cost-effectiveness of a testing-based strategy with that of an empiric dose escalation strategy for patients with moderate to severe CD who become unresponsive to therapy with IFX. These authors used the algorithm proposed by Afif et al.56 to form the basis of the testing-based strategy, whereas the empiric dose escalation strategy was informed by the consensus statement from the World Congress of Gastroenterology.157 The study was conducted from the perspective of the third-party payer and used a time horizon of 1 year, with a 4-week cycle length. Outcomes were reported as QALYs. QALYs gained were derived based on utility values obtained from the study undertaken by Gregor et al.20 Briefly, utility scores for 180 individuals with CD were obtained using various elicitation methods (standard gamble, time trade-off or visual analogue scale). Gregor et al.20 suggested that the standard gamble technique reflected the true value for health states related to patients with CD. Resource use and costs included the cost of interventions – IFX, ADA, certolizumab pegol, natalizumab and surgery – and the cost of diagnostics – anti-IFX antibody/serum IFX measurement, computerised tomography enterography and colonoscopy. Costs were expressed in US dollars, but the price year was not reported. The base-case results were expressed as an ICER based on the outcome of cost per QALY gained. Extensive one-way sensitivity analyses were conducted, and the model was also run probabilistically to represent the uncertainty in key model input parameters. The base-case results demonstrated that the testing strategy was cheaper and marginally more effective, thus dominating the empiric strategy. Results from the sensitivity analyses showed that the empiric strategy was less expensive when the cost of surgery was fivefold more than in the base case.
In addition, reducing the utility value for the health state of the ‘mild/minimal inflammation with symptoms’ from 0.80 to 0.70 resulted in marginally greater QALYs in the empiric group than in the testing-based group. Furthermore, increasing the cost for testing 25-fold resulted in the testing-based strategy being more expensive than the empiric strategy. Results from the PSA showed that the testing-based strategy has approximately 69% probability of being cost-effective compared with empiric dose escalation at a willingness-to-pay of US$50,000 per QALY.
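The probability of being cost-effective at a given willingness-to-pay is usually read off the PSA draws through net monetary benefit. A minimal sketch follows; the simulated increments are synthetic and are not the draws from Velayos et al.157, and the function name is ours.

```python
import random


def probability_cost_effective(delta_costs, delta_qalys, wtp):
    """Share of PSA draws with positive incremental net monetary benefit."""
    wins = sum(
        1 for d_cost, d_qaly in zip(delta_costs, delta_qalys)
        if wtp * d_qaly - d_cost > 0
    )
    return wins / len(delta_costs)


# ASSUMPTION: synthetic incremental costs (US$) and QALYs, testing minus empiric.
rng = random.Random(1)
delta_costs = [rng.gauss(-500.0, 3000.0) for _ in range(10000)]
delta_qalys = [rng.gauss(0.005, 0.03) for _ in range(10000)]

print(probability_cost_effective(delta_costs, delta_qalys, wtp=50000.0))
```

Repeating the calculation over a range of willingness-to-pay values gives the points of a cost-effectiveness acceptability curve.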
Comparison of the included studies
All four studies included in this review have been summarised in Table 27. Three studies were based on RCTs73,123,124 and only one study157 presented an economic model. Of the RCTs, two123,124 were conducted in Denmark and one73 was conducted in Belgium. All four studies73,123,124,157 conducted cost-effectiveness analyses: Vande Casteele et al.73 compared concentration-based with clinician-based dosing; Steenholdt et al.123,124 compared management of IFX treatment failure using a treatment algorithm with IFX dose intensification; and Velayos et al.157 compared a testing-based strategy with an empiric dose escalation strategy. All studies73,123,124,157 clearly stated the type of assay used to analyse serum levels and antibodies to anti-TNF-αs. Two studies123,124 used RIA in the base case, one study73 used an assay developed in house and the remaining study157 used a PROMETHEUS ELISA.
The patient populations for three studies73,123,124 included eligible patients with moderate to severe CD; the study by Vande Casteele et al.73 also included patients with UC. The study perspective was not reported in two studies,123,124 whereas the other two studies73,157 conducted the analysis from a third-party payer perspective. The time horizon varied from 12 weeks to 1 year. Steenholdt et al.123 based their analysis on a 12-week horizon, whereas the other three studies73,124,157 used a 1-year time horizon to estimate the cost-effectiveness of the different strategies.
In two studies,73,157 outcomes were reported as cost per QALY gained. Vande Casteele et al.73 used the EQ-5D measure to estimate QALYs, whereas Velayos et al.157 did not explicitly report how the QALYs were estimated, except to say that they were obtained from a secondary source.20 The two studies by Steenholdt et al.123,124 reported outcomes in terms of cost per ITT and cost PP population.
Three studies123,124,157 provided quite a comprehensive breakdown of resource use and costs, whereas the study by Vande Casteele et al.73 did not elaborate on resource use, apart from the drug costs. Three studies73,123,124 reported costs in 2012 prices, whereas Velayos et al.157 did not report the price year explicitly; however, we assumed that costs are most likely to be in 2012 prices, as the study was published in 2013.
No studies conducted discounting for either costs or benefits as the time horizon for these studies was ≤ 1 year.
The results and conclusions reported differed between studies. Vande Casteele et al.73 demonstrated that concentration-based dosing was slightly less effective and less costly than clinically based dosing, but overall differences were small. Steenholdt et al.123 showed that the intervention based on the algorithm achieved similar clinical and life quality outcomes to dose intensification, but at a lower cost at 12 weeks. These results were maintained at both 20 weeks and 1 year.124 Velayos et al.157 showed that the testing strategy was cheaper and more effective than the empiric strategy.
All four studies73,123,124,157 conducted sensitivity analyses to deal with uncertainty around key parameters. The sensitivity analyses ranged from the most simplistic one-way sensitivity analyses123,124 to the more sophisticated probabilistic analyses.157
Quality assessment
We present, in Appendix 15, a summary of the reporting quality of the studies included in the current review against the CHEERS checklist.158 Using a 25-point CHEERS checklist, one article73 did not identify the study as an economic evaluation in the title. All studies provided background information to the study and clearly outlined the objectives of the study. Two studies73,157 reported the viewpoint of the economic analysis. All studies described the comparators fully and reported the time horizon. However, because of the short time horizon, no studies conducted discounting of costs and benefits. In addition, the choice of health outcomes was well reported by all four studies;73,123,124,157 however, only one study73 reported how these health states were valued. Resource use and costs were well reported in three studies123,124,157 apart from that by Vande Casteele et al.,73 who described only the drug costs. The majority of the studies73,123,124 conducted an economic analysis alongside a RCT, whereas one study157 developed an economic model. In terms of analytical methods, study parameters, incremental costs and outcomes and uncertainty were well reported by all four studies. Limitations were provided by all four studies and generalisability was only partially reported by three studies.123,124,157
From the studies identified, one157 conducted a model-based economic analysis to determine whether or not a testing-based strategy was more cost-effective than an empiric dose escalation strategy. We present, in Appendix 15, a summary of the reporting quality of this study against Philips’s checklist.149 In general, Velayos et al.157 conformed to best practice for reporting model-based economic evaluations in terms of clearly stating the decision problem, adequately outlining the objectives, clearly stating the viewpoint of the analysis and describing the model structure, which represented the clinical pathway that patients with CD may follow. Time horizon and cycle length were stated and justified. In terms of the data required to populate the model, Velayos et al.157 adequately provided references, but they were unclear on the choices made between data sources and the quality of information used in the model. In addition, it was unclear whether or not any expert opinion had been used when choosing baseline information for the model. The other limitations identified were the lack of explanation of pre-model analysis (e.g. calculation of transition probabilities, and methods and assumptions used to extrapolate short-term results into final outcomes) and the omission of half-cycle correction.
Discussion and conclusion
The evidence available on the cost-effectiveness of LISA-TRACKER ELISA kits, TNF-α-Blocker ELISA kits and Promonitor ELISA kits for measuring levels of TNF-α inhibitors and of anti-drug antibodies appears to be limited. We identified four cost-effectiveness analyses,73,123,124,157 which comprised three economic analyses conducted alongside clinical trials and one model-based economic analysis.
The majority of the populations included in these studies had moderate to severe CD and were considered responders to IFX maintenance treatment. Studies (n = 2) mainly used RIA kits to analyse serum levels and antibodies to anti-TNF-αs. We appraised these analyses against frameworks for best practice for reporting economic evaluation and economic modelling. In general, all studies provided background information on the decision problem, clearly outlined the objectives of the study, adequately described and justified the choice of comparators and reported the time horizon. In addition, Velayos et al.157 clearly stated the viewpoint of their model-based economic analysis and outlined the model structure. These studies all provide useful information in this developing area, but are subject to limitations. First, the definition for responder was not clear and it varied between studies. In addition, the definition of patients with moderate to severe CD varied across studies. Second, owing to the small sample sizes, the study samples may not be representative of the wider patient population. Third, the short time horizon may not capture the longer-term costs and benefits of the use of testing to monitor serum anti-TNF-α levels and antibodies to anti-TNF-αs. Fourth, the method used to choose between data sources and the quality of information used in the model was unclear. Of the two studies73,157 that reported their outcomes in terms of cost per QALY, only one73 reported the generic preference-based measure used to estimate QALYs. This highlights a lack of transparency of the information used in the model. Other concerns relate to the lack of justification for the 4-week cycle length and the lack of transparency on how transition probabilities were obtained and derived in the modelling study by Velayos et al.157 and, in the case of the study conducted by Vande Casteele et al.,73 the lack of detail on the resource use and costs.
In summary, all of these studies indicated that a testing strategy might be less costly than alternatives with variable small effects on effectiveness, some indicating small reduced benefits and some small increased benefits. Use of standard checklists suggested that all the studies are subject to some limitations.
In Developing the model structure, we outline the development of economic models to determine the cost-effectiveness of various assays to inform on the treatment algorithm for patients who are considered responders and patients with LOR.
Considerations of using the former Health Technology Assessment model by Dretzke et al.5 to inform the current model structure
The previous HTA model5 used natural history data, which are now outdated. The standard care arm of the current model is restricted to starting with IFX (through lack of data for ADA) but otherwise adopts the general approach of the HTA model, with updated natural history data (for surgery, maintenance of response, dose escalation and other minor parameters) and more recent clinical expert advice. The HTA model structure is not easily transferable to the current intervention arm, which requires considerable added complexity because it is based on drug and anti-drug antibody testing; however, this arm conforms to the HTA approach and is designed for comparison with standard care on IFX.
Health economic methods
Objective
To assess the cost-effectiveness of employing anti-TNF-α and anti-TNF-α antibody monitoring with LISA-TRACKER ELISA kits, TNF-α-Blocker ELISA kits and Promonitor ELISA kits in patients with CD compared with standard care.
Standard care for patients during maintenance of disease (responders) is shown in Figure 28.
Standard care of patients with CD may vary across hospitals in the UK. Based on expert clinical input, we assumed that patients categorised as responders will continue to receive IFX maintenance therapy every 8 weeks until they lose response. Patients who lose response will receive an increased dose of IFX. Patients will either respond to this increased dose or continue to exhibit LOR, in which case they will receive another agent in addition to their current treatment. Patients who receive another agent may regain response or continue to exhibit LOR, in which case their anti-TNF-α treatment will be changed. Patients who do not respond to a new anti-TNF-α treatment will be considered for surgery. We have assumed that patients who respond to treatment will remain on that treatment until they lose response. We assume that patients who are in the post-surgery health state might receive various treatments (anti-TNF-α, a combination of anti-TNF-α and immunosuppressant or no treatment). Patients who experience LOR post surgery are expected to follow the standard care treatment pathway as for responders entering the model who subsequently lose response, that is, they will receive an increased dose of IFX and follow the same treatment regimen until they require repeat surgery.
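The standard care pathway assumed above amounts to a stepwise escalation ladder, which can be sketched as a small state function. The state names are ours and the sketch ignores timing, costs and the choice of post-surgery treatment; it only encodes the order of steps described in the text.

```python
# Escalation ladder for standard care, in the order described in the text.
ESCALATION = [
    "ifx_maintenance",            # IFX every 8 weeks while responding
    "ifx_increased_dose",         # first step after loss of response (LOR)
    "ifx_plus_additional_agent",  # another agent added to current treatment
    "switched_anti_tnf",          # anti-TNF-alpha treatment changed
    "surgery",                    # considered if the new anti-TNF-alpha fails
]


def next_state(current, responds):
    """Return the next treatment state given response on the current one."""
    if current == "surgery":
        return "post_surgery"
    if current == "post_surgery":
        # LOR after surgery re-enters the ladder at the increased IFX dose.
        return "post_surgery" if responds else "ifx_increased_dose"
    if responds:
        return current  # patients stay on a treatment while they respond
    return ESCALATION[ESCALATION.index(current) + 1]


state = "ifx_maintenance"
path = [state]
while state != "surgery":
    state = next_state(state, responds=False)
    path.append(state)
print(" -> ".join(path))
```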
Developing the model structure
We developed a Markov model using the TreeAge Pro 2013 software program (TreeAge Software, Williamstown, MA, USA). The model was developed with clinical input, and represents the clinical pathway patients would undergo while being treated for moderate to severe CD. The illustrative model structures for responders and for those who lose response are shown in Figures 29 and 30, respectively. More detailed decision trees on the patient pathways can be found in Appendix 16. In the models, we compared concurrent and reflex testing conducted every 3 months with standard care for responders and those who experience LOR:
- standard care
- concurrent testing – testing for TNF-α inhibitor levels and antibodies to TNF-α inhibitors
- reflex testing – testing for TNF-α inhibitor levels followed by testing of antibodies to TNF-α inhibitors depending on the drug test level.
The NICE guidance on model-based economic analyses suggests adopting a time horizon long enough to capture the costs and effects of an intervention; normally a lifetime horizon because chronic conditions may reduce life expectancy.5 To our knowledge, no clinical trials have provided evidence of significant difference between testing and standard regimens in CD mortality.5 Hence, we assumed a 10-year time horizon with 4-week cycle lengths to be appropriate to capture all benefits of testing and treatment.
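To make the horizon and cycle length concrete, the following minimal sketch counts the cycles implied by a 10-year horizon with 4-week cycles and runs a simple cohort trace. The four-state structure and every transition probability here are placeholders invented for illustration; they are not the health states of Table 28 or the probabilities used in the model.

```python
CYCLE_WEEKS = 4
HORIZON_YEARS = 10
# Treating a year as 52 weeks gives 13 cycles per year, 130 in total.
N_CYCLES = HORIZON_YEARS * 52 // CYCLE_WEEKS

STATES = ["responder", "lor", "post_surgery", "dead"]

# Hypothetical per-cycle transition probabilities (each row sums to 1).
P = {
    "responder":    {"responder": 0.970, "lor": 0.029, "post_surgery": 0.000, "dead": 0.001},
    "lor":          {"responder": 0.100, "lor": 0.869, "post_surgery": 0.030, "dead": 0.001},
    "post_surgery": {"responder": 0.020, "lor": 0.030, "post_surgery": 0.949, "dead": 0.001},
    "dead":         {"responder": 0.000, "lor": 0.000, "post_surgery": 0.000, "dead": 1.000},
}


def run_trace(start, n_cycles):
    """Return the cohort distribution across states at cycle 0..n_cycles."""
    trace = [dict(start)]
    for _ in range(n_cycles):
        prev = trace[-1]
        # Share in state s next cycle = sum over origin states r of share(r) * P[r][s].
        trace.append({s: sum(prev[r] * P[r][s] for r in STATES) for s in STATES})
    return trace


start = {"responder": 1.0, "lor": 0.0, "post_surgery": 0.0, "dead": 0.0}
trace = run_trace(start, N_CYCLES)
```

Each entry of `trace` is the share of the cohort in each state at the start of a cycle; costs and utilities would be attached to these shares cycle by cycle.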
Table 28 shows the health states required for the responder and LOR models.
In the following sections we discuss the testing strategies (concurrent and reflex testing) to be compared in both models (responders and patients with LOR).
Concurrent testing
In the concurrent testing strategy, patients undergo tests for serum anti-TNF-α levels and antibodies to anti-TNF-α simultaneously, and once the test results are available follow the proposed algorithm. Patients are classified, on the basis of their test results, into one of four groups: drug absent and antibodies present, drug absent and antibodies absent, drug present and antibodies present, or drug present and antibodies absent. Alternatively, patients may be categorised according to levels of drug regardless of antibody levels (e.g. as in the TAXIT trial73). Details of test results and proposed algorithms from Steenholdt et al.123 for patients with LOR and from Vande Casteele et al.73 for responders are presented in Chapter 3, Objective B: description of algorithms prescribing patient management following test outcomes for drug and/or anti-drug antibody levels.
Responder
Based on the results from concurrent testing in the responder group, various treatment options may be adopted depending on the treatment algorithm used. In the model the treatment options are based on those used in the TAXIT study,73 the only clinical study of an implemented and defined algorithm for responders:
- if drug is absent and antibodies are present in a concentration > 8 mg/ml, patients receive a switch in TNF-α inhibitor
- if drug is absent and antibodies are present in a concentration < 8 mg/ml, patients receive an increased dosage of current treatment (i.e. IFX dose to 10 mg/kg every 8 weeks)
- if the drug is present (there is no need to measure antibodies), and depending on the trough levels, patients would have either a decrease in the dosing interval (if trough level is below the target range), no dose adaptation (if trough level is within the target range) or an increase in the dosing interval (if trough level is above the target range).
Following adoption of these algorithm treatments, patients may remain responders, lose response (move to the LOR health state) or die.
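The responder algorithm above can be sketched as a small decision function. This is an illustration only: the function and label names are hypothetical, the antibody cut-off of 8 uses the units given in the text, the target range of 3 to 7 µg/ml is the one attributed to Vande Casteele et al. later in this chapter, and the handling of values exactly on a boundary (and of absent antibodies, treated as a concentration of zero) is an assumption.

```python
def responder_action(drug_present, antibody_conc=0.0, trough=None,
                     ab_cutoff=8.0, target_range=(3.0, 7.0)):
    """Return the management step for a responder after concurrent testing."""
    if not drug_present:
        # Drug absent: antibodies above the cut-off lead to a switch,
        # otherwise the dose of the current treatment is increased.
        if antibody_conc > ab_cutoff:
            return "switch TNF-alpha inhibitor"
        return "increase dose"
    if trough is None:
        raise ValueError("trough level is required when drug is present")
    low, high = target_range
    if trough < low:
        return "decrease dosing interval"
    if trough > high:
        return "increase dosing interval"
    return "no dose adaptation"
```

For example, `responder_action(True, trough=5.0)` falls within the target range and returns `"no dose adaptation"`.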
Loss of response
After LOR to anti-TNF-α, testing and algorithm treatments are based on those used by Steenholdt et al.123 in patients who lost response to anti-TNF-α (IFX); this is the only clinical study of implementation of an algorithm for patients with lost response:
- Drug absent and antibodies present – patients would receive a switch in TNF-α inhibitor.
- Drug absent and no antibodies – patients would receive an increased dosage of current treatment.
- Drug present and antibodies present – we have assumed that patients will either have symptoms not requiring surgery and discontinue anti-TNF-α treatment or have active symptoms that require surgery. Patients in the former group would discontinue maintenance treatment and move to the LOR health state (discontinuation of anti-TNF-α) and receive best supportive care. Patients who develop active symptoms that require surgery move to the post-surgery health state or could die.
- Drug present and no antibodies – the pathway is identical to that for patients with drug present and antibodies present.
As a result of the treatment algorithm, patients may remain with LOR, regain response or could die.
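The four LOR test-result groups above map to management steps as in the sketch below. The function name, the labels and the `needs_surgery` flag are hypothetical; the flag stands in for the clinical judgement of whether active symptoms require surgery.

```python
def lor_action(drug_present, antibodies_present, needs_surgery=False):
    """Return the management step for a patient with LOR after testing."""
    if not drug_present:
        # Drug absent: antibodies decide between switching and dose increase.
        if antibodies_present:
            return "switch TNF-alpha inhibitor"
        return "increase dose"
    # Drug present: the pathway is the same with or without antibodies.
    if needs_surgery:
        return "surgery"
    return "discontinue anti-TNF-alpha, best supportive care"
```

Note that `antibodies_present` only changes the outcome when the drug is absent, which is why a reflex strategy can skip the antibody test when the drug is detected.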
Loss of response health state (discontinuation of anti-tumour necrosis factor alpha)
Patients who occupy this health state are those who have discontinued anti-TNF-α maintenance treatment and are receiving best supportive care. As in the LOR health state (see Concurrent testing), we have assumed that patients who remain in this health state have symptoms of CD that do not require surgery. Patients who develop active symptoms that require surgery move to the post-surgery health state or could die.
Regain response health state
Patients who move to the ‘regain response’ health state are tested for drugs and antibodies concurrently. Here we have assumed that they would follow the same treatment algorithm as a patient who was classed as a responder (see the TAXIT study73 algorithm in Chapter 3, Objective B: description of algorithms prescribing patient management following test outcomes for drug and/or anti-drug antibody levels). As a result of the treatment algorithm, patients can remain in the regain response health state, lose response (move to the LOR health state) or could die.
Post-surgery (remission) health state
For patients who move to the post-surgery health state, treatment options are an anti-TNF-α, an immunosuppressant, a combination of an anti-TNF-α and an immunosuppressant or no treatment. Patients who are receiving an anti-TNF-α or a combination of anti-TNF-α and an immunosuppressant can regain response or lose response. For patients who regain response or who lose response, we have assumed that the pathway is similar to that of patients in the regain response health state (see Concurrent testing, Regain response health state) or the LOR health state (see Concurrent testing, Loss of response), respectively. Patients who are receiving immunosuppressants or no treatment could remain in the post-surgery health state until further surgery is required or die.
Reflex testing
In the reflex testing strategy, patients would receive a test to analyse serum anti-TNF-α levels. As a result of testing, two test outcomes are likely: drug absent or drug present. Based on the drug result, patients would undergo further testing for the presence or absence of antibodies. In this section we outline the health states and the pathways for patients undergoing reflex testing for both responder and LOR models. No study was identified that tested an algorithm for reflex testing. The algorithm followed in the model was therefore based on that of the TAXIT73 trial for responders and the Steenholdt et al.123 algorithm for patients with LOR using concurrent testing. Further details of test results and proposed algorithms are presented in Chapter 3, Objective B: description of algorithms prescribing patient management following test outcomes for drug and/or anti-drug antibody levels.
Responder
Based on the results from reflex testing in the responder group, various treatment options are available:
- If drug is absent, test for antibodies – patients with antibodies present would receive a switch in TNF-α inhibitor. Patients with no antibodies would receive an increased dosage of current treatment (i.e. IFX dose to 10 mg/kg every 8 weeks).
- If drug is absent and there are no antibodies – patients would receive an increased dosage of current treatment (i.e. IFX dose to 5 mg/kg every 4 weeks).
- If the drug is present and depending on the trough levels – patients would have a decrease in the dosing interval (if trough level below the target range), no dose adaptation (if trough level is within the target range) or an increase in the dosing intervals (if trough level is above the target range).
As a result of the treatment algorithm, patients could remain responders, lose response (move to the LOR health state) or could die.
Loss of response
- Drug absent and antibodies present: patients would receive a switch in TNF-α inhibitor.
- Drug absent and no antibodies: patients would receive an increased dosage of current treatment.
- Drug present and antibodies present: we have assumed that patients will either have symptoms not requiring surgery and discontinue anti-TNF-α treatment or have active symptoms that require surgery. Patients in the former group would discontinue maintenance treatment and move to the LOR health state (discontinuation of anti-TNF-α) and receive best supportive care. Patients who develop active symptoms that require surgery move to the post-surgery health state or could die.
As a result of the treatment algorithm, patients could remain in the LOR state, regain response or die.
Loss of response health state (discontinuation of anti-tumour necrosis factor alpha)
Patients who occupy this health state are those who have discontinued anti-TNF-α maintenance treatment and who are receiving best supportive care. As in the LOR health state (see Reflex testing, Loss of response), we have assumed that patients who remain in this health state have symptoms of CD that do not require surgery. Patients who develop active symptoms that require surgery move to the post-surgery health state or could die.
Regain response health state
Those patients who move to the regain response health state would receive reflex testing for drug levels and, if required, testing for antibodies to anti-TNF-α. We have assumed that they would follow the same treatment algorithm as patients categorised as responders (see TAXIT study73 algorithm in Chapter 3, Objective B: description of algorithms prescribing patient management following test outcomes for drug and/or anti-drug antibody levels). As a result of the treatment algorithm, patients can remain in the regain response health state, lose response (move to the LOR health state) or could die.
Post-surgery (remission) health state
For patients who move to the post-surgery health state, the treatment options are to receive an anti-TNF-α, immunosuppressant, a combination of anti-TNF-α and an immunosuppressant or no treatment. Patients who are receiving an anti-TNF-α or a combination of anti-TNF-α and an immunosuppressant can regain or lose response and follow the same pathways as outlined in Reflex testing, Regain response health state and Reflex testing, Loss of response. For patients who are receiving immunosuppressants or no treatment, the modelled options are to remain in the post-surgery health state until further surgery is required or to die.
Model assumptions
A number of assumptions were required to develop a workable model structure to enable the analyses to be undertaken. These assumptions are:
- In our base case, the model starts with a hypothetical cohort of 30-year-olds with moderate to severe CD.
- Patients were assumed to have received intravenous infusions of 5 mg/kg IFX at weeks 0, 2 and 6. Here we assumed that patients weighed > 70 kg.
- Patients who regained response have the same utility as those who are considered to be responders.
- We have assumed that patients with CD are not at increased risk of dying from the disease, and that there is no difference in mortality between testing and standard care. However, in the case of patients who have undergone surgery, the model assumes an increased risk of 0.0015 of dying as a result of the procedure.
- Treatment effects for patients receiving dose escalation (from 5 mg/kg to 10 mg/kg IFX) and a decreased interval (from 8 weeks to 6 weeks) are the same.
- Patients who are categorised as responders and who have trough concentration within the range that the treatment algorithm suggests receive no dose adaptation.
- In the base case, we have assumed that transition probabilities under the testing strategies are the same as under standard care, and used those derived from Juillerat et al.159
- Patients who remain in the LOR health state (discontinuation of anti-TNF-α) have symptoms of CD that in time may require surgery. Patients will receive best supportive care until the development of active symptoms necessitating surgery.
Data required for the model
The model was populated with clinical information from the current clinical effectiveness review and supplemented with information from secondary sources. Information required to parameterise the model included proportions, transition probabilities, resource use and costs, and utilities.
Proportions
The proportions of patients required to populate various model decision tree branches were obtained from secondary sources [e.g. management studies described in Chapter 3, Objective C1: clinical studies evaluating drug monitoring for the management of Crohn’s disease patients (management studies)] and, when such data were lacking, from clinical input. Proportions that were estimated included partitioning of patients by presence or absence of IFX and of antibodies to IFX in responders and in those with LOR; partitioning of responders according to defined IFX trough levels; and partitioning by treatment options following surgery.
Table 29 summarises the partitioning of IFX responders based on the study of Imaeda et al.,99 discussed in Chapter 3, Analysis of correlation studies of tumour necrosis factor alpha/anti-drug antibodies level and response, that used concurrent monitoring for the absence or presence of IFX and antibodies to IFX.
The proportions of IFX responders with various trough levels of IFX were based on Vande Casteele et al.73 (discussed in Chapter 3, Vande Casteele et al.: Trough level Adapted infliXImab Treatment study73). These authors screened a cohort of patients with IBD who were receiving maintenance IFX treatment, and further categorised patients by drug concentration based on test result. Drug levels < 3 µg/ml were considered below the target range, levels between 3 and 7 µg/ml were considered within range and those > 7 µg/ml were above target range. Table 30 shows the proportions of responders with different trough drug levels and the proportions of responders derived from this study.
The partitioning of IFX patients with LOR according to concurrent test monitoring of IFX and antibodies to IFX was based on information obtained from Steenholdt et al.123 Table 31 summarises these proportions.
Patients who have undergone surgery may receive post-operative treatment to maintain remission. These options include an anti-TNF-α, an immunosuppressant, a combination of an anti-TNF-α and an immunosuppressant or no treatment. Table 32 shows these proportions based on the study of van der Have et al.160
Table 33 summarises the proportions of IFX responders based on the study by Imaeda et al.,99 in which reflex testing was used to test for the absence or presence of IFX. In patients in whom IFX was present, we used the proportions according to IFX trough levels based on the Vande Casteele et al.73 study, as shown in Table 30.
The partitioning of IFX patients with LOR according to reflex test monitoring of IFX was based on information obtained from Steenholdt et al.123 Table 34 summarises these proportions.
Time-to-event transition probabilities
Table 35 summarises the transition probabilities for time-to-event outcomes used in the models.
Transition probabilities from time-to-event studies
The transition probabilities provided in Table 35 are mainly derived from analyses of various time-to-event studies judged to provide relevant information consistent with the model structure. Further details regarding the derivation of, and justification for, these are provided in Appendix 17.
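A common way to turn a cumulative event probability from a time-to-event study into a per-cycle transition probability is to assume a constant hazard over the follow-up period. The sketch below illustrates that conversion for a 4-week cycle. It is not necessarily the derivation used in Appendix 17, and the 40% over 52 weeks input is a made-up example value.

```python
import math


def cycle_probability(cum_prob, period_weeks, cycle_weeks=4):
    """Convert a cumulative probability over period_weeks to a per-cycle probability.

    Assumes a constant event rate (exponential time to event).
    """
    rate = -math.log(1.0 - cum_prob) / period_weeks   # events per week
    return 1.0 - math.exp(-rate * cycle_weeks)


# Hypothetical input: 40% of patients experience the event within 52 weeks.
p_cycle = cycle_probability(0.40, 52)
```

Compounding `p_cycle` over 13 cycles recovers the original 52-week probability, which is a useful check that the conversion is consistent.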
Resource use and costs
The resource use and costs included were those directly incurred by the NHS. The costs of reagents for monitoring trough concentration of anti-TNFs and of antibody-measuring kits, treatment for CD and laparoscopic ileocolic resection were all included in the analysis. Resource use and costs associated with occupying all health states except dead were also included. Unit costs are presented in Table 36. The majority of the cost information used in the analyses was obtained from secondary sources.
The costs of monitoring kits for IFX and for antibodies to IFX were obtained from Theradiag/Alpha Laboratories. In Appendix 18, we present a breakdown of the resource use and costs associated with monitoring kits for IFX and antibodies to IFX. In the models, we used a cost of £39.58 per person for concurrent testing for IFX and antibodies to IFX. In the case of reflex testing, we used a cost of £43.48 for patients in whom testing for IFX was followed by testing for antibodies because the results of the former were negative. For patients in whom a test for IFX was positive, no subsequent antibodies monitoring test was undertaken and, hence, we used a cost of £21.74.
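The per-person testing costs quoted above imply that the expected cost of reflex testing depends on the share of patients in whom drug is absent. The following sketch works through that arithmetic; the function name is hypothetical and the 30% proportion is an illustrative value, not one taken from Tables 33 or 34.

```python
CONCURRENT_COST = 39.58      # drug and antibody tests together
REFLEX_DRUG_ONLY = 21.74     # drug present: no antibody test needed
REFLEX_DRUG_AND_AB = 43.48   # drug absent: antibody test follows


def expected_reflex_cost(p_drug_absent):
    """Expected per-person cost of reflex testing for a given share with drug absent."""
    return ((1.0 - p_drug_absent) * REFLEX_DRUG_ONLY
            + p_drug_absent * REFLEX_DRUG_AND_AB)


# Illustrative share of patients with no detectable drug.
example_cost = expected_reflex_cost(0.30)

# Share with drug absent above which reflex testing costs more than concurrent testing.
break_even = (CONCURRENT_COST - REFLEX_DRUG_ONLY) / (REFLEX_DRUG_AND_AB - REFLEX_DRUG_ONLY)
```

On these unit costs, reflex testing is cheaper per person than concurrent testing unless the proportion with drug absent exceeds `break_even`.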
The costs of maintenance treatment were obtained from the BNF (2013/14).166 The costs of treatment associated with the induction phase (weeks 0–6) were not included. IFX treatment costs comprised its acquisition and administration costs. In the base case, we assumed that patients receiving maintenance therapy receive infusions of IFX 5 mg/kg every 8 weeks and that patients weighed, on average, 70 kg. For IFX maintenance, we derived a cost of £1966.41 (assuming four 100-mg vials at £419.62 each plus administration costs of £287.93 per infusion) every 8 weeks. For patients switching to ADA, we derived a cost of £704.28 (2 × £352.14, assuming 40 mg of ADA is required every 2 weeks) per 4-week cycle. We assumed that patients would self-administer ADA; hence, no administration costs were included.
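The drug cost figures above can be reproduced with a few lines of arithmetic. In this sketch, rounding up to whole vials (no vial sharing) is an assumption that is consistent with four 100-mg vials for a 350-mg dose, and spreading the 8-weekly IFX cost evenly over two 4-week cycles is an assumption made here for illustration.

```python
import math

VIAL_MG = 100
VIAL_COST = 419.62       # per 100-mg vial of IFX
ADMIN_COST = 287.93      # per infusion
ADA_DOSE_COST = 352.14   # per 40-mg dose of ADA


def ifx_infusion_cost(weight_kg=70, dose_mg_per_kg=5):
    """Acquisition plus administration cost of one IFX infusion."""
    vials = math.ceil(weight_kg * dose_mg_per_kg / VIAL_MG)  # whole vials only
    return vials * VIAL_COST + ADMIN_COST


ifx_per_infusion = ifx_infusion_cost()   # every 8 weeks
ifx_per_cycle = ifx_per_infusion / 2     # assumption: spread over two 4-week cycles
ada_per_cycle = 2 * ADA_DOSE_COST        # two doses per 4-week cycle, self-administered
```

With the default arguments, `ifx_per_infusion` matches the £1966.41 quoted above and `ada_per_cycle` matches the £704.28.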
The estimated costs of management (outpatient visits to consultants and further investigations) associated with occupying all health states except the dead state were obtained from NHS Reference Costs 2013 to 2014167 and in consultation with a clinical expert. These health-state costs include outpatient visits, colonoscopy and magnetic resonance imaging. In Table 36, we present the unit costs per year associated with each health state.
Costs obtained from published sources were adjusted to 2013/14 prices using the Hospital and Community Health Service Pay and Price Index170 and future costs were discounted at a rate of 3.5% per annum, as recommended by NICE.
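As an illustration of how a 3.5% annual discount rate can be applied in a model with 4-week cycles, the sketch below discounts each cycle by its elapsed time in years. Whether the model discounted continuously within the year in this way or in annual steps is not stated, so the per-cycle exponent is an assumption, as are the function names.

```python
ANNUAL_RATE = 0.035
CYCLES_PER_YEAR = 13  # 52 weeks divided into 4-week cycles


def discount_factor(cycle, rate=ANNUAL_RATE):
    """Discount factor for a cost or QALY accrued at the given cycle (0 = now)."""
    return 1.0 / (1.0 + rate) ** (cycle / CYCLES_PER_YEAR)


def present_value(cycle_values, rate=ANNUAL_RATE):
    """Sum of per-cycle values, each discounted back to cycle 0."""
    return sum(v * discount_factor(t, rate) for t, v in enumerate(cycle_values))
```

A value accrued at cycle 13, one year in, is multiplied by 1/1.035 under this scheme.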
Outcomes
The outcome measure used in our analyses was the number of QALYs gained. To calculate the estimated QALYs associated with the health states described in the model, we obtained utility weights from published literature157 reported in our review of cost-effectiveness, and combined these utility values with data on life expectancy from the Office for National Statistics.169 Utility values reported in Velayos et al.157 were obtained from the study undertaken by Gregor et al.,20 who compared various elicitation techniques (standard gamble, time trade-off and visual analogue scale) in 180 consecutive CD patients. These authors suggested that the standard gamble technique reflected the true value for health states related to patients with CD, and these values may be the most appropriate for an economic analysis. Table 36 shows the utility weights used in the model. In each cycle of the model, patients will incur a utility pay-off depending on the health state being occupied. In the model, we applied a utility weight of 0.77 for individuals categorised as responders or as having regained response. For those considered to have lost response, we assigned a utility value of 0.62. Those who had undergone a surgical procedure and who remained in the post-surgery health state were assigned a utility weight of 0.86.
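The per-cycle utility pay-off can be illustrated as follows. The utility weights are those stated above; the state labels, the utility of zero for the dead state (the usual convention) and the use of 4/52 of a year per cycle are assumptions of this sketch.

```python
UTILITY = {
    "responder": 0.77,
    "regained_response": 0.77,
    "lor": 0.62,
    "post_surgery": 0.86,
    "dead": 0.0,   # conventional assumption
}
CYCLE_YEARS = 4 / 52


def cycle_qalys(occupancy):
    """QALYs accrued in one 4-week cycle for a given distribution across states."""
    return sum(share * UTILITY[state] for state, share in occupancy.items()) * CYCLE_YEARS


# A cohort split evenly between responders and patients with LOR for one cycle.
mixed = cycle_qalys({"responder": 0.5, "lor": 0.5})
```

Thirteen consecutive cycles spent entirely in the responder state accumulate 0.77 undiscounted QALYs, as expected for one year at a utility of 0.77.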
Analysis
The model was constructed to assess the cost-effectiveness of concurrent testing, reflex testing and no testing of blood levels of anti-TNF-α agents and of antibodies to these agents in patients with severe CD. The model estimated the mean costs and effects associated with each testing strategy, and was simulated over a 10-year time horizon with 4-weekly cycle lengths. The starting point for the responder population was a hypothetical cohort of patients aged 30 years whose disease responds to a maintenance course of TNF-α inhibitor therapy. This age was chosen because the onset of CD is likely to occur from the late teens to age 30 years.171 We define a maintenance course as 5 mg/kg intravenous IFX every 8 weeks. The analysis was undertaken from an NHS perspective in an outpatient care setting, and outcomes were reported as ICERs, expressed in terms of cost per QALY gained.
Sensitivity analysis
In addition to our base-case analysis, we have undertaken a number of sensitivity analyses. These analyses are summarised below:
- undertake concurrent testing and reflex testing every 12 months in the responder and LOR models
- estimate the mean costs and effects associated with each strategy using a 1-year time horizon with 4-week cycle lengths
- in the responders model – three possible modes of one-off testing:
- one-off testing at 3 months followed by yearly retesting
- one-off testing at 3 months and one retest for those who regained response
- one-off testing at 3 months and no retesting for responders/regained response.
- in the LOR model – 3-monthly testing for patients with LOR; no testing for patients who have regained response
- no regain of response following best supportive care (responders)
- no regain of response following best supportive care (LOR).
Probabilistic sensitivity analyses
Probabilistic sensitivity analyses were undertaken to determine the joint uncertainty in key model input parameters of test results and expected QALYs. The PSA was undertaken based on the outcome of cost per QALY only. In PSA, each model parameter is assigned a distribution reflecting the amount and pattern of its variation, and cost-effectiveness results are calculated by simultaneously selecting random values from each distribution. The distributions used in the PSA are presented in Table 36. We have calculated probabilities that each strategy is the most cost-effective, at a willingness to pay of £20,000/QALY.
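The probability that each strategy is the most cost-effective at a given willingness to pay can be computed from PSA output by comparing net monetary benefit (QALYs × willingness to pay − cost) within each simulation run. The sketch below shows that calculation only. In a real PSA each draw comes from re-running the model with sampled input parameters; here `toy_draws` samples costs and QALYs directly from made-up normal distributions for two hypothetical strategies, purely so that the example runs.

```python
import random


def prob_cost_effective(draws, wtp=20_000):
    """Share of draws in which each strategy has the highest net monetary benefit.

    Each draw maps a strategy name to a (cost, qalys) pair.
    """
    wins = {name: 0 for name in draws[0]}
    for draw in draws:
        best = max(draw, key=lambda s: draw[s][1] * wtp - draw[s][0])
        wins[best] += 1
    return {name: n / len(draws) for name, n in wins.items()}


def toy_draws(n, seed=1):
    """Placeholder simulation output; all means and spreads are invented."""
    rng = random.Random(seed)
    return [
        {
            "strategy_a": (rng.gauss(100_000, 5_000), rng.gauss(6.0, 0.2)),
            "strategy_b": (rng.gauss(110_000, 5_000), rng.gauss(6.3, 0.2)),
        }
        for _ in range(n)
    ]


probs = prob_cost_effective(toy_draws(10_000))
```

Repeating the calculation over a range of `wtp` values gives the points of a cost-effectiveness acceptability curve.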
Results of base-case analyses and sensitivity analyses
Here we present the results of the base-case analyses based on the simplifying assumptions made in the model. In the base case, using a hypothetical cohort of adults aged 30 years with severe CD, the results of concurrent testing, reflex testing and no testing (standard practice), in terms of QALYs gained, are presented in Table 37. At the 10-year time horizon, in the responder model, reflex testing resulted in a mean gain of 6.2761 QALYs, with a corresponding mean cost of £138,700. The concurrent testing cohort gained 6.2637 QALYs, with a mean cost of £139,800. The no-testing cohort gained 6.5084 QALYs, with a mean cost of £150,500. These results show that the reflex testing strategy was less costly and produced more QALYs than the concurrent testing strategy, hence dominating concurrent testing. The no-testing strategy was the most costly and the most effective strategy, with an ICER of approximately £50,800 per QALY compared with reflex testing.
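The dominance and ICER statements above follow directly from the reported mean costs and QALYs, as the short calculation below shows. The dictionary keys and function names are hypothetical; the figures are those quoted in the text.

```python
# (mean cost in GBP, mean QALYs) for the responder model at 10 years.
STRATEGIES = {
    "reflex": (138_700, 6.2761),
    "concurrent": (139_800, 6.2637),
    "no_testing": (150_500, 6.5084),
}


def is_dominated(name):
    """True if another strategy costs no more and yields no fewer QALYs (one strictly)."""
    cost, qalys = STRATEGIES[name]
    return any(
        c <= cost and q >= qalys and (c < cost or q > qalys)
        for other, (c, q) in STRATEGIES.items()
        if other != name
    )


def icer(name, comparator):
    """Incremental cost per QALY gained of one strategy over a comparator."""
    cost_a, qalys_a = STRATEGIES[name]
    cost_b, qalys_b = STRATEGIES[comparator]
    return (cost_a - cost_b) / (qalys_a - qalys_b)


no_testing_vs_reflex = icer("no_testing", "reflex")
```

Concurrent testing is flagged as dominated, and `no_testing_vs_reflex` rounds to the approximately £50,800 per QALY reported above (an incremental cost of £11,800 for 0.2323 additional QALYs).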
Table 38 presents the results of the analyses based on an outcome of cost per QALY in the LOR model with testing (concurrent and reflex) undertaken every 3 months. The results show that at the 10-year time horizon the concurrent testing strategy resulted in 6.1807 QALYs, with a corresponding mean cost of approximately £129,400. Reflex testing produced marginally more QALYs at an incremental cost of approximately £94,700 per QALY. The no-testing strategy has a mean cost of approximately £215,800 and costs approximately £84,800 more than reflex testing, with a total effectiveness of 6.4961 QALYs. This result indicates that, in this LOR model, the no-testing strategy is less cost-effective than either reflex or concurrent testing. (Each additional QALY gained by adopting the no-testing strategy compared with reflex testing costs £284,100 in a cohort of patients with LOR.)
Results of sensitivity analyses
We undertook a number of one-way sensitivity analyses to determine the impact on the results of changing key model input parameters (Table 39).
First, in the responder model, we changed the testing frequency from every 3 months to annual testing. The results showed that concurrent testing was the cheapest strategy, with a mean cost of approximately £114,000 and generating 6.2201 QALYs. Reflex testing was marginally more expensive and provided more QALYs, with an ICER of approximately £12,500 per QALY. As expected, the mean cost and effectiveness of the no-testing strategy remained unchanged. A no-testing strategy compared with a reflex testing strategy had an ICER of £129,900 per QALY.
Second, changing the 3-month testing to annual testing in the LOR model resulted in both concurrent and reflex testing being cheaper than the no-testing strategy.
Third, on changing the model time horizon from 10 years to 1 year with 3-month cycles, we found that the no-testing strategy dominated both testing strategies. In the LOR model, the no-testing strategy was the most expensive and most effective strategy, with a mean cost of approximately £23,500 and corresponding QALYs of 0.7560.
Finally, changing the testing regime in the responder model to one-off testing at 3 months followed by yearly testing (for those responding to treatment), at 3 months followed by one retest for those who regained response, and at 3 months and no retesting for responders or those who regained response showed that no testing was more expensive than testing and was more effective. Similar results were obtained for the one-off testing in the LOR model, and when assuming that patients could not regain response following best supportive care.
In further sensitivity analyses, we varied key model input parameters to determine which inputs influence the ICER. Figures 31 and 32 show the percentage change in the cost per QALY as a result of increasing or decreasing these inputs by 10% of the base-case value. The results showed that the model is stable to most of these changes, but is sensitive to a 10% increase in the utility value for patients who regain response in both reflex and concurrent testing.
Results of probabilistic sensitivity analysis and cost-effectiveness acceptability curves
Figure 33 shows the Monte Carlo simulation for the responder model. The scatterplot illustrates the uncertainty in the expected costs and QALYs based on concurrent and reflex testing compared with no testing. Scatterplots of the 10,000 runs of the Monte Carlo simulations show considerable uncertainty around additional expected costs and QALYs.
The results for the responder model are presented in the form of cost-effectiveness acceptability curves in Figure 34. Cost-effectiveness acceptability curves give the probability that a strategy is cost-effective at various values of willingness to pay for a QALY. The willingness-to-pay threshold used by NICE is between £20,000 and £30,000 per QALY. From the information and assumptions used in the model, the results in Figure 34 show that, at £20,000 per QALY, the no-testing strategy has a 92% probability of being cost-effective compared with concurrent and reflex testing.
Summary of cost-effectiveness
In summary, a de novo Markov model was built in TreeAge Pro 2013 to evaluate the cost-effectiveness of test algorithm-based treatment strategies compared with standard care. Two test strategies were assessed: concurrent testing of drugs and of antibodies to the drugs, and sequential or reflex testing (i.e. a drug test first, and then an anti-drug antibody test depending on the results of the drug test). The model structure was informed by studies from the clinical effectiveness review, additional published studies and analysis, and expert clinical advice. The model had a 4-week cycle and a 10-year time horizon and adopted NHS and PSS perspectives. Costs were adjusted to 2013/14 prices and annually discounted at 3.5%. The starting point was a hypothetical cohort of patients aged 30 years. Outcomes are reported as ICERs, expressed in terms of cost per QALY gained. A linked evidence approach was necessary. In this approach, evidence from studies using tests other than the designated intervention tests was employed as a proxy for intervention test evidence. A number of sensitivity analyses were undertaken, including a shortened 1-year time horizon with 4-week cycle lengths, altered transition probabilities for LOR, altering the proportions of patients in the different testing results categories and an arbitrary 10% change in the main input parameters. PSA was also undertaken (10,000 model runs).
Two management studies, both RCTs of reasonable quality, have used treatment algorithms similar to those suggested in the NICE scope. The economic modelling has been built around the algorithms used in these studies. Expert opinion was sought regarding the complex patient pathways followed by patients with CD and the treatment pathways dictated by the algorithms. Populating the model with information from the two management studies was problematic because the studies were of small size and short duration, and reported outcomes that were not directly relevant to an economic model; in addition, one study lacked an appropriate standard care arm for economic modelling and neither reported outcomes according to testing results. Many external sources of data were required to populate the model and refining data inputs from these sources is currently still in progress.
Base-case deterministic and probabilistic model results and sensitivity analysis results have been presented. The results require scrutiny using further investigations for model data inputs and sensitivity analyses, particularly with regard to frequency of testing, so as to test their robustness and to identify the main drivers of the ICER. However, we conclude that QALY gains are likely to be very similar in both arms (concurrent/reflex) whereas the cost of the testing strategy (concurrent/reflex) appears to be more than twice the cost of standard care.
Publication Details
Copyright
Included under terms of UK Non-commercial Government License.
Publisher
NIHR Journals Library, Southampton (UK)
NLM Citation
Freeman K, Connock M, Auguste P, et al. Clinical effectiveness and cost-effectiveness of use of therapeutic monitoring of tumour necrosis factor alpha (TNF-α) inhibitors [LISA-TRACKER® enzyme-linked immunosorbent assay (ELISA) kits, TNF-α-Blocker ELISA kits and Promonitor® ELISA kits] versus standard care in patients with Crohn’s disease: systematic reviews and economic modelling. Southampton (UK): NIHR Journals Library; 2016 Nov. (Health Technology Assessment, No. 20.83.) Chapter 4, Cost-effectiveness review and health economic modelling.