NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Tikhonova IA, Yang H, Bello S, et al. Enzyme-linked immunosorbent assays for monitoring TNF-alpha inhibitors and antibody levels in people with rheumatoid arthritis: a systematic review and economic evaluation. Southampton (UK): NIHR Journals Library; 2021 Feb. (Health Technology Assessment, No. 25.8.)
The assessment of whether the economic analysis conducted by the EAG meets the NICE Reference Case requirements is summarised in Table 81, Appendix 25.
Methods
Summary of available evidence
The treatments and ELISA kits that were used in the studies that were included in the clinical effectiveness systematic review are shown in Table 22.
The clinical evidence identified in the systematic review was limited:
- No studies related to IDKmonitor, LISA-TRACKER and RIDASCREEN test kits were identified.
- No studies investigating the use of TDM in RA patients treated with two drugs from the NICE scope, the TNF-α inhibitors GLM and CTZ, were found.
In addition, it was not clear whether originator products or their biosimilars were used in the selected studies, and the type of testing, concurrent or reflex, was not reported. Furthermore, in Pascual-Salcedo et al.44 it was not clear which type of test kits (MabTrack or those developed by Sanquin) were used by Sanquin Diagnostic Services to measure drug and antibody levels (see Table 22). Finally, no studies on TNF-α testing in primary and secondary non-responders were found.
Two studies included in the review, the non-randomised controlled trial INGEBIO42,43,45 and a historical control study reported by Pascual-Salcedo et al.,44 considered people in remission or with LDA at baseline. The study populations were mixed, with the proportion of participants with RA being only 37% and 49%, respectively, in the INGEBIO study and the study by Pascual-Salcedo et al.44 The patient population considered in the latter was relatively small (43 patients), whereas the former was a larger study with 169 study participants.
The only head-to-head trial identified in the review, the INGEBIO study, investigated the clinical and economic effects of TDM in patients treated with ADL. In this study, physicians were not obliged to follow any specific test-based therapeutic algorithm but could use testing to alter the treatment dose in participants from the intervention arm. The longest average follow-up, 530.8 days and 544.6 days in the intervention and control arms, respectively, was reported by Arango et al.43 Some of the aggregate clinical outcomes from the INGEBIO study are shown in Table 23.
Search for additional clinical effectiveness evidence
Studies that were identified by the searches conducted for the clinical effectiveness systematic review but not considered eligible for inclusion (e.g. studies reporting correlations between drug/antibody levels and therapeutic outcomes, and/or studies reporting drug/antibody levels before and after dose reductions only) were used to inform the model where appropriate.
Owing to the lack of RCT evidence on the effectiveness of the tests that are defined within the NICE scope,48 an additional systematic literature review was conducted to identify RCTs that evaluated any tests used to monitor TNF-α inhibitor treatment in people with RA. The aim of this search was to identify any evidence on the effectiveness of any strategies of treatment monitoring that could be used to inform scenario analyses for modelling.
Searches were carried out in MEDLINE, MEDLINE In-Process, EMBASE, the Cochrane Library and Web of Science. Searches were limited to RCTs and were carried out in October 2018. The search strategy and inclusion criteria are provided in Appendix 9. A total of 1418 hits were identified and were independently screened by two reviewers using the inclusion criteria shown in Appendix 9, Table 54. No relevant papers were identified.
Economic analyses
The outcomes from the only head-to-head trial that was identified in the systematic review, the INGEBIO study, were utilised in all analyses reported here. In the INGEBIO study, both mean time in remission and mean time in remission or LDA were estimated in patients from the intervention and control arms. Therefore, two separate scenario analyses (scenario 1 and scenario 2) based on alternative health state descriptions were conducted: the health states considered in scenario 1 were ‘remission’ and ‘LDA/active disease’ and the health states modelled in scenario 2 were ‘remission/LDA’ and ‘active disease’. The duration of the complementary health states (‘LDA/active disease’ in scenario 1 and ‘active disease’ in scenario 2) was estimated using the duration of follow-up.
Modelling approach
The choice of the modelling approach was primarily driven by the availability and quality of the evidence that was identified in the clinical effectiveness systematic review; other factors included the multifactorial nature of decisions to adjust treatments in people with RA34 and the recent changes in the biologics market, which contributed to the uncertainty in the prices of the TNF-α inhibitors and their uptake in the UK.
The biologics market is likely to increase in complexity over the coming months and years as more originator biological medicines lose patent exclusivity and additional biosimilar medicines come to the market.61 The patent for ADL (Humira) expired on 16 October 2018 (when this assessment was carried out). New medications with similar active properties (‘biosimilar’ versions) are likely to become available in the NHS at the end of 2018 (see Table 2). The following ADL biosimilars have already been approved for use in the UK but have not yet launched (as of 30 November 2018): Amgevita, Hulio, Hyrimoz and Imraldi. According to the Regional Medicines Optimisation Committee Briefing,62 at least two further biosimilars are expected to become available in the UK during 2019: Cyltezo (from Boehringer Ingelheim) and another that will be brought to the market by Fresenius Kabi (Bad Homburg, Germany).
The NHS has established a working group to oversee implementation of the use of best-value ADL, using a commissioning framework that was launched in September 2017.63 The framework, authored by the NHS’s Medicines, Diagnostics, Personalised Medicine Policy Team, proposes that ‘at least 90% of new patients be prescribed the best value biological medicine’ within 3 months of the launch of a biosimilar for a given reference product, and that 80% of existing patients be prescribed the ‘best value medicine within 12 months of a biosimilar launch’.63
With regard to the current uptake of biosimilars in the UK, according to the Medicines Optimisation Dashboard data published by NHS England (September 2018 release),64 92% of people prescribed IFX and 85% of those prescribed ETN are taking biosimilars. However, there are regional variations in the uptake of biosimilars.64 In the Royal Devon & Exeter NHS Foundation Trust, people with RA who have been prescribed IFX or ETN are usually given their biosimilars, whereas biosimilars for ADL have become available only recently; patients prescribed GLM or CTZ are treated mostly with the originator products (Dr Haigh, Royal Devon and Exeter NHS Foundation Trust, Exeter, November 2018, personal communication). In the Greater Manchester area, the biosimilar Amgevita is soon to be used for patients who are prescribed ADL; patients who are prescribed IFX are usually given its biosimilars, Inflectra or Remsima; and a biosimilar Benepali is used in some patients who are prescribed ETN (Dr Jani, University of Manchester, personal communication, November 2018).
Although the NICE guidance23 recommends that people with RA receive treatment with the TNF-α inhibitor with the lowest acquisition and administration costs, in practice other non-cost factors, such as patient and hospital characteristics, and changes in regional rheumatology clinical guidelines, may influence the choice of treatment.17
Analyses conducted
Threshold and cost–utility analyses based on a decision tree model (described in Model structure) were conducted to estimate the economic outcomes of adding TDM to SOC for RA patients who were treated with TNF-α inhibitors.
Threshold analyses
In the threshold analyses, the cost of TNF-α testing at which adding TDM to SOC would result in zero net monetary benefit (NMB) was estimated, as described below.
The NMB represents the value of an intervention in monetary terms when a willingness-to-pay (WTP) threshold for a unit of benefit (e.g. QALY) is known. It is estimated by first assuming a WTP threshold (e.g. £20,000 or £30,000 per QALY gained) and then calculating the NMB as follows:
$$\text{NMB} = \text{incremental benefits} \times \text{WTP threshold} - \text{incremental costs},$$
where incremental costs and incremental benefits represent incremental costs and QALYs for the health technologies under consideration.
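The NMB calculation described above can be illustrated with a minimal Python sketch that uses the standard definition (WTP threshold multiplied by incremental QALYs, minus incremental costs). The function name and the input values are hypothetical and are chosen for illustration only; they are not results from this assessment.

```python
def net_monetary_benefit(delta_qalys: float, delta_costs: float, wtp: float) -> float:
    """NMB = WTP threshold x incremental QALYs - incremental costs."""
    return wtp * delta_qalys - delta_costs


# Hypothetical inputs, for illustration only (not results from the assessment).
example_delta_qalys = 0.01    # incremental QALYs, intervention versus control
example_delta_costs = -150.0  # incremental costs in GBP (negative means a saving)

# The two WTP thresholds named in the text.
for wtp in (20_000, 30_000):
    nmb = net_monetary_benefit(example_delta_qalys, example_delta_costs, wtp)
    print(f"WTP £{wtp:,} per QALY: NMB = £{nmb:,.2f}")
```

A positive NMB indicates that, at the chosen threshold, the monetary value of the health gain exceeds the additional cost.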
In this study, NMB was estimated for a range of acquisition costs of ADL (from £1000 to £9187 per patient-year) at the WTP thresholds of £20,000 and £30,000 per QALY gained, which are the thresholds usually considered by NICE. In the threshold analyses, the costs of drug acquisition and administration and the costs associated with disease management were included; the latter comprised the costs of managing flares and AEs, and the costs of managing different health states. QALYs were estimated from the rates of flares and AEs, and the average duration of remission and LDA/active disease health states (for the analysis based on data from Ucar et al.42) or remission/LDA and active disease health states (for the analysis based on data from Arango et al.43) in patients from the intervention and control arms.
In the threshold analyses, the cost of TNF-α testing per patient-year, under which the test-based treatment strategy has zero NMB, was estimated in the following way:
$$\text{total cost of testing} = \text{ICER threshold} \times \Delta\text{QALYs} - \Delta\text{costs},$$
where the total cost of testing comprises the costs associated with testing patient blood samples to monitor trough drug and antibody levels. The ICER threshold represents the NICE WTP of £20,000 or £30,000 per QALY gained, and Δcosts and ΔQALYs are incremental costs (excluding the cost of testing) and QALYs across the intervention and control arms.
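As an illustration of the threshold calculation, the sketch below solves the zero-NMB condition for the cost of testing, on the assumption that Δcosts excludes testing. The function names and all input values are hypothetical; the second function is included only to confirm that the NMB is zero at the computed cost.

```python
def threshold_testing_cost(delta_qalys, delta_costs_excl_testing, icer_threshold):
    """Cost of testing at which the test-based strategy has zero NMB."""
    return icer_threshold * delta_qalys - delta_costs_excl_testing


def nmb_with_testing(delta_qalys, delta_costs_excl_testing, testing_cost, icer_threshold):
    """NMB once the cost of testing is added to the other incremental costs."""
    total_delta_costs = delta_costs_excl_testing + testing_cost
    return icer_threshold * delta_qalys - total_delta_costs


# Hypothetical inputs, for illustration only.
delta_qalys = 0.005   # incremental QALYs
delta_costs = -400.0  # incremental non-testing costs in GBP (a saving)

for threshold in (20_000, 30_000):
    cost = threshold_testing_cost(delta_qalys, delta_costs, threshold)
    check = nmb_with_testing(delta_qalys, delta_costs, cost, threshold)
    print(f"Threshold £{threshold:,}: testing cost £{cost:,.2f}, NMB at that cost £{check:,.2f}")
```

Any testing cost below the computed value would give a positive NMB for the test-based strategy under these inputs.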
The costs incurred in each arm were estimated as follows:
For scenario 1 (with ‘remission’ and ‘LDA/active disease’ health states), QALYs were derived in the following way:
For scenario 2, QALYs were estimated from the duration of ‘remission/LDA’ and ‘active disease’ health states and their corresponding utilities.
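The sketch below shows one way in which QALYs could be assembled from health state durations and event rates. It is not the EAG’s equation: the subtraction of expected flare and SAE decrements, the utility and disutility values, and the function names are all assumptions made for illustration. The follow-up of 505 days, the flare duration of 7 days and the SAE duration of 28 days are taken from the text.

```python
DAYS_PER_YEAR = 365.25


def health_state_qalys(days_in_better_state, follow_up_days, u_better, u_worse):
    """QALYs from two complementary health states over the follow-up period.

    Time in the complementary (worse) state is follow-up minus time in the better state.
    """
    days_in_worse_state = follow_up_days - days_in_better_state
    return (days_in_better_state * u_better + days_in_worse_state * u_worse) / DAYS_PER_YEAR


def event_qaly_loss(rate_per_patient_year, follow_up_days, disutility, event_duration_days):
    """Expected QALY loss from events (flares or SAEs) occurring at a constant rate."""
    expected_events = rate_per_patient_year * follow_up_days / DAYS_PER_YEAR
    return expected_events * disutility * event_duration_days / DAYS_PER_YEAR


# Hypothetical utilities, disutilities and rates; durations as stated in the text.
follow_up = 505.0
qalys = health_state_qalys(days_in_better_state=300.0, follow_up_days=follow_up,
                           u_better=0.80, u_worse=0.60)
qalys -= event_qaly_loss(0.5, follow_up, disutility=0.10, event_duration_days=7)
qalys -= event_qaly_loss(0.03, follow_up, disutility=0.15, event_duration_days=28)
print(f"Illustrative QALYs over follow-up: {qalys:.4f}")
```

For scenario 2 the same functions would apply, with ‘remission/LDA’ as the better state and ‘active disease’ as the complementary state.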
Cost-effectiveness analyses
In addition to the threshold analyses, cost–utility analyses were conducted in which ICERs were estimated using the list prices of biologics and the cost of TNF-α testing; the latter was based on the prices of the Promonitor test kits (provided by Grifols–Progenika), and the other costs associated with TNF-α testing37 and clinical advice.
Model structure
A diagram of the decision tree model that was used in the threshold and cost–utility analyses is presented in Figure 4.
As shown in Table 23, approximately one-third of patients in the treatment and comparator arms of the INGEBIO study had their ADL dose tapered; flares were observed in patients from both arms. The effect of flares on costs and QALYs was modelled following Gavan.17 Figure 5 (adapted from Gavan17) illustrates the cost and QALY profile depending on whether or not the dose is tapered. The figure shows changes in the acquisition cost and QALYs due to flares over time. Note that, for the sake of clarity, the other components of the total costs and QALYs considered in our analyses are not depicted here.
As shown in Figure 5, all patients had their drug levels tested at t0, resulting in tapering of the dose in some patients (see Figure 5a). It was assumed that a proportion (p) of patients on tapered doses experienced a flare at t1, prompting treatment to revert to the original dose, whereas in the remainder (1 – p) the dose remained the same (i.e. tapered). In those patients who flared, the disutility of flare (qf) was applied for the duration of flare (t2 – t1) (see Figure 5a). In non-tapered patients (see Figure 5b), the acquisition cost was based on the cost of the full dose; it was assumed that non-tapered patients do not experience flares.
In clinical practice, flares have been observed in patients receiving full and reduced doses of the biologics, with an increased risk of flares in tapered patients.65 In the economic analysis, however, the occurrence of flares was modelled in all patients regardless of their treatment dose (see Figure 5a) given that the flare rates reported in the INGEBIO study were not stratified by dose.
The estimates of the mean time to the first flare were used to model the time when the dose in flared patients was restored to the full dose indefinitely (which affected the drug acquisition costs and wastage), whereas the flare rates were used to estimate the cost of flare management and the reduction in QALYs due to occurrence of flares in the treatment and control arms. It was also assumed that flares could occur in any health state.66
Population
The modelled population comprised patients in remission or LDA. The baseline characteristics of participants in the INGEBIO study are presented in Table 24 along with the characteristics of RA patients from the BSRBR-RA database67 who responded to biological treatment.
As shown in Table 24, patients in the INGEBIO study were slightly younger, on average, than patients from the BSRBR-RA database, and were considerably less likely to be female.
Subgroups
People with RA can be grouped according to three clinical scenarios: primary non-response, secondary non-response and remission. However, with regard to particular characteristics, there are no subgroups for which the clinical effectiveness of TDM is expected to significantly vary; therefore, no subgroup analyses were considered in this assessment.
Interventions and comparators
Owing to the paucity of data, not all test kits specified in the NICE scope could be evaluated in this study. In particular, no economic analyses relevant to IDKmonitor, LISA-TRACKER, RIDASCREEN, MabTrack ELISA kits and those used by Sanquin Diagnostic Services were conducted. The only test kits considered were Promonitor assays for measuring trough ADL and antibody levels (see Table 22).
The comparator was SOC, in which treatment decisions were based on clinical judgements and other measures (such as DAS28), that is without the use of TDM.
Perspective, time horizon and discounting
The costs and resource use were considered from the perspective of the NHS and Personal Social Services.69 Cost and health outcomes were not extrapolated into the future because the lack of long-term evidence means that external validation of extrapolated outcomes would not be feasible; therefore, no discounting was applied to estimated costs and QALYs.
The time horizon was defined by the observational period in the INGEBIO study, namely 505 days and 544.6 days for the analyses based on Ucar et al.42 and Arango et al.,43 respectively. The comparator arm, as reported in Ucar et al.42 and Arango et al.,43 had slightly longer follow-ups; therefore, the mean duration of follow-up in patients from the comparator arm was used as the time horizon in the economic analyses based on these sources. The estimates of the mean duration of remission (scenario 1) and remission/LDA (scenario 2) for the intervention arm were not adjusted to account for such a difference given that the Kaplan–Meier estimates for time in remission were not available to the EAG; therefore, it is possible that the cost-effectiveness of the intervention under consideration was underestimated. However, owing to a small difference (of about 1–2%) in the length of follow-up periods between the treatment and the comparator arm, this simplifying assumption is likely to have only a small impact on the results.
Considerations in the development of the independent economic assessment
Flares
The concept of flare remains challenging to understand, as there are no generally recognised definitions of or well-validated measures for flare in RA.70 Nevertheless, patients, clinicians and scientists commonly resort to this term to refer to episodes of worsening disease activity, which includes a range of symptoms of different duration and magnitude.71
Several different RA flare criteria have been used in clinical research. For instance, van der Maas et al.47 identified six previously published DAS28-based flare criteria, and Markusse et al.72 reported three criteria (Table 25).
Smolen et al.66 compared RA patients treated with ETN recruited in the PRESERVE trial who did or did not have flares. In this trial, a disease flare was defined as either loss of LDA, with or without a change in DAS28 of 0.6, or relapse (DAS28 of > 5.1 or DAS28 of > 3.2 at two or more consecutive time points).
In the INGEBIO study, a flare was defined as an increase in DAS28 of > 1.2 or an increase in DAS28 of > 0.6 if the current DAS28 was ≥ 3.2.
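The INGEBIO flare criterion can be expressed as a short Python check. The sketch assumes that the increase is measured against the previous DAS28 assessment, which the text does not specify; the function name is hypothetical.

```python
def is_ingebio_flare(previous_das28: float, current_das28: float) -> bool:
    """Flare: DAS28 increase > 1.2, or increase > 0.6 when current DAS28 >= 3.2.

    Assumption: the increase is measured relative to the previous assessment.
    """
    increase = current_das28 - previous_das28
    return increase > 1.2 or (increase > 0.6 and current_das28 >= 3.2)


# Illustrative DAS28 pairs (previous, current).
for prev, curr in [(2.0, 3.3), (2.7, 3.4), (2.0, 2.7), (3.0, 3.3)]:
    print(prev, curr, is_ingebio_flare(prev, curr))
```

The first pair meets the > 1.2 criterion, the second meets the > 0.6 criterion with a current DAS28 of at least 3.2, and the last two pairs do not meet either criterion.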
Duration of flare
Substantial heterogeneity in the duration of flare has been reported70 and observed in clinical practice. A flare may last from 2–3 days up to 2–3 months, depending on severity (Dr Jani, personal communication). The duration of flare was estimated in the dynamic cohort in the Brigham Rheumatoid Arthritis Sequential Study (BRASS),70 which included 1105 people with established RA who had received usual care at the Brigham and Women’s Hospital in Boston70 (Table 26).
The estimate of 7 days was adopted in the primary analyses based on Ucar et al.42 and Arango et al.;43 this was consistent with the estimate used in NICE TA375.23 The impact on the results of a longer duration of flare, 19 days, was evaluated in scenario analyses; this represents a weighted average of the estimates reported in the BRASS study70 (see Table 26) and those provided by Dr Jani (personal communication).
Time to the first flare
Arango et al.43 and Ucar et al.42 reported the median time to the first flare that was observed in the intervention and control arms of the INGEBIO study; however, according to the NICE Guide to the Methods of Technology Appraisal,69 mean estimates should be utilised in economic analyses of health interventions. The mean time to the first flare in the intervention and control arms was calculated from Kaplan–Meier curves for the time to the first flare in the INGEBIO study, sourced from a poster presentation by Ucar and colleagues at the Annual European Congress of Rheumatology EULAR 2017 (Ucar and Osakidetza, personal communication, September 2018), by using the area under the curve approach.
The Kaplan–Meier estimates were available for 300 days (see Figure 8, Appendix 10) and were extrapolated for the duration of follow-up reported in Ucar et al.42 and Arango et al.43 (see Table 23). Given that the proportion of participants who were on a tapered dose in the intervention and control arms levelled at around 240 days after dose tapering, it was assumed that these proportions remained the same until the end of the observational periods in the INGEBIO study and, therefore, no parametric model fitting was performed. The estimated mean time to the first flare was 208.07 days and 189.32 days in the intervention and control arms, respectively. These values were used in both scenario 1 (based on Ucar et al.42) and scenario 2 (based on Arango et al.43).
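The area under the curve approach, with the last observed Kaplan–Meier estimate carried forward to the end of follow-up, can be sketched as follows. The function name and the curve coordinates are hypothetical and do not reproduce the INGEBIO data; the curve is treated as a right-continuous step function.

```python
def mean_time_from_km(times, survival, horizon):
    """Area under a step Kaplan-Meier curve from day 0 to the horizon.

    times:    ascending time points in days, starting at 0
    survival: survival[i] is the estimate that holds from times[i] to the next time point
    The last estimate is carried forward unchanged up to the horizon.
    """
    if not times or times[0] != 0 or len(times) != len(survival):
        raise ValueError("times must start at 0 and match survival in length")
    area = 0.0
    for i, (t, s) in enumerate(zip(times, survival)):
        if t >= horizon:
            break
        t_next = times[i + 1] if i + 1 < len(times) else horizon
        area += s * (min(t_next, horizon) - t)
    return area


# Hypothetical curve: estimates available to day 200, horizon of 300 days.
km_times = [0, 100, 200]
km_survival = [1.0, 0.8, 0.5]
print(mean_time_from_km(km_times, km_survival, horizon=300))
```

Because no parametric model is fitted, the result is a restricted mean: it depends on the chosen horizon and on the assumption that the curve stays flat after the last observed point.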
Flare rate
Treatment arm-specific flare rates (per patient-year) were reported in both Ucar et al.42 and Arango et al.43 (see Table 23), and were the same in both sources despite the fact that the abstracts reported outcomes for different follow-up periods. These estimates were utilised in the primary and exploratory analyses.
Serious adverse events
When modelling the effect of AEs on patients’ HRQoL and costs, the approach used in TA37523 was adopted: it was assumed that only serious adverse events (SAEs) (serious infections in particular) would carry a significant cost and disutility burden.23 This assumption was considered appropriate by the EAG’s clinical advisors.
Rate of serious adverse events
One study from the clinical effectiveness systematic review, Senabre Gallego et al.,73 reported the rate of AEs experienced by patients who were treated with TNF-α inhibitor therapies. This study recruited 39 participants with RA who had achieved remission. The findings showed that one participant (3%) had septic arthritis (serious infectious arthritis) that was associated with TNF-α inhibitor therapies (ADL or ETN) in the 1-year follow-up period (see Table 55, Appendix 11).
Given that the evidence on SAEs in the population of interest was limited, additional searches were conducted. Lahiri and Dixon74 indicated that the risk of serious infections in people with RA who were treated with biologics was time dependent, with the maximum risk in the first 6 months of biological therapy and a gradual decline thereafter. The authors argued that this time-dependent decrease in the risk of serious infections can be attributed both to ‘depletion of susceptibles’ (i.e. high-risk participants dropping out of the TNF-α inhibitor cohort because of death, stopping therapy or loss to follow-up), which accounted for two-thirds of the observed difference, and to a reduction in the inherent infection risk resulting from an improvement in patients’ functional status and a decrease in the dose of glucocorticoid.
According to Bruce et al.,75 the risk of Pneumocystis jirovecii pneumonia in people from the BSRBR-RA register who were treated with TNF-α inhibitors was low, with an incidence rate of 2 (95% CI 1.2 to 3.3) events per 10,000 person-years of follow-up (see Table 55, Appendix 11); the rate of tuberculosis was higher among those treated with ADL (144 events per 100,000 person-years) and IFX (136 events per 100,000 person-years) than among those treated with ETN (39 events per 100,000 person-years) (see Table 55 and Dixon et al.76).
The rate of SAEs reported in Burmester et al.77 was 4.7 per 100 patient-years (see Table 55, Appendix 11). This estimate was derived from 15,132 people with RA who were exposed to ADL in 28 global clinical trials. An SAE was defined as a fatal or immediately life-threatening event; an event necessitating hospitalisation or prolonging hospitalisation; an event resulting in persistent or significant disability/incapacity or congenital anomaly; or an event necessitating medical or surgical intervention to prevent a serious outcome. At baseline, participants considered in this study had a mean age of 53.5 years and a mean disease duration of 9.1 years; 78.8% were female, 16.5% had a treatment duration > 2 years and 10.9% had a treatment duration > 5 years.
The rate of serious infection that was adopted in TA375,23 35 out of 1000 patients, was based on Singh et al.78 and was assumed to be independent of the bDMARDs used (i.e. all biological therapies were assumed to have similar safety profiles).
Consultation with clinical advisors confirmed that serious infections in people with RA from the population of interest are relatively rare.
In this study, the modelled AE rate for people who were receiving a full dose of biologics, three events per 100 patient-years, was adopted from Senabre Gallego et al.73 The AE rate in tapered patients was estimated using an odds ratio (OR) for serious infections in people who were treated with low-dose biologics compared with people who were receiving the standard dose79 (see Appendix 12). The resulting AE rate in tapered patients was two events per 100 patient-years.
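A generic way of applying an OR to a baseline risk is sketched below. The OR used by the EAG is given in Appendix 12 and is not reproduced here, so the value in the example is a hypothetical placeholder; treating the rate of three events per 100 patient-years as an annual probability of 0.03 is also an assumption, as is the function name.

```python
def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Convert a probability to odds, scale by the OR and convert back."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


full_dose_risk = 0.03  # three events per 100 patient-years, read as an annual probability
placeholder_or = 0.5   # hypothetical OR, NOT the value from Appendix 12

tapered_risk = apply_odds_ratio(full_dose_risk, placeholder_or)
print(f"Illustrative risk in tapered patients: {tapered_risk:.4f}")
```

At risks this low the odds and the probability are nearly equal, so the adjusted risk is close to the baseline risk multiplied by the OR.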
Duration of serious adverse events
In TA375,23 a serious infection in RA patients was assumed to persist for 28 days, on average. This estimate was adopted in all analyses reported here.
Model parameters
The major model assumptions in the primary analyses based on Ucar et al.42 (scenario 1) and Arango et al.43 (scenario 2) were as follows:
- Adalimumab dose tapering is implemented by increasing the interval between doses from 2 to 3 weeks.
- Dose is tapered in a proportion of people in each arm at the start of simulation.
- The full dose of ADL is restored indefinitely in all people who are on tapered doses when they experience the first flare.
The model assumptions for the primary analyses are shown in Table 27.
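The three assumptions listed above imply a simple expression for the expected drug acquisition cost per patient, sketched below. This is a simplification made for illustration: the mean time to the first flare is applied to every tapered patient as the time spent on the tapered dose, and the function name is hypothetical. The annual cost (£9187), the follow-up (505 days), the mean time to the first flare in the intervention arm (208.07 days) and the dose fraction (two-thirds, from spacing doses at 3 rather than 2 weeks) are taken from the text; the proportion tapered is set to one-third as an approximation.

```python
DAYS_PER_YEAR = 365.25


def expected_acquisition_cost(annual_full_cost, horizon_days, proportion_tapered,
                              dose_fraction_when_tapered, mean_days_to_first_flare):
    """Expected acquisition cost per patient over the time horizon.

    Tapered patients pay the reduced daily cost until the (mean) first flare and the
    full daily cost afterwards; non-tapered patients pay the full cost throughout.
    """
    daily_full = annual_full_cost / DAYS_PER_YEAR
    days_tapered = min(mean_days_to_first_flare, horizon_days)
    cost_not_tapered = daily_full * horizon_days
    cost_tapered = daily_full * (dose_fraction_when_tapered * days_tapered
                                 + (horizon_days - days_tapered))
    return (proportion_tapered * cost_tapered
            + (1.0 - proportion_tapered) * cost_not_tapered)


cost = expected_acquisition_cost(
    annual_full_cost=9187.0,           # ADL (Humira) list price per patient-year
    horizon_days=505.0,                # follow-up used with Ucar et al.
    proportion_tapered=1.0 / 3.0,      # approximately one-third were tapered
    dose_fraction_when_tapered=2.0 / 3.0,
    mean_days_to_first_flare=208.07,   # intervention arm estimate
)
print(f"Illustrative expected acquisition cost: £{cost:,.2f}")
```

Wastage, testing and disease management costs are deliberately left out of this sketch.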
Utilities for the mixed-disease population in the INGEBIO study were assumed to be the same as those for the population of people with RA because no evidence on HRQoL directly relevant to the population considered in INGEBIO was identified. Mortality associated with RA was not modelled because of the short time horizon of approximately 18 months adopted in this study.
Resources and costs
Costs considered in the economic evaluation included the cost of testing, and treatment and health-care costs. Unit costs were obtained from the BNF,21 NHS Reference Costs,84 documents provided by test manufacturers and published and unpublished sources.
Parameters specific to the threshold analyses
Given that the patent for the ADL originator product (Humira) expired in October 2018 and the true costs of the ADL biosimilars to the NHS were not known to the EAG at the time of writing, in the threshold analyses the annual acquisition cost was varied from £1000 to £9187 per patient-year. The latter represents the annual cost of ADL (Humira), assuming a dose of 40 mg every 2 weeks delivered by subcutaneous injection using a prefilled pen and the NHS indicative price from the BNF21 (Table 28).
Parameters specific to the cost-effectiveness analyses
In the cost–utility analyses, ICERs were estimated using the list prices of the TNF-α inhibitors (in accordance with NICE guidelines69) and the costs of testing based on the costs of Promonitor assays (provided by Grifols–Progenika), other testing costs outlined in the study conducted by Jani et al.37 and clinical advice (see Table 28 and Cost of testing for more details).
The primary analyses were conducted for the list price of the ADL originator product, Humira®, whereas, in the exploratory analyses for other TNF-α inhibitors described in Exploratory analyses: etanercept or infliximab and Promonitor, the list prices for the ETN originator product (Enbrel®) and its biosimilar Erelzi® and IFX biosimilars Flixabi® or Renflexis® were utilised.
Conversion to Great British pounds
Where conversion from other currencies to Great British pounds (GBP) was required, International Monetary Fund purchasing power parity (PPP) was used to convert within the year (e.g. from 2001 euro to 2001 GBP), after which inflation was applied. The Campbell and Cochrane Economic Methods Group–EPPI-Centre (Evidence for Policy and Practice Information and Coordinating Centre) Cost Converter was used for the PPP conversion.86
Inflation to 2017–18 prices
Unit costs were inflated to 2017–18 prices by inflating to 2015–16 prices using the Hospital and Community Health Services (HCHS) pay and prices index,81 and then to 2017–18 prices using the average increase in the index for the previous 3 years (from 2013–14 to 2015–16), with the average rate of 1.1% per annum (see Appendix 14).
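The two-step inflation described above can be sketched as follows. The HCHS index values in the example are hypothetical placeholders (the values used are in Appendix 14), and the function name is an assumption; the 1.1% annual rate and the 2 years from 2015–16 to 2017–18 follow the text.

```python
def inflate_to_2017_18(cost, index_cost_year, index_2015_16,
                       annual_rate=0.011, years_after_2015_16=2):
    """Inflate a unit cost to 2015-16 with the HCHS index, then to 2017-18 at 1.1% a year."""
    cost_2015_16 = cost * index_2015_16 / index_cost_year
    return cost_2015_16 * (1.0 + annual_rate) ** years_after_2015_16


# Hypothetical index values, for illustration only.
example = inflate_to_2017_18(cost=100.0, index_cost_year=290.0, index_2015_16=297.0)
print(f"£100.00 in the cost year is about £{example:.2f} in 2017-18 prices")
```

Where a cost originates in another currency, the PPP conversion within the original cost year would be applied before this step.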
Treatment costs
Drug acquisition
Annual acquisition costs of the TNF-α inhibitors from the NICE scope,48 estimated using their list prices and assuming adherence to the standard dosing regimen for each drug, are shown in Table 28.
The estimated costs of treatment with ADL, ETN, GLM and CTZ were based on the price of the solution for injection in prefilled pens given that these biologics are administered subcutaneously and can be self-administered. Consultation with clinical experts confirmed that all of the TNF-α inhibitors considered in this study, except IFX, are usually self-administered by people with RA at home.
Consistent with acquisition cost calculations in TA375,23 the cost per annum of IFX was estimated using a patient average weight of 70 kg. IFX is administered intravenously (the cost of intravenous administration is described in Drug administration).
As reported in TA375,23 the manufacturers of GLM provided the 100-mg dose at the same price as the 50-mg dose under a patient access scheme arrangement. This discount would not affect the annual cost presented in Table 28, as that is based on the assumption that the average patient weight is < 100 kg.
The acquisition costs of the cheapest available pens for each drug are equivalent to the cost of the cheapest available dose. Therefore, the annual acquisition costs for the self-administration route are equivalent to those for biologics administered during outpatient visits.
Of note, the estimates for the additional acquisition costs for the first year (see Table 28) are presented for information only. They were not used in any analyses given that the population in this assessment comprises people who are already experienced with biologics.
Dose tapering
According to EULAR recommendations for the management of RA with sDMARDs and bDMARDs,66 tapering of biologics should be considered in people who are in persistent remission after having tapered glucocorticoids, especially if this treatment is combined with a conventional sDMARD. In this context, tapering means reduction of the dose, for example reducing ETN from 50 mg/week to 25 mg/week,87 or increasing the interval between applications (‘spacing’), for example increasing the interval between ADL injections from 2 weeks to 3 weeks, as in the Exeter Biologic Clinic recommendations (see Appendix 13).
The EAG is aware that there is no gold standard on how dose tapering should be carried out. Studies evaluating dose tapering have used different approaches. In clinical practice, dose tapering varies extensively depending on clinical opinion; for example, according to the Exeter Biologic Clinic recommendations (see Appendix 13), the ADL dose should first be reduced by one-third, to 40 mg every 3 weeks, and then reduced further at 3 months, to 40 mg every 4 weeks, in people with LDA or remission. However, this may not be a representative strategy because of variations in clinical practice.
In the primary analyses, the assumption of reducing the dose by one-third (the first dose reduction in the Exeter Biologic Clinic recommendations; see Appendix 13) was implemented (see Table 27), while the assumption of halving the dose (the second dose reduction described in Appendix 13) was explored in sensitivity analyses (see Table 39).
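The effect of spacing on the annual acquisition cost can be illustrated with the sketch below, in which the cost is assumed to scale with the number of 40-mg doses given per year. The function name is hypothetical; the full-dose annual cost of £9187 and the dosing intervals are taken from the text.

```python
def tapered_annual_cost(full_annual_cost, standard_interval_weeks, tapered_interval_weeks):
    """Annual cost when the same dose is given at a longer interval (spacing)."""
    return full_annual_cost * standard_interval_weeks / tapered_interval_weeks


full_cost = 9187.0  # ADL (Humira), 40 mg every 2 weeks, per patient-year

# Primary analyses: every 3 weeks (dose reduced by one-third).
print(f"Every 3 weeks: £{tapered_annual_cost(full_cost, 2, 3):,.2f}")
# Sensitivity analyses: every 4 weeks (dose halved).
print(f"Every 4 weeks: £{tapered_annual_cost(full_cost, 2, 4):,.2f}")
```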
Treatment wastage
The dose-tapering strategy suggested in the Exeter Biologic Clinic recommendations (see Appendix 13) is spacing; therefore, when this tapering strategy is used, there is no wastage of the self-administered drugs resulting from partial use of the dose in the prefilled injection pens. Clinical advice indicated that wastage of IFX owing to partial use of vials is usually avoided (Dr Haigh, Royal Devon & Exeter NHS Foundation Trust, November 2018, personal communication).
In the primary analyses, however, wastage of £370 per patient-year was incorporated (see Table 27). This estimate was based on a survey conducted at the Royal Devon & Exeter NHS Foundation Trust (Dr Haigh, Royal Devon and Exeter NHS Foundation Trust, Exeter, December 2018, personal communication) and was derived from data on 119 people with RA who were treated with biologics, and included missed doses and oversupply (defined as delivery of treatment even if > 4 weeks’ supply was available at home). It was assumed that, on average, £370 per patient-year would be wasted in people who were on the full dose of ADL, whereas in people who were on tapered doses wastage would be reduced in proportion to the reduction in treatment dose. In scenario analyses considering other biologics (see Exploratory analyses: etanercept or infliximab and Promonitor), the treatment wastage was also assumed to be proportional to the drug acquisition cost. The effect of a no-wastage assumption on the outcomes was explored in sensitivity analyses (see Table 39).
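The proportionality assumptions for wastage can be sketched as follows. Scaling wastage for other biologics by the ratio of their annual acquisition cost to that of ADL is one reading of ‘proportional to the drug acquisition cost’ and is an assumption here, as are the function names and the example cost for another biologic.

```python
FULL_DOSE_WASTAGE_ADL = 370.0  # GBP per patient-year, from the survey described in the text
ADL_ANNUAL_COST = 9187.0       # GBP per patient-year, full dose of ADL (Humira)


def wastage_for_dose(dose_fraction, full_dose_wastage=FULL_DOSE_WASTAGE_ADL):
    """Annual wastage reduced in proportion to the treatment dose."""
    return full_dose_wastage * dose_fraction


def wastage_for_other_biologic(annual_cost, reference_cost=ADL_ANNUAL_COST,
                               reference_wastage=FULL_DOSE_WASTAGE_ADL):
    """Annual wastage scaled by acquisition cost relative to ADL (assumed reading)."""
    return reference_wastage * annual_cost / reference_cost


print(f"ADL tapered to two-thirds of the dose: £{wastage_for_dose(2 / 3):.2f}")
# Hypothetical annual cost for another biologic, for illustration only.
print(f"Biologic costing £4,500 per year: £{wastage_for_other_biologic(4500.0):.2f}")
```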
Drug administration
As stated above, ADL, ETN, GLM and CTZ are usually self-administered via subcutaneous injection using a prefilled pen. In this scenario, there is no administration cost for delivery. Alternatively, these drugs may be administered by a district nurse. The average administration cost assumed in TA375,23 which was based on an estimate reported in TA247,88 was £2.61 (cost year 2012). Given that this cost is quite low and that self-administration of the drugs listed above is very common in clinical practice in England, the effect of the assumption that subcutaneous administration would be performed by a nurse was not evaluated.
The administration cost for IFX is considerably higher as it is administered intravenously over a 2-hour period. Patients may be pretreated with, for example, antihistamine, hydrocortisone and/or paracetamol, and the infusion rate may be slowed in order to decrease the risk of infusion-related reactions, especially if such reactions have occurred previously.89 Patients are observed for at least 1–2 hours post infusion for acute infusion-related reactions. Based on clinical advice, IFX is typically administered in outpatient settings.
In DG22,90 the administration cost for IFX was estimated to be £287.93 per infusion (2014 prices). In a more recent technology appraisal, TA329,91 the cost was estimated to be £297 per administration (2015 prices).
Grant Smith (Specialist Pharmacist, Royal Devon & Exeter NHS Foundation Trust, December 2018, personal communication) advised us that in the Royal Devon & Exeter NHS Foundation Trust the cost of IFX administration is based on Healthcare Resource Groups (HRGs) for inflammatory bowel disease without interventions, with complications and comorbidities scores depending on patient type. The relevant HRGs from the NHS Reference Costs (2017–18)84 are shown in Table 29.
The weighted-average administration cost of £283 per administration (estimated across the unit costs for the HRG codes presented in Table 29) was adopted in scenario analyses considering people with RA who were treated with IFX (see Table 41).
Cost of testing
The costs of testing comprised the cost of the test kits, the staff time to perform a test, the cost of the testing service and the cost of sample transport. Based on the information provided by the companies and on clinical opinion, it was anticipated that minimal additional training would be required by health-care staff to use any of the testing kits that were considered in this assessment. Therefore, training costs were assumed to be negligible and were not considered in the model.
Dr McDonald advised us that laboratories that conduct TNF-α testing have previously negotiated arrangements with the manufacturers of bDMARDs to cover the cost of biological monitoring, including assays and personnel costs (Dr McDonald, Royal Devon & Exeter NHS Foundation Trust, Exeter, December 2018, personal communication). However, based on advice from Dr Jani (University of Manchester, November 2018, personal communication), that might vary by geographical area and may be relevant to certain biologics only (e.g. newer biosimilars).
Assay costs provided by the manufacturers
The costs of reflex and concurrent testing for each assay were derived from information request documents submitted by the manufacturers of the test kits (see Appendix 15, Table 59).
Processing costs
In addition to assay costs, the cost of testing also includes processing costs, such as administration and laboratory personnel time; these costs were reported by Jani et al.37 (see Appendix 16). In this study, the cost of concurrent testing of drug and antibody levels in patients who were treated with ADL and tested using Promonitor kits was estimated. The study was an audit of practice in north-west England in which the direct medical costs associated with providing the test were estimated from the NHS perspective. The costs were determined from the point of a patient who was established on treatment (for ≥ 3 months) presenting to a clinic, to the results being fed back to the clinician to inform a treatment decision.
Jani et al.37 assumed that during the pre-testing phase (see phase 1 in Appendix 16, Table 60), one outpatient appointment with a consultant rheumatologist is required to discuss the need for testing, followed by an appointment with a phlebotomist or a clinical support worker to obtain trough blood levels. This study reported that additional costs that were associated with laboratory personnel time processing samples would be incurred during the testing phase (see phase 2 in Appendix 16, Table 60). However, it was assumed that most hospital laboratories would have the necessary room requirements and would stock standard equipment that was needed to perform ELISA, and the following items of resource use were, therefore, excluded:
- equipment costs of centrifuge systems
- ELISA readers
- pipettes
- personal protective equipment
- phlebotomy equipment costs
- overhead costs
- capital costs.
In addition, the treatment decision stage (see phase 3 in Appendix 16, Table 60) would require interpretation of results by a consultant rheumatologist, discussion of the results with patients via a telephone call and, finally, a letter outlining the results and treatment decision.
The mean cost per patient per test reported in Jani et al.37 was £152.52 (2015 prices) if 40 samples were tested simultaneously; this included the cost of the test kits. The pre-testing phase incurred the highest costs, which were driven by the cost of a phlebotomy appointment to acquire trough blood samples, which constituted 67% of the total cost; labour accounted for 10% and consumables for 23% of the total cost.
Cost of sample transport
One of the minor cost components that was considered by Jani et al.37 was ‘transport, receipt and storage of sample’, which was £2.22 (2015 prices) per batch of 40 samples (see Table 60).
Blood samples are received at the Exeter Clinical Laboratory (Royal Devon & Exeter NHS Foundation Trust) as small parcels via Royal Mail (London, UK), and it is extremely unlikely that samples would be sent to Sanquin Diagnostic Services in the Netherlands, as the transportation cost would be higher than that within the UK; moreover, sending samples abroad would lead to a longer turnaround time and take expertise out of the NHS (Dr McDonald, December 2018, personal communication).
According to the Royal Mail,92 postage costs are £4 per parcel shipped within the UK and £10 per parcel shipped to Sanquin Diagnostic Services. Based on clinical advice, it was assumed that blood samples would be posted to a laboratory within the UK and, therefore, the postage of £4 per parcel was applied.
Frequency of testing
Rosas et al.93 reported the total number of drug and anti-drug antibody monitoring tests in RA patients who were in remission over a 2-year period (94 tests in 45 patients), which is approximately one test per patient per year. Dr Jani confirmed that in England TNF-α testing would be conducted once per year in people who are in remission/under routine follow-up; however, if tapering is performed based on drug level, a clinician would typically check the drug level at least every 6 months to ensure that the level has not dropped too low. Therefore, in the primary analyses, one TNF-α test per patient per year was assumed, while 6-monthly testing was modelled in sensitivity analyses (see Table 39).
Reflex versus concurrent testing
Dr McDonald (Exeter Clinical Laboratory, Royal Devon & Exeter NHS Foundation Trust, November 2018, personal communication) advised that TNF-α testing for drug and antibody levels is usually carried out concurrently; blood samples that are sent to the Exeter Clinical Laboratory are kept frozen for 1 month, and the likelihood of performing antibody testing (‘reflex testing’) 1 month after testing the trough level is extremely low.
In the unlikely scenario in which reflex testing is performed, an additional phlebotomy appointment would not be required (assuming that storage of blood samples is a common practice at test laboratories). Hence, the cost difference between reflex and concurrent testing would be defined by the proportion of patients with low or undetectable drug levels (for whom antibody testing would be requested), and the cost of telephone calls to the laboratory to request antibody testing. To estimate the cost difference between reflex and concurrent testing, the proportion of people with low drug levels was derived from Chen et al.94 and Laine et al.54
The authors of the former study94 investigated the impact of ADL dose-halving on therapeutic responses and drug levels in people with RA. Trough serum ADL levels were determined at baseline and at week 24 of dose-halving therapy using a sandwich ELISA (Progenika Biopharma). The minimal detectable ADL level was 0.002 µg/ml. In this study, 3 out of 64 (4.7%) participants who developed ADL antibodies at week 24 of dose-halving had very low drug levels. In these participants, trough ADL levels markedly declined to very low levels (from 2.28 µg/ml, 1.92 µg/ml and 2.21 µg/ml at baseline to, respectively, 0.024 µg/ml, 0.024 µg/ml and 0.004 µg/ml at week 24 of dose-halving).
Laine et al.54 reported low drug levels (< 5 µg/ml) in 35.8% of people with RA who were treated with ADL from the clinical sample registry of United Medix Laboratories Ltd (Helsinki, Finland). All of the samples included in the database had been sent to the laboratory on a clinical basis (i.e. none of the samples was from clinical studies). Drug levels were measured by Sanquin Diagnostic Services.
However, there is no universal agreement on what to consider a low drug level in people with RA who are treated with biologics (Dr McDonald, personal communication). Therefore, estimates for the proportion of people with low drug levels of 4.7%94 and 35.8%54 were adopted as the lower and upper bounds in scenario analyses for reflex testing.
In Jani et al.,37 a telephone call to discuss a treatment decision with a patient was assumed to take, on average, 5.3 minutes at a cost of £3.47. Dr McDonald (personal communication) confirmed that this would also be a reasonable cost estimate for a telephone call to a laboratory to request additional testing on stored blood.
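The cost logic for reflex versus concurrent testing described above can be sketched as follows. This is a minimal illustration, not the EAG's implementation: the function names are invented here, and the assay costs of £20 each are hypothetical placeholders because the manufacturers' prices (Appendix 15) are not reproduced in this section. The telephone-call cost and the 4.7% and 35.8% bounds come from the text. Costs common to both strategies, such as phlebotomy, are left out.

```python
PHONE_CALL_GBP = 3.47  # call to the laboratory (Jani et al., from the text)


def concurrent_cost(drug_assay: float, antibody_assay: float) -> float:
    """Every patient has both drug and antibody levels measured."""
    return drug_assay + antibody_assay


def reflex_cost(drug_assay: float, antibody_assay: float, p_low: float,
                phone_call: float = PHONE_CALL_GBP) -> float:
    """Every patient has a drug-level test; only the share `p_low` with low
    drug levels goes on to antibody testing, requested by telephone."""
    if not 0.0 <= p_low <= 1.0:
        raise ValueError("p_low must be a proportion between 0 and 1")
    return drug_assay + p_low * (antibody_assay + phone_call)


if __name__ == "__main__":
    DRUG, ANTIBODY = 20.0, 20.0  # HYPOTHETICAL assay costs per test (GBP)
    print(f"concurrent: £{concurrent_cost(DRUG, ANTIBODY):.2f}")
    for p_low in (0.047, 0.358):  # lower and upper bounds from the text
        print(f"reflex, p_low={p_low}: £{reflex_cost(DRUG, ANTIBODY, p_low):.2f}")
```

The sketch makes explicit that the saving from reflex testing shrinks as the proportion of patients with low drug levels grows.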
Single versus duplicate testing
The costs of carrying out ELISA using Promonitor kits are shown in Appendix 17, Table 61. The estimates were derived assuming single or duplicate, reflex or concurrent testing with or without a phlebotomy appointment.
Single testing incurs a lower cost than duplicate testing, but it is less precise. Therefore, duplicate testing was selected in the base-case analysis that was conducted by Jani et al.;37 however, single testing is more common in the UK (Dr McDonald, personal communication). For this reason, single testing was adopted in the primary analyses and duplicate testing was modelled in scenarios.
In the primary analyses, the cost of concurrent testing using Promonitor test kits was calculated following Jani et al.,37 that is assuming that a phlebotomy appointment to collect a trough sample would be needed (see Table 61). Scenario analyses excluding this cost were also conducted.
Cost of managing different health states
Based on published literature, active disease in people with RA is more costly to manage than disease in people in remission or LDA. The major health-care costs (apart from drug acquisition costs) relate to joint replacement surgeries, hospital stays and doctor appointments.82
A range of classification systems and scales have been developed to measure and monitor disease activity in patients with RA, and scales commonly used to measure other domains, such as disability or activity level (e.g. HAQ), are also administered.26 Functional capacity measured with the HAQ was found to be the strongest predictor of costs.95 Therefore, direct medical costs for hospitalisations, joint replacements and the number of outpatient visits were included by HAQ-dependency, as explained below.
Resource utilisation in rheumatoid arthritis patients stratified by HAQ score
Barbieri et al.82 reported resource utilisation in people with RA treated with IFX, stratified by four HAQ bands (see Table 62, Appendix 18). These estimates were used by the authors to calculate the costs of managing people with RA beyond the first year of therapy, and were based on data from the Norfolk Arthritis Register (NOAR). The NOAR cohort includes 1236 adults who had swelling of at least two joints that had persisted for > 4 weeks. This study reported that, on average, the number of outpatient visits, hospital days and the proportion of patients undergoing joint replacement surgery increased substantially with HAQ score (see Table 62, Appendix 18).
The average costs of an inpatient day, an outpatient appointment and joint replacement surgery, derived from the relevant HRG codes from the NHS Reference Costs (2017–18),84 are shown in Table 30 (the derivation of the cost of surgery for RA is explained in Cost of joint replacement surgery).
Mean HAQ scores for different levels of disease activity (remission, LDA, MDA and HDA) in people with RA were estimated by Radner et al.83 (Table 31): based on the SDAI, the mean HAQ score was 0.39 for remission and 0.72 for LDA, and MDA and HDA combined were characterised by a mean HAQ score of 1.24.
Using this classification and the cost estimates shown in Appendix 18, Table 62, the costs for managing remission (for scenario 1) and active disease (for scenario 2) were calculated from the corresponding probability density functions for HAQ scores weighted by the health management costs for different HAQ scores, whereas the costs of managing mixed-health states (LDA/active disease and remission/LDA) were derived from joint probability density functions for the relevant HAQ scores (see Appendix 18). The resulting average annual costs for managing remission, remission/LDA, LDA/active disease and active disease health states in people with RA were £902, £1089, £1483 and £1827, respectively.
Cost of joint replacement surgery
The weighted average cost of joint replacement surgery was £5222 per surgery,84 which was estimated from HRGs relevant to hip and knee procedures for non-trauma across all clinical codes (HN12–HN14 and HN22–HN24, respectively).
Burn et al.96 investigated hospital reimbursement for total knee replacement (TKR) and total hip replacement (THR) surgeries in NHS England between 1997 and 2014. Primary reimbursement for TKR and THR was approximately £6000 per surgery (2016/17 prices), whereas revision surgeries were approximately £8000 per surgery. These estimates were derived from the NHS primary care records of 21,128 people with osteoarthritis or RA. The authors reported on the downward trend in the costs of TKR and THR.
The average cost of joint replacement surgery in people with RA in the Royal Devon & Exeter NHS Foundation Trust is £5061.80 (standard error £5153) (see Appendix 19). This estimate was based on 15 surgeries that were conducted between April 2017 and September 2018. Of note, this estimate is slightly lower than those from the NHS Reference Costs (2017–18)84 (see Table 30) and from Burn et al.96 This might be a result of the downward trend in the cost of surgery reported by Burn et al.96 However, the sample size was very small and, therefore, this estimate may not be representative of the average cost of surgery in the RA patient population in the UK.
In all analyses, the annual costs of managing different health states were derived from the average cost of joint replacement surgery based on the HRGs from the NHS Reference Costs (2017–18)84 (£5222 per joint replacement surgery; see Table 30).
In the analyses presented here, it was assumed, based on clinical advice, that surgery may be performed anywhere in the treatment pathway; however, the EAG is aware that older people are more likely to require surgery for RA.
Cost of managing flares
The cost of managing flares is another important consideration that needs to be parameterised in the model. A study by Maravic et al.80 estimated the costs associated with managing flares in people with RA in a French setting. This study focused on investigational costs and treatment costs; rheumatology appointments were not considered (see Table 64, Appendix 20).
The costs of diagnostic investigations per flare and the monthly cost of treatment (excluding bDMARDs)80 were converted to GBP based on PPP and inflated to 2017–18 prices using the HCHS pay and price index, resulting in costs of £423 and £68 for diagnostic investigations (per flare) and monthly treatment, respectively (see Table 27).
Cost of managing adverse events
In TA375,23 the weighted average cost of serious infection in RA patients was estimated to be £1479, based on relevant NHS costs97 and weighted by inpatient activity. Conservatively, HRG costs without complications and comorbidities were used. This cost, inflated to 2017–18 prices using the HCHS pay and price index (£1622 per infection), was assumed in all analyses (see Table 27).
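The uprating step can be illustrated in Python. The HCHS index values themselves are not stated in this section, so the sketch below back-calculates the uprating factor implied by the two figures quoted above (£1479 and £1622); the function names are illustrative and the factor is an inference, not a published index ratio.

```python
def implied_index_ratio(cost_base_year: float, cost_target_year: float) -> float:
    """Uprating factor implied by a cost quoted in two price years."""
    return cost_target_year / cost_base_year


def inflate(cost: float, index_ratio: float) -> float:
    """Uprate a cost by the ratio of target-year to base-year price index."""
    return cost * index_ratio


ratio = implied_index_ratio(1479.0, 1622.0)  # both figures from the text
print(f"implied uprating factor: {ratio:.3f}")
print(f"uprated cost: £{inflate(1479.0, ratio):.0f} per infection")
```

In the model itself the ratio would be taken directly from the published index for the relevant years rather than inferred in this way.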
Health-related quality of life
A review of HRQoL studies was conducted to inform the selection of utilities for the economic analysis. Health-state utilities, as well as disutilities for flares and SAEs (such as severe infections), that were used in the analyses are described below.
Health state utility values
The abstracts reporting the INGEBIO study provided results on the average duration of either remission42 or remission/LDA43 in both the intervention and the control arms. However, none of the sources contain any definitions of remission.
A definition of remission was provided in Krieckaert et al.53 In this study, health states were based on the categorisation of DAS28 as below:
- remission – DAS28 of < 2.6
- LDA – 2.6 ≤ DAS28 < 3.2
- MDA – 3.2 ≤ DAS28 ≤ 5.1
- HDA – DAS28 of > 5.1.
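The categorisation above amounts to a simple threshold rule, sketched here in Python. The function name is illustrative; the cut-off values and their inclusive or exclusive boundaries follow the list as given.

```python
def das28_state(das28: float) -> str:
    """Map a DAS28 score to a health state using the cut-offs listed above."""
    if das28 < 2.6:
        return "remission"
    if das28 < 3.2:        # 2.6 <= DAS28 < 3.2
        return "LDA"
    if das28 <= 5.1:       # 3.2 <= DAS28 <= 5.1
        return "MDA"
    return "HDA"           # DAS28 > 5.1


if __name__ == "__main__":
    for score in (2.1, 2.6, 3.2, 5.1, 5.4):
        print(score, das28_state(score))
```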
The DAS28 comprises four components: counts of tender joints and counts of swollen joints (both performed by a clinician), the visual analogue scale (VAS) score of the patient’s global health and the laboratory parameter ESR. It has been shown, however, that CRP is more accurate as an indicator of inflammation than ESR, and it is also more sensitive to short-term changes.98 A modification of the DAS28, the DAS28-CRP,99 that includes the level of CRP instead of ESR was used in Bykerk et al.70 to define the disease activity types below:
- severe – DAS28-CRP of > 5.1
- moderate – 3.2 ≤ DAS28-CRP ≤ 5.1
- low – 2.6 ≤ DAS28-CRP < 3.2
- remission – DAS28-CRP of < 2.6.
In the study conducted by Bartelds et al.,55 remission was defined as a DAS28 of < 2.6 at all consecutive measurements after a certain time point, with a minimum of two scores of < 2.6 in the case of participants who discontinued treatment prematurely.
In Barnabe et al.,100 sustained remission was defined as DAS28 of ≤ 2.6 for more than 1 year, whereas non-sustained remission was defined as a DAS28 of ≤ 2.6 for less than 1 year.
In TA375,23 non-responders, moderate responders and good responders were defined according to the EULAR response criteria (see Table 3).
Health state utility value estimated from the Health Assessment Questionnaire according to Simplified Disease Activity Index, Clinical Disease Activity Index and Disease Activity Score in 28 joints
There are several composite scores to assess disease activity in RA. The definitions of the disease states (i.e. remission, LDA, MDA and HDA) according to the SDAI, the CDAI and the DAS28 from Aletaha et al.101 are presented in Table 32.
Radner et al.,83 at the Medical University of Vienna in Austria, collected data on clinical and laboratory characteristics (including CRP, ESR, the number of swollen and tender joints, pain by VAS, patient’s global assessment of disease activity, evaluator’s global assessment of disease activity and physical function by HAQ) from 356 consecutive people with RA at routine clinic visits (every 3–4 months). In total, 716 visits were documented, with a median of two clinic visits per person (ranging from one to four clinic visits).83 At baseline, 87 participants (24.4%) were in remission, 150 (42.1%) in LDA, 103 (28.9%) in MDA and 16 (4.5%) in HDA, as defined according to SDAI. Owing to the small number of participants in the HDA group, the MDA and HDA groups were combined in further analyses. The differences in functional disability measured by the HAQ scores at three levels of disease activity were evident, and similar conclusions were reached during a sensitivity analysis, when the disease states were assessed according to CDAI and DAS28 (see Table 31). Unless stated otherwise, in the remainder of this article, health states are assumed to be defined by SDAI.
The EAG is aware of several algorithms for converting the HAQ score to utility in RA, and that the estimates of utilities may vary when different mapping algorithms are used.102 In TA375,23 a comparison of published relationships between utility and HAQ was conducted (see figure 115 in Stevenson et al.34). Three of the eight compared studies reported data from the UK. Of these three studies, Bansback et al.103 included data for UK and Canadian patients, and Kobelt et al.104 included data for patients in the UK and Sweden; therefore, these were not considered relevant for the purposes of this analysis. Hurst et al.105 included people with RA in Scotland only. Malottki et al.59 used the data set from Hurst et al.105 to estimate the coefficients of their mapping equation; therefore, there is little difference between their estimates, despite different algorithms being used.
Throughout this monograph, the EQ-5D utility values were mapped from the HAQ scores using the same formula as in Malottki et al.:59
$$\text{EQ-5D} = a - b_1 \times \text{HAQ} - b_2 \times \text{HAQ}^2,$$
where $a = 0.804$, $b_1 = 0.203$ and $b_2 = 0.045$.
Hernández Alava et al.106 argued that pain should be included as an explanatory variable when estimating QALYs from HAQ scores in people with RA. This approach was used in TA375.23 However, the estimates presented in this article were obtained without pain scores because the EAG did not have access to patient-level data. Table 33 presents the EQ-5D utility values mapped from the HAQ scores at three levels of disease activity from Radner et al.83
Ucar et al.42 reported the mean duration of remission in the intervention and control arms. In the economic analysis based on this source, the EQ-5D utility value for remission, 0.718, was applied. The utility value of 0.568 used for a mixed-disease state (LDA/active disease) was approximated by the average of the estimates for LDA and MDA/HDA weighted by the proportion of patients in each health state from Radner et al.83 Of note, when the weighted average of HAQ scores was computed instead and mapped to EQ-5D, the utility value was very similar, 0.571.
As the health states in Arango et al.43 (remission/LDA and active disease) were defined differently from those in Ucar et al.,42 in analyses based on the former source,43 the EQ-5D utility score of 0.483 for MDA/HDA was used as the utility value for active disease health state, and the weighted average of the estimates for remission and LDA, 0.665, was used to approximate the utility value for the mixed-health state. When the alternative approach (described above) was used, the resulting utility value was 0.666.
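The mapping and the two averaging approaches can be checked with a short Python sketch. It assumes that the mapping takes the quadratic form EQ-5D = a − b1 × HAQ − b2 × HAQ² with the coefficients quoted above, and that the weights are the baseline patient counts from Radner et al. (87 in remission, 150 in LDA, 103 + 16 in MDA/HDA). The function and variable names are illustrative.

```python
A, B1, B2 = 0.804, 0.203, 0.045  # coefficients quoted in the text


def haq_to_eq5d(haq: float) -> float:
    """Assumed quadratic mapping: a - b1*HAQ - b2*HAQ^2."""
    return A - B1 * haq - B2 * haq ** 2


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Mean HAQ by SDAI state and baseline patient counts (Radner et al., from the text)
HAQ = {"remission": 0.39, "LDA": 0.72, "MDA/HDA": 1.24}
N = {"remission": 87, "LDA": 150, "MDA/HDA": 103 + 16}

utility = {state: haq_to_eq5d(h) for state, h in HAQ.items()}

# Approach 1: map each state to a utility, then average by patient numbers
lda_active = weighted_mean([utility["LDA"], utility["MDA/HDA"]],
                           [N["LDA"], N["MDA/HDA"]])
rem_lda = weighted_mean([utility["remission"], utility["LDA"]],
                        [N["remission"], N["LDA"]])

# Approach 2: average the HAQ scores first, then map to EQ-5D
lda_active_alt = haq_to_eq5d(weighted_mean([HAQ["LDA"], HAQ["MDA/HDA"]],
                                           [N["LDA"], N["MDA/HDA"]]))
rem_lda_alt = haq_to_eq5d(weighted_mean([HAQ["remission"], HAQ["LDA"]],
                                        [N["remission"], N["LDA"]]))

print(round(utility["remission"], 3), round(utility["MDA/HDA"], 3))
print(round(lda_active, 3), round(lda_active_alt, 3))
print(round(rem_lda, 3), round(rem_lda_alt, 3))
```

Under these assumptions the sketch reproduces the values reported in the text: 0.718 and 0.483 for remission and MDA/HDA, 0.568 and 0.571 for LDA/active disease, and 0.665 and 0.666 for remission/LDA.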
Health state utility values (HSUVs) obtained from HAQ scores reported in Stevenson et al.34 (as described in the following section) were assumed in scenario analyses.
Health state utility values estimated from the Health Assessment Questionnaire by European League Against Rheumatism response category
In TA375,23 the model was based on a EULAR response category (good/moderate/none) to be consistent with the NICE guidance on biologics in RA36 and to align more closely with UK clinical practice in terms of the assessment of response to therapies. The HAQ scores were estimated from the BSRBR-RA database,67 which contains values measured at 6-month intervals for up to 3 years for all people with RA on the register. In TA375,23 the analysis was restricted to those with the full set of baseline characteristics and at least two additional HAQ measurements while on bDMARDs. The database included data from 10,186 patients. Of these, 2417, 5492 and 2277 were classed as EULAR good responders, moderate responders and non-responders, respectively (see Table 3).
Figure 6 shows the HAQ trajectory in people with RA treated with bDMARDs. It was observed that the mean HAQ scores for patients with good, moderate or no response (according to the EULAR response criteria shown in Table 3) decreased during the first 6 months after the start of biological therapy, with the magnitude of the decrease growing with the level of EULAR response, then stabilised at around 6 months and remained largely flat over the remaining 2.5 years of measurement.
The HAQ scores that were measured after 6 months of therapy with biologics for all three categories of responders were mapped to EQ-5D utilities, which elicited the values shown in Table 34.
The utility for the remission health state was based on the utility value for good responders (0.496, see Table 34), whereas the utility for the LDA/active disease health state was estimated as the average of utility values for moderate responders and non-responders weighted by their proportions in the BSRBR-RA database, resulting in the utility value of 0.302. These HSUVs were used in sensitivity analyses.
Disutility of flare
The values of utility losses owing to flares were obtained from the Dutch multicentre clinical study ‘BeSt’,72 which involved 508 participants who were treated to target for 10 years to achieve a DAS28 of, at most, 2.4 (follow-up data that suffice to establish presence or absence of a flare during at least a single visit were available for only 480 patients).72 The BeSt study72 considered three types of flares, which were named as ‘A’, ‘minor B’ and ‘major B’ (where ‘major B’ is a subcategory of ‘A’), with the number of occurrences of each (observed during a total of 11,458 rheumatology visits of all patients) shown in Appendix 21, Figure 10, and the definitions, frequencies and HAQ scores of each described in Table 35. The mean HAQ score of patients with no flare at a visit was estimated as 0.53 (SD 0.56).
Functional mobility of patients with these types of flares was measured using HAQ scores (mean and SD values are also included in Table 35). The disutility of each type of flare was computed as the difference between the mapped HSUVs of patients with that type of flare and the mapped HSUVs of patients in the absence of flares. The estimated disutility values are shown in Table 35.
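The calculation of a flare disutility from HAQ scores can be sketched as below. Two assumptions are made here: that the HAQ-to-utility mapping is the quadratic form with coefficients 0.804, 0.203 and 0.045 used elsewhere in this chapter, and that the flare HAQ score of 0.80 is a purely hypothetical input, because the Table 35 values are not reproduced in this section. Only the no-flare mean HAQ of 0.53 comes from the text.

```python
A, B1, B2 = 0.804, 0.203, 0.045  # mapping coefficients quoted in this chapter
NO_FLARE_HAQ = 0.53              # mean HAQ with no flare at a visit (from the text)


def haq_to_eq5d(haq: float) -> float:
    """Assumed quadratic mapping: a - b1*HAQ - b2*HAQ^2."""
    return A - B1 * haq - B2 * haq ** 2


def flare_disutility(flare_haq: float, no_flare_haq: float = NO_FLARE_HAQ) -> float:
    """Utility lost during a flare relative to the no-flare state."""
    return haq_to_eq5d(no_flare_haq) - haq_to_eq5d(flare_haq)


if __name__ == "__main__":
    hypothetical_flare_haq = 0.80  # HYPOTHETICAL; see Table 35 for the real values
    print(f"disutility: {flare_disutility(hypothetical_flare_haq):.3f}")
```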
Disutility of serious adverse events
People with RA have increased susceptibility to serious infections owing to the features of RA, comorbidity and immunosuppressive treatment.107 It has been shown that TNF-α inhibitors increase the risk of serious infection up to two-fold.108 The EuroQol-5 Dimensions three-level version (EQ-5D-3L) disutility value for England of 0.156 over 4 weeks (equivalent to the loss of QALYs of 0.012) that is associated with severe infections was reported in the observational study ‘Genomics to combat Resistance against Antibiotics in Community-acquired lower respiratory tract infections (LRTI) in Europe’ (GRACE) of the management of patients with acute cough/LRTI in primary care.85 Data were collected in 13 European countries (including England and Wales) from adults (aged ≥ 18 years) who reported to their primary care clinicians with cough and LRTI.85 EQ-5D-3L scores were generated using the country-specific UK value set, in which the original data were collected from non-institutionalised adults in England, Scotland and Wales (with a total of 2997 participants) between August and December 1993.
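The conversion of the 4-week disutility into a QALY loss is a duration-weighted calculation, sketched here. The function name and the 52-week year are assumptions made for illustration; the disutility and duration are from the text.

```python
def qaly_loss(disutility: float, duration_weeks: float,
              weeks_per_year: float = 52.0) -> float:
    """QALYs lost when `disutility` applies for `duration_weeks`."""
    return disutility * duration_weeks / weeks_per_year


# Severe infection: disutility of 0.156 lasting 4 weeks (from the text)
print(round(qaly_loss(0.156, 4), 3))
```

This reproduces the QALY loss of 0.012 quoted above.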
The effect of SAEs on costs and QALYs was modelled in the primary analyses. It should be noted, however, that in the analyses, assuming that TDM affects the duration of remission/LDA and the rates of flares and AEs, there is a risk of double-counting the effect of flares and AEs on HRQoL given that it is possible that the disutilities have already been incorporated into health-state utility values.
Consistency between utility values
As shown in Gülfe et al.,109 there may be discrepancies between utility values that are measured in different countries (in our case Spain, Austria, the Netherlands and the UK), which may arise from differences between the preference sets for those countries. Figure 11 (see Appendix 21) shows EQ-5D-3L scores obtained using British and Swedish preference sets for people with established RA being treated with TNF-α inhibitors.
The population considered in the INGEBIO study was mixed. This trial recruited 169 people, 63 with RA (37.3%), 54 with PsA (32%) and 52 with AS (30.8%). Gülfe et al.110 also studied a mixed population, with two (RA and PsA) out of the three diseases the same as in the INGEBIO study; the third disease was SpA, which is usually considered as a phenotypically heterogeneous disease with PsA and AS as its best-studied manifestations.111 One of the aims of this study110 was to analyse trends in health utilities in people diagnosed with three types of arthritis: 2554 people with RA (who constituted 68.8% of the total population), 574 with PsA (15.5%) and 586 with SpA (15.8%), who started treatment with TNF-α inhibitors. Data for the period from May 2002 to December 2008 were obtained from the Southern Sweden Arthritis Treatment Group register, which was set up in 2002 and collects health utility data from routine clinical follow-up. Treatment courses are classified as first, second or third or further TNF-α inhibitor. Among the three subpopulations, people with RA were typically older, had tried more DMARDs, were more often treated with a concomitant DMARD and were more often female than the other populations. Figure 12 (see Appendix 21) shows similar response patterns in people with RA, PsA and SpA at 6 months after the start of the first TNF-α inhibitor treatment course.110
In Arango et al.,43 19 patients who discontinued treatment were excluded from the analysis, although those patients were included in the ITT analysis reported in Ucar et al.42 As shown in Gülfe et al.,110 RA patients who terminated therapy for any reason had demonstrated lower utility gain by the time of withdrawal, which is illustrated in Appendix 21, Figure 13.
Although the use of all available data increases the generalisability of the study, it may also lead to lower utility estimates than when using data for only those participants for whom complete follow-up information is available (see Figure 14), as incomplete records may be a result of, for example, withdrawals from treatment owing to adverse effects of the intervention.
Of note, the utility values reported in this section were not used in the economic analyses.
Mortality
Although there is evidence of an association between HAQ improvement and reduced mortality risk, the impact of TNF-α testing on mortality was not considered owing to the short-term time horizon adopted in this study and the relatively small difference in the mean duration of remission42 and remission/LDA43 across the treatment arms in INGEBIO.
Checking the model for wiring errors
The model written in Microsoft Excel® 2013 (Microsoft Corporation, Redmond, WA, USA) was checked in the following way: all calculations were performed by one person and were checked by another person.
Results
Primary analyses: adalimumab (Humira) and Promonitor
Threshold analyses
Threshold analyses were conducted for both Ucar et al.42 and Arango et al.43 (Table 36).
The results suggest that, if the outcomes reported in Ucar et al.42 are used, then, at the list price of Humira, the cost of testing per patient would need to be less than £225 per year in order for TDM to be judged as a cost-effective option at the threshold of £20,000 per QALY gained; for the threshold of £30,000 per QALY gained, the cost of testing should be below £274 per patient-year. For the lower bound, with the annual acquisition cost of £1000 per patient-year, the corresponding threshold values for the cost of testing were £197 and £246 per patient-year.
For the outcomes reported in Arango et al.43 and at the list price of Humira, the cost of testing should not exceed £18 per year to be considered as cost-effective at the threshold of £20,000 per QALY gained. However, the other threshold values obtained for outcomes reported in this source were negative (see Table 36). This means that, when using the trial results as presented in Arango et al.,43 there is no positive cost of testing at which TDM would be cost-effective at £30,000 per QALY gained, or at either threshold under the lower ADL acquisition cost of £1000 per patient-year.
The qualitatively different results obtained in the threshold analyses can be explained by the difference in the mean duration of remission42 and remission/LDA43 between the control and the intervention arms. As reported in Arango et al.,43 patients from the control group were in remission/LDA for longer, on average, than patients in the intervention group (475.2 days vs. 460.2 days), whereas Ucar et al.42 reported a longer duration of remission in patients in the intervention group than in the control group (344 days vs. 329 days).
The results of the threshold analyses are inconclusive for two reasons: they are inconsistent and they are based on very small and uncertain differences in outcomes, with incremental QALYs of < 0.01.
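The threshold logic described in this section can be illustrated with a minimal sketch. It assumes (this is our simplification, not a description of the EAG's Excel model) that incremental net monetary benefit is linear in the cost of testing, so that the maximum acceptable testing cost equals the willingness-to-pay threshold multiplied by the incremental QALYs, minus the incremental non-testing costs. The function name and all input values are hypothetical; they are chosen only to show why a small QALY gain gives a threshold cost that rises with the willingness-to-pay threshold, whereas a small QALY loss gives one that falls and can become negative.

```python
def max_test_cost(delta_qalys, delta_other_costs, wtp):
    """Largest testing cost per patient-year at which net monetary benefit is >= 0.

    Assumed relationship (illustrative only):
        NMB = wtp * delta_qalys - (delta_other_costs + test_cost)
    Setting NMB = 0 and solving for test_cost gives the value returned here.
    delta_other_costs is the incremental cost excluding testing
    (negative values mean the intervention saves money elsewhere).
    """
    return wtp * delta_qalys - delta_other_costs


# Hypothetical inputs, NOT taken from Table 36.
scenarios = {
    "small QALY gain": {"delta_qalys": 0.005, "delta_other_costs": -100.0},
    "small QALY loss": {"delta_qalys": -0.005, "delta_other_costs": -110.0},
}

for label, inputs in scenarios.items():
    for wtp in (20_000, 30_000):
        threshold = max_test_cost(inputs["delta_qalys"], inputs["delta_other_costs"], wtp)
        # A negative threshold means no positive testing cost is acceptable.
        print(f"{label}, WTP £{wtp:,}: maximum testing cost £{threshold:.0f}")
```

Under these assumptions, the sign of the incremental QALYs alone determines whether a higher willingness-to-pay threshold relaxes or tightens the constraint on the cost of testing, which is one way to read the opposite patterns obtained from the two reports.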
Cost-effectiveness analyses
As in the threshold analyses, economic results were obtained for outcomes from both reports of the INGEBIO study.42,43 The incremental costs and QALYs for testing versus SOC (Table 37) were estimated assuming that:
- patients are treated with Humira and are tested regularly using Promonitor assays
- the frequency of testing is one test per patient per year
- testing of drug and antibody levels is carried out concurrently (single dilution) at a UK laboratory
- the other testing costs are as reported in Jani et al.37
As shown in Table 37, the major cost components in both the intervention and the control arms were the drug acquisition costs and the costs of managing health states, whereas the incremental costs were mostly driven by the cost of the initial phlebotomy appointment and the cost of managing flares. The incremental QALYs, defined primarily by QALYs accrued in different health states, were very small (< 0.01). The ICER in scenario 1 (based on Ucar et al.42) was £5575 per QALY gained, whereas in scenario 2 (based on Arango et al.43) the results suggest that SOC dominated the intervention.
The results of the cost–utility analyses are inconclusive: using data from Ucar et al.42 and Arango et al.43 produced qualitatively different results, which were based on very small and uncertain differences in outcomes (with incremental QALYs of < 0.01).
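The way incremental costs and QALYs translate into an ICER, a dominance statement or a quadrant of the cost-effectiveness plane can be sketched as follows. This is a generic illustration of standard cost-effectiveness conventions, not the EAG's implementation; the function name and the example inputs are hypothetical and are not taken from Table 37.

```python
def classify_icer(delta_cost, delta_qalys):
    """Classify an intervention against its comparator.

    Returns a (label, icer) tuple; icer is None when one option dominates
    or when there is no QALY difference (the ratio is then undefined).
    Ties on cost are treated as (weak) dominance in this sketch.
    """
    if delta_qalys == 0:
        return ("undefined (no QALY difference)", None)
    if delta_cost <= 0 and delta_qalys > 0:
        # No more costly and more effective.
        return ("intervention dominates", None)
    if delta_cost >= 0 and delta_qalys < 0:
        # No less costly and less effective.
        return ("comparator dominates", None)
    icer = delta_cost / delta_qalys
    # North-east: more costly, more effective (cost per QALY gained).
    # South-west: less costly, less effective (saving per QALY forgone).
    quadrant = "north-east" if delta_qalys > 0 else "south-west"
    return (quadrant, icer)


# Hypothetical incremental results (intervention minus comparator).
for delta_cost, delta_qalys in [(25.0, 0.005), (30.0, -0.004), (-80.0, -0.004)]:
    label, icer = classify_icer(delta_cost, delta_qalys)
    shown = "n/a" if icer is None else f"£{icer:,.0f} per QALY"
    print(f"dC = {delta_cost:+.0f}, dQ = {delta_qalys:+.3f}: {label}, ICER {shown}")
```

With incremental QALYs as small as those reported here, a small change in the numerator or denominator can move a result from one category to another, which is why the sketch returns the quadrant alongside the ratio.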
Sensitivity analyses: adalimumab (Humira) and Promonitor
A number of sensitivity analyses were undertaken to explore the impact of parametric and structural uncertainty on the cost-effectiveness outcomes reported in Table 37.
One-way deterministic sensitivity analyses
Uncertainty in some of the parameters that were used to estimate the ICERs in scenario 1 and scenario 2 (detailed in Table 37) was evaluated in one-way deterministic sensitivity analyses (Table 38).
In the analysis, assuming a 20% increase and 20% decrease in the proportion of patients on tapered doses in the intervention and control arms, respectively, the intervention was less costly and less effective, with the ICER of £28,570 per QALY gained located in the south-west quadrant of the cost-effectiveness plane.
Reducing the flare rate in the intervention arm by 20% and increasing it by the same amount in the control arm resulted in negative incremental costs and QALYs (see Table 38), with an ICER of £15,867 per QALY gained.
When the costs of managing health states were reduced by 20%, SOC was dominant.
The same outcome was obtained when the time in remission/LDA in the intervention and control arms was varied by +10% and –10% of the differential time in remission/LDA across the treatment arms, respectively.
Probabilistic sensitivity analyses
Probabilistic sensitivity analyses were not conducted because of time constraints and the lack of clarity as to which model assumptions would be most relevant to the NHS, owing to a substantial variation in clinical practice with respect to disease management in people with RA, as well as uncertainty in the TNF-α testing strategies (given that therapeutic monitoring for RA is not currently part of NHS practice). These variations were explored in numerous clinically relevant scenario analyses detailed in the following sections.
Scenario analyses
Impact of therapeutic drug monitoring on flare rate only
In the sensitivity analyses, assuming that TNF-α monitoring affects only the rate of flares in patients treated with biologics (as in Gavan17), the ICERs in scenario 1 (Ucar et al.42) and scenario 2 (Arango et al.43) were £95,070 and £29,599 per QALY gained, respectively (see Table 39).
When this assumption was implemented in exploratory analyses for the other TNF-α inhibitors (see Exploratory analyses: etanercept or infliximab and Promonitor and Table 41), ICERs were either very close to £30,000 per QALY gained or well above this cost-effectiveness threshold.
Impact of the cost of the initial phlebotomy appointment
Scenario analyses were conducted that assumed that trough samples are taken at the time of existing doctor appointments (i.e. a phlebotomy appointment would not be required). The costs for reflex or concurrent, single or duplicate testing implemented in these analyses are shown in Table 61. In scenarios with reflex testing, it was assumed that the proportion of patients who would need to undergo antibody testing was either 4.7% or 35.8% (see Reflex versus concurrent testing).
When the cost of phlebotomy appointments was implemented together with the other assumptions on testing (as described above), the ICERs were under £20,000 per QALY gained in all analyses for Ucar et al.,42 whereas SOC dominated the intervention in the analyses parameterised from Arango et al.43 (Table 39).
However, when this cost was excluded, TDM dominated SOC in all analyses based on Ucar et al.,42 whereas the intervention was less costly and produced fewer QALYs than SOC in all analyses for Arango et al.43 (see Table 39), with ICERs of under £20,000 per QALY gained.
Proportion of flared patients on tapered doses, whose treatment dose would be restored to full
A US study70 reported statistics on flare that showed that at least 45% of treatment strategies for coping with flares did not involve a dose increase or any other change of medication. Dr Haigh (our clinical advisor) (Royal Devon & Exeter NHS Foundation Trust, Exeter, 2018, personal communication) confirmed that the dose would be switched back to full in only about two-thirds of all flared patients on tapered doses.
Therefore, the effect of the flare management strategy outlined in Bykerk et al.,70 that is, the assumption that the dose of ADL would be fully restored in only 55% of flared patients, was evaluated. Another assumption, that all patients who flared while on tapered doses would stay on the same dose,112,113 was also tested. The resulting ICERs were under £20,000 per QALY gained in the analyses for scenario 1 (Ucar et al.42), whereas SOC dominated TDM in the analyses for scenario 2 (Arango et al.43) (see Table 39).
The number of tumour necrosis factor alpha tests per patient-year
Under the assumption of 6-monthly testing, SOC was dominant in scenario 2 and the ICER in scenario 1 was £36,756 per QALY gained (see Table 39).
Discounts for the price of Humira®
One-way deterministic sensitivity analyses were conducted based on data from Ucar et al.42 and Arango et al.,43 in which the Humira acquisition cost was reduced by 20–80% (Table 40).
Regardless of the assumed reduction in the ADL acquisition cost, SOC was estimated to dominate the intervention when data from Arango et al.43 were used, whereas the ICERs in the analyses based on Ucar et al.42 were under £20,000 per QALY gained (see Table 40).
Discounts for the price of Promonitor assays
The costs for the Promonitor test kits assumed in the economic analyses are shown in Appendix 15. Grifols–Progenika also offers price discounts, which depend on the uptake of testing, on whether testing is single or duplicate and concurrent or reflex, and on the number of tests per year. Therefore, additional cost–utility analyses for the levels of discounts proposed by the company were also conducted (the results are not reported here).
Other scenario analyses
The other sensitivity analyses conducted are listed below:
- tapering strategy of dose halving (see Dose tapering)
- cost of treatment wastage assumed to be zero (see Treatment wastage)
- mean flare duration of 19 days (see Duration of flare)
- health-state utilities estimated from TA375 (see Health state utility values estimated from the Health Assessment Questionnaire by European League Against Rheumatism response category)
- disutilities for major B and minor B flares as defined by Markusse et al.72 (see Disutility of flare).
The resulting ICERs were under £20,000 per QALY gained in all analyses for scenario 1 (based on Ucar et al.42), whereas SOC dominated the intervention in all analyses for scenario 2 (parameterised from Arango et al.43) (see Table 39).
Exploratory analyses: etanercept or infliximab and Promonitor
The cost-effectiveness of TNF-α testing in RA patients treated with the ETN originator product (Enbrel) or its biosimilar (Erelzi), or IFX biosimilars (Flixabi or Renflexis), using Promonitor test kits was evaluated in exploratory analyses. Information on the actual costs to the NHS of these TNF-α inhibitors was not available to the EAG at the time of writing and, therefore, the list prices of the biologics were assumed.
Based on the list prices (see Table 28), Enbrel has the highest acquisition cost per patient-year among the TNF-α inhibitors that are administered subcutaneously, whereas Erelzi has the lowest cost. Therefore, by considering these two treatments, we covered the whole spectrum of acquisition costs of the TNF-α treatments with subcutaneous route of administration. Flixabi and Renflexis have the lowest acquisition cost among the treatments administered intravenously (see Table 28). However, these biologics incur substantial administration costs (as described in Drug administration) and, therefore, it was important to evaluate the impact of intravenous administration on the cost-effectiveness of TDM.
In these exploratory analyses, the clinical effectiveness of TDM in RA patients who were receiving the TNF-α inhibitors (including their biosimilars) was assumed to be the same, as was the performance of the Promonitor assays when measuring drug and antibody levels for different biologics; these simplified assumptions were made owing to lack of evidence. Therefore, the clinical outcomes from Ucar et al.42 and Arango et al.43 were adopted with all model assumptions, except the acquisition and administration costs and the cost of treatment wastage, as shown in Table 27. The results are presented in Table 41.
As in the previous analyses, the outcomes were dependent on the evidence used for model parameterisation: SOC was dominant when the clinical outcomes were taken from Arango et al.,43 whereas the results based on Ucar et al.42 signified that the intervention was likely to be cost-effective, with ICERs well under £20,000 per QALY gained.
Importantly, when assuming that TDM affects flare rate only, the ICER for Enbrel was slightly under £30,000 per QALY gained in the analysis using the data from Arango et al.,43 whereas in all other analyses ICERs exceeded this threshold substantially (see Table 41).
Other scenario analyses considered but not conducted owing to a lack of clinical data were analyses of testing in the context of primary or secondary non-response, and analyses for non-responders who did not adhere to treatment with the biological therapies, including switching to intravenously administered IFX.
Consideration of a publication by l’Ami et al.
An addendum was produced in response to a request from the NICE technical team for an exploratory analysis that considered a scenario in which the drug dose in the standard care arm was not reduced (or reduced less than in the intervention arm). This was requested because, during scoping for the appraisal, the stakeholders indicated that dose reductions are currently not part of routine care in large parts of the UK. The NICE technical team requested that the EAG consider using data from l’Ami et al.114
The study was identified in the searches for the clinical effectiveness systematic review but did not meet the inclusion criteria specified in the protocol, and was excluded on comparator because the physicians in the control arm had knowledge of drug and anti-drug antibody levels to make their judgements (see Appendix 22).
Analyses based on additional evidence provided by Grifols–Progenika
After the EAG’s original report had been submitted to NICE, Grifols–Progenika provided additional evidence from INGEBIO on the average number of days in remission for the same follow-up period as in Arango et al.43 Analyses based on this evidence were conducted by the EAG (see Appendix 23). The results suggest that the intervention dominated SOC.
Exploratory analyses based on the INGEBIO full study report provided by Grifols–Progenika
Exploratory analyses considering additional evidence, the INGEBIO full study report provided by Grifols–Progenika, were conducted (see Appendix 24).
When the company’s modelling approach was used, depending on the model assumptions, the intervention was either dominant or cost-effective at the threshold of £20,000 per QALY gained (see Tables 73 and 74). However, when the updated EAG model was utilised, results varied from the intervention being dominant to ICERs exceeding £160,000 per QALY gained, located in the north-east quadrant of the cost-effectiveness plane (see Tables 77–80).
Discussion
The results of the primary, sensitivity and exploratory analyses suggest that the cost-effectiveness of TDM versus SOC in RA patients receiving TNF-α inhibitors is highly uncertain. Data from two reports of the same study (INGEBIO) produced inconsistent conclusions on the cost-effectiveness of Promonitor ELISA testing in RA patients in remission or LDA, receiving ADL treatment.
In the primary cost–utility analyses (assuming one test per year carried out concurrently at a UK laboratory, with one phlebotomy appointment per test), SOC was found to be dominant based on the longer follow-up,43 whereas using data for the shorter follow-up42 produced an ICER of £5575 per QALY gained.
In scenario analyses that excluded the cost of phlebotomy appointments, the intervention dominated SOC when the analyses were based on Ucar et al.,42 and was likely to be cost-effective when they were parameterised from Arango et al.43 When the cost of phlebotomy appointments was factored in, the intervention was either dominated by SOC or likely to be cost-effective, depending on the data source used.42,43
Under the assumption of 6-monthly testing, SOC dominated the intervention in the analysis based on Arango et al.43 and the ICER for Ucar et al.42 was £36,756 per QALY gained.
When assuming that the rate of flares alone is affected as a consequence of monitoring, the ICERs were £95,070 and £29,599 per QALY gained depending on the data source used (Ucar et al.42 or Arango et al.,43 respectively). In the former scenario, TDM was highly unlikely to be cost-effective, whereas in the latter the ICER of TDM was only slightly under the WTP threshold of £30,000 per QALY gained.
In the majority of other sensitivity analyses conducted, ICERs were under £20,000 per QALY gained when estimated from Ucar et al.,42 whereas SOC dominated the intervention in all analyses parameterised from Arango et al.43
In the exploratory analyses based on the INGEBIO full study report, the outcomes were also inconsistent and varied from the intervention being dominant to ICERs exceeding £160,000 per QALY gained.
Therefore, based on available evidence, the economic results are inconclusive and suggest that there is considerable uncertainty in the cost-effectiveness of TDM in RA patients in England and Wales.