Clinical and Economic Review Report: Vedolizumab (ENTYVIO SC): Takeda Canada Inc. Indication: Crohn disease [Internet] Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2021 Apr.

Appendix 1: Description and Appraisal of Outcome Measures

Aim

To summarize the measurement properties (e.g., reliability, validity, minimal clinically important difference [MCID]) of the following outcome measures used in the VISIBLE 2 study:

  • CDAI
  • IBDQ
  • EQ-5D-3L.

Findings

Table 31: Summary of Outcome Measures and Their Measurement Properties

| Outcome measure | Type | Conclusions about measurement properties total score | MCID |
|---|---|---|---|
| CDAI | Physician-evaluated 8-item CD-specific index used to assess CD severity | Validated | NA |
| IBDQ | Physician-administered 32-item questionnaire used to assess HRQoL in patients with IBD | Validated | 16 |
| EQ-5D-3L | Patient-reported generic quality-of-life instrument | Validated | VAS 8.2 |

CD = Crohn disease; CDAI = Crohn’s Disease Activity Index; EQ-5D-3L = EuroQol 5-Dimensions 3-Levels questionnaire; HRQoL = health-related quality of life; IBD = inflammatory bowel disease; IBDQ = Inflammatory Bowel Disease Questionnaire; NA = not applicable; VAS = Visual Analogue Scale.

CDAI

The National Cooperative Crohn’s Disease Study Group developed the CDAI using prospective data from 187 visits of 112 patients with CD.24 It is a disease-specific index and is considered the standard for assessing CD activity. The CDAI consists of 8 items that are used to evaluate overall disease severity. The overall score is the sum of the weighted values of the items and ranges from 0 to 600, where a score of 150 is defined as the threshold between remission and active disease. Scores ranging between 150 and 219 indicate mild to moderate CD and scores ranging between 220 and 450 indicate moderate to severe CD, whereas scores above 450 indicate very severe CD.25,26 Item scores are derived using patient diaries, which are based on the 7 days preceding each visit.

Generally, the CDAI is considered impractical for use in clinical practice, and it has no clearly defined MCID.26–28 Originally, changes of 50 points in the CDAI were associated with a physician evaluation of “slightly better” or “slightly worse” compared with baseline.24,26,28 However, clinical trials commonly define a change of 50, 60, 70, or 100 points in the CDAI as a clinical response.26 More recently, the FDA and EMA have suggested that a change of 100 points in the CDAI is a more meaningful response (i.e., enhanced clinical response).26
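The score bands and response definitions described above can be sketched as a small lookup. This is an illustrative sketch only: the function names are hypothetical, scores below 150 are assumed to mean remission, and fractional scores between two stated bands are assigned to the lower band.

```python
def classify_cdai(score: float) -> str:
    """Map a CDAI total score to the severity bands quoted in the text."""
    if not 0 <= score <= 600:
        raise ValueError("CDAI total score is expected to lie between 0 and 600")
    if score < 150:   # assumption: below the 150-point threshold means remission
        return "remission"
    if score < 220:   # 150 to 219
        return "mild to moderate"
    if score <= 450:  # 220 to 450
        return "moderate to severe"
    return "very severe"  # above 450


def is_clinical_response(baseline: float, follow_up: float, threshold: float = 100) -> bool:
    """True if the CDAI decreased by at least `threshold` points.

    Trials have used 50, 60, 70, or 100 points; the default of 100 mirrors the
    enhanced clinical response mentioned in the text.
    """
    return baseline - follow_up >= threshold


print(classify_cdai(310))
print(is_clinical_response(baseline=310, follow_up=230, threshold=70))
```

The threshold is left as a parameter because the text does not single out one response definition.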

Development of the CDAI

Gastroenterologists considered 18 parameters to inform the CDAI, covering the following CD domains: subjective patient symptoms and need for symptomatic medications; objective clinical findings on physical examination; extra-intestinal manifestations of CD; complications of CD (e.g., fistulas); radiologic and endoscopic examinations; and laboratory parameters. A global assessment score was also assigned at each visit by the gastroenterologist based on the following scheme: “very well” = 1, “fair to good” = 3, “poor” = 5, “very poor” = 7.

Multiple regression and backwards stepwise deletions were utilized to assess the correlation between the 18 parameters and the physician global assessment score. Based on the results of the correlations, 8 independently weighted (weighting ranges from 1 to 30) variables were included in the final CDAI formula.

Table 32: Final Items Included in the CDAI and Their Weights

| Item (daily sum per week) | Weight |
|---|---|
| Number of liquid or very soft stools | 2 |
| Abdominal pain score in one week (rating: 0 to 3) | 5 |
| General well-being (rating: 0 to 4) | 7 |
| Sum of findings per week: arthritis/arthralgia; mucocutaneous lesions (e.g., erythema nodosum, aphthous ulcers); iritis/uveitis; anal disease (e.g., fissure, fistula); external fistula (e.g., enterocutaneous, vesical, vaginal); fever > 37.8°C | 20 |
| Antidiarrheal use (e.g., diphenoxylate hydrochloride) | 30 |
| Abdominal mass (none = 0, equivocal = 2, present = 5) | 10 |
| 47 – hematocrit (males) or 42 – hematocrit (females) | 6 |
| 100 × (1 – [body weight ÷ standard weight]) | 1 |

Source: Best et al. (1976).24
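To make the weighting concrete, the weighted sum in Table 32 can be sketched as follows. This is a minimal illustration, not a validated calculator: the class and field names are hypothetical, the findings item is assumed to be entered as a count of the listed findings that are present, and the input values are invented.

```python
from dataclasses import dataclass


@dataclass
class CdaiInputs:
    liquid_stools_7d: int      # sum of daily stool counts over the 7-day diary
    abdominal_pain_7d: int     # sum of daily pain ratings (0 to 3 each day)
    general_wellbeing_7d: int  # sum of daily well-being ratings (0 to 4 each day)
    n_findings: int            # assumption: count of listed findings present
    antidiarrheal_use: bool
    abdominal_mass: int        # 0 = none, 2 = equivocal, 5 = present
    hematocrit: float          # percent
    is_male: bool
    body_weight: float         # same unit as standard_weight
    standard_weight: float


def cdai_score(x: CdaiInputs) -> float:
    """Weighted sum of the 8 CDAI items, using the weights in Table 32."""
    if x.abdominal_mass not in (0, 2, 5):
        raise ValueError("abdominal_mass must be 0, 2, or 5")
    hematocrit_deficit = (47 if x.is_male else 42) - x.hematocrit
    weight_deviation = 100 * (1 - x.body_weight / x.standard_weight)
    return (
        2 * x.liquid_stools_7d
        + 5 * x.abdominal_pain_7d
        + 7 * x.general_wellbeing_7d
        + 20 * x.n_findings
        + 30 * (1 if x.antidiarrheal_use else 0)
        + 10 * x.abdominal_mass
        + 6 * hematocrit_deficit
        + 1 * weight_deviation
    )


# Invented example values, for illustration only.
example = CdaiInputs(
    liquid_stools_7d=21, abdominal_pain_7d=7, general_wellbeing_7d=7,
    n_findings=1, antidiarrheal_use=True, abdominal_mass=2,
    hematocrit=40.0, is_male=True, body_weight=52.5, standard_weight=70.0,
)
print(cdai_score(example))
```

For these inputs the item contributions are 42, 35, 49, 20, 30, 20, 42, and 25, which sum to 263.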

Reliability of the CDAI

Reliability was not originally assessed during the development of the CDAI; however, the index did provide good to very good test-retest reliability based on 2 successive visits involving 32 patients.24,25 The CDAI was subsequently re-evaluated and re-derived using data collected from 1,058 patients and demonstrated little difference compared to the original formulation; therefore, the original version was recommended.29

Validity of the CDAI

Construct validity: The items included in the CDAI were selected by gastroenterologists and are based on accepted features of CD, therefore demonstrating construct validity.25

Content validity: The CDAI appears to be responsive as it allows detectable changes in CD severity to be measured (i.e., the CDAI is able to differentiate levels of CD severity). Additionally, the CDAI appears to be widely utilized in clinical trials and is an accepted measure by gastroenterologists as a primary end point to assess CD activity. In contrast, the CDAI does not appear to be reflective of CD activity for pediatric patients with CD, nor does the instrument address all aspects of CD, such as quality of life.25

Criterion validity: Selecting a gold-standard measure for comparison is difficult when considering CD due to the heterogeneous nature of its manifestations. Generally, the CDAI does not demonstrate any significant correlation between the overall score and objective measurements such as mucosal healing. However, the lack of correlation may not be indicative of a lack of criterion validity due to the multifaceted nature of CD.25 Predictability is another component of criterion validity. One study demonstrated that CDAI scores increased 2 months preceding exacerbations of CD and decreased one month following exacerbations of CD, therefore demonstrating criterion validity.25

Limitations of the CDAI

The CDAI scores appear to vary depending on the observers’ reviews, despite the evaluation of the same case histories.30 In addition, the overall CDAI scores are based on subjective items such as “general well-being” and “intensity of abdominal pain” based on patient perception.

IBDQ

The IBDQ is a physician-administered questionnaire developed by Guyatt et al.31,32 to assess HRQoL in patients with IBD (e.g., ulcerative colitis [UC] and CD).33 It is a 32-item Likert-based questionnaire divided into 4 dimensions: bowel symptoms (10 items), systemic symptoms (5 items), emotional function (12 items), and social function (5 items). Patients are asked to recall symptoms and quality of life from the last 2 weeks, with responses graded on a 7-point Likert scale (1 being the worst situation, 7 being the best). The total IBDQ score ranges from 32 to 224, with higher scores representing better quality of life. Scores of patients in remission typically range from 170 to 190.
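The scoring structure described above (32 items, 4 dimensions, responses from 1 to 7) can be sketched as a simple sum. This is an illustrative sketch: the function and dictionary names are hypothetical, and because the mapping of individual question numbers to dimensions is not given here, responses are assumed to arrive already grouped by dimension.

```python
# Number of items per dimension, as stated in the text (10 + 5 + 12 + 5 = 32).
IBDQ_DOMAIN_ITEMS = {"bowel": 10, "systemic": 5, "emotional": 12, "social": 5}


def ibdq_scores(responses: dict) -> dict:
    """Return per-dimension sums and the total IBDQ score (32 to 224)."""
    scores = {}
    for domain, n_items in IBDQ_DOMAIN_ITEMS.items():
        answers = responses[domain]
        if len(answers) != n_items:
            raise ValueError(f"{domain}: expected {n_items} responses, got {len(answers)}")
        if any(a not in range(1, 8) for a in answers):
            raise ValueError(f"{domain}: responses must be integers from 1 to 7")
        scores[domain] = sum(answers)
    scores["total"] = sum(scores[d] for d in IBDQ_DOMAIN_ITEMS)
    return scores


# Invented responses: every item answered 5.
example_responses = {d: [5] * n for d, n in IBDQ_DOMAIN_ITEMS.items()}
print(ibdq_scores(example_responses))
```

With every item answered 1 the total is 32, and with every item answered 7 it is 224, matching the range quoted in the text.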

This questionnaire has been validated in a variety of settings, countries, and languages.33 A review33 of nine validation studies on the IBDQ in patients with IBD reported that the IBDQ was able to differentiate clinically important differences between patients with disease remission and patients with disease relapse. In a randomized placebo-controlled trial on patients with UC, the IBDQ was able to discriminate changes in the social and emotional state of patients.32 The IBDQ has demonstrated high test-retest reliability in all 4 dimensional scores. Six studies evaluated the IBDQ for sensitivity to change and all found that changes in HRQoL correlated with changes in clinical activity in patients with CD.33

A study conducted by Gregor et al.34 noted that a clinically meaningful improvement in quality of life would be an increase of at least 16 points in the IBDQ total score or 0.5 points or more per question in patients with CD.
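The two thresholds quoted above are the same quantity expressed on different scales, since 16 points spread over 32 questions is 0.5 points per question. A small sketch, with hypothetical names, shows the check:

```python
IBDQ_ITEMS = 32
IBDQ_TOTAL_MCID = 16  # points on the total score, as reported in the text


def ibdq_meaningful_improvement(baseline_total: int, follow_up_total: int) -> bool:
    """True if the total IBDQ score rose by at least the 16-point threshold."""
    return follow_up_total - baseline_total >= IBDQ_TOTAL_MCID


# Equivalent threshold per question.
per_question_mcid = IBDQ_TOTAL_MCID / IBDQ_ITEMS
print(per_question_mcid)
print(ibdq_meaningful_improvement(150, 168))
```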

EQ-5D-3L

The EQ-5D is a generic HRQoL instrument that can be applied to a wide range of health conditions and treatments.35,36 The first of 2 parts of the EQ-5D is a descriptive system that classifies respondents (aged ≥ 12 years) based on the following 5 dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. The EQ-5D-3L has 3 possible levels (1, 2, or 3) for each dimension, representing “no problems,” “some problems,” and “extreme problems,” respectively. Respondents are asked to choose the level that reflects their health state for each of the 5 dimensions, corresponding with 243 different health states. A scoring function can be used to assign a value (EQ-5D-3L index score) to self-reported health states from a set of population-based preference weights.35,36 The second part is a 20 cm VAS that has end points labelled 0 and 100, with respective anchors of “worst imaginable health state” and “best imaginable health state.” Respondents are asked to rate their health by drawing a line from an anchor box to the point on the EQ-VAS that best represents their health on that day. Hence, the EQ-5D produces 3 types of data for each respondent:

  • a profile indicating the extent of problems on each of the 5 dimensions represented by a 5-digit descriptor, such as 11121 or 33211
  • a population preference-weighted health index score based on the descriptive system
  • a self-reported assessment of health status based on the VAS.
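The descriptive system above can be illustrated by enumerating the profiles: 5 dimensions with 3 levels each give 3^5 = 243 health states. The sketch below uses hypothetical function names and does not compute an index score, because that requires a population-specific value set that is not reproduced here.

```python
from itertools import product

DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = {1: "no problems", 2: "some problems", 3: "extreme problems"}


def all_profiles() -> list:
    """Enumerate every 5-digit EQ-5D-3L descriptor, e.g. '11121'."""
    return ["".join(str(level) for level in combo) for combo in product(LEVELS, repeat=5)]


def decode_profile(descriptor: str) -> dict:
    """Translate a 5-digit descriptor into a label for each dimension."""
    if len(descriptor) != 5 or any(ch not in "123" for ch in descriptor):
        raise ValueError("descriptor must be 5 digits, each 1, 2, or 3")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, descriptor)}


print(len(all_profiles()))
print(decode_profile("11121"))
```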

The EQ-5D index score is generated by applying a multi-attribute utility function to the descriptive system. Different utility functions are available that reflect the preferences of specific populations (e.g., US or UK). The lowest possible overall score for the EQ-5D-3L version (corresponding to severe problems on all 5 attributes) varies depending on the utility function that is applied to the descriptive system (e.g., −0.59 for the UK algorithm and −0.109 for the US algorithm). Scores of less than 0 represent health states that are valued by society as being worse than dead, while scores of 0 and 1.00 are assigned to the health states “dead” and “perfect health,” respectively. Reported MCIDs for the 3-level version of the scale range from 0.033 to 0.074.37

Studies are emerging supporting the validity of the EQ-5D in patients with IBD, including CD. Both the EQ-VAS and EQ-index scores were found to correlate well with disease activity indices and differed significantly between patients with active disease and remission. Test-retest reliability was high. The EQ-VAS was more responsive to deterioration in health than improvement in health and tended to be more responsive than EQ-index scores.38

A study by Coteur et al.39 explored MCID estimates within the CD patient population using data from multinational, multi-centre, double-blind, placebo-controlled, parallel-group clinical trials in which clinical remission of CD was assessed using the CDAI measure as the primary outcome. Secondary outcomes included the IBDQ and EQ-5D VAS score. All end points were measured at weeks 0, 6, 16, and 26 using standardized procedures. Six estimates of the MCID were evaluated for the EQ-5D VAS score to determine the most appropriate approach: 2 analyses utilizing anchor-based methods and 4 analyses utilizing distribution-based methods.

For the anchor-based estimates, a linear regression was performed for each of the 2 anchors (the CDAI and the IBDQ). The MCID estimates for the EQ-5D VAS score were then extracted from the regression equations, with a change of 16 points for the IBDQ total score or a change of 50 points for the CDAI score considered meaningful. The distribution-based estimates rely on the statistical distributions of HRQoL data and include effect size measures (0.2 and 0.5 were used and suggested as small to moderate effect sizes), the standard error of measurement, and the standard error of the difference. Overall, the MCID for the EQ-5D VAS score ranged from 4.2 to 14.8, depending on the approach. Because changes in the EQ-5D VAS score showed greater correlations with score changes in the IBDQ than with the CDAI, the IBDQ was selected as the best anchor, with a corresponding MCID of 8.2. The values derived by the IBDQ anchor-based method were similar to the values obtained by the distribution-based methods and were representative of small to moderate effect sizes.
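The two families of MCID estimates described above can be sketched as follows. This is a generic illustration under stated assumptions, not a reconstruction of the analysis by Coteur et al.: the function names and data are invented, the anchor-based estimate is taken as the regression prediction (intercept plus slope) at the anchor's meaningful change, and the standard error of measurement and standard error of the difference use common textbook definitions (SD × √(1 − reliability) and √2 × SEM).

```python
from math import sqrt
from statistics import mean, stdev


def anchor_based_mcid(anchor_change, target_change, anchor_mcid):
    """Regress target change on anchor change (ordinary least squares) and
    return the predicted target change at the anchor's meaningful change."""
    mx, my = mean(anchor_change), mean(target_change)
    sxx = sum((x - mx) ** 2 for x in anchor_change)
    sxy = sum((x - mx) * (y - my) for x, y in zip(anchor_change, target_change))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept + slope * anchor_mcid  # assumption: intercept is included


def distribution_based_mcids(baseline_scores, reliability):
    """Four distribution-based estimates from the baseline score spread."""
    sd = stdev(baseline_scores)
    sem = sd * sqrt(1 - reliability)  # standard error of measurement
    return {
        "effect size 0.2": 0.2 * sd,
        "effect size 0.5": 0.5 * sd,
        "SEM": sem,
        "SED": sqrt(2) * sem,         # standard error of the difference
    }


# Invented data, for illustration only.
ibdq_change = [0, 8, 16, 24, 32]
vas_change = [1, 5, 9, 13, 17]
print(anchor_based_mcid(ibdq_change, vas_change, anchor_mcid=16))
print(distribution_based_mcids([50, 60, 70, 80, 90], reliability=0.75))
```

With the invented data the fitted line has slope 0.5 and intercept 1, so a 16-point IBDQ change corresponds to a 9-point change in the VAS score; these numbers carry no clinical meaning.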

Copyright © 2021 Canadian Agency for Drugs and Technologies in Health.

The copyright and other intellectual property rights in this document are owned by CADTH and its licensors. These rights are protected by the Canadian Copyright Act and other national and international laws and agreements. Users are permitted to make copies of this document for non-commercial purposes only, provided it is not modified when reproduced and appropriate credit is given to CADTH and its licensors.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK572473
