NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Clinical Review Report: Halobetasol Propionate and Tazarotene (Duobrii): (Bausch Health, Canada Inc.): Indication: Psoriasis, moderate-to-severe plaque [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2020 Dec.

Appendix 5: Description and Appraisal of Outcome Measures

Aim

To describe the outcome measures summarized in Table 55, and review their measurement properties including validity, reliability, responsiveness to change, and MID.

Of the four outcome measures, the IGA was described in greater detail as this was the primary end point under review in Study 301 and Study 302. Validation of BSA and the generic tool, DLQI, was included. Of note, limited information was available for the psoriasis signs evaluation tool.

Findings

The validity, reliability, and responsiveness of each outcome measure were summarized and evaluated. Interpretation of the reliability and validity metrics was based on the following criteria:

  • Inter-rater reliability, kappa statistics (level of agreement):44
    • less than 0 = poor agreement
    • 0.00 to 0.20 = slight agreement
    • 0.21 to 0.40 = fair agreement
    • 0.41 to 0.60 = moderate agreement
    • 0.61 to 0.80 = substantial agreement
    • 0.81 to 1.00 = almost perfect agreement.
  • Internal consistency (Cronbach’s alpha) and test-retest reliability: 0.7 or greater is considered acceptable.45
  • Validity (i.e., between-scale comparison [correlation coefficient, r]):46
    • 0.3 or less = weak
    • greater than 0.3 to 0.5 = moderate
    • greater than 0.5 = strong.
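
The interpretation criteria above can be sketched as a few small lookup functions. This is an illustrative sketch only: the function names are hypothetical, the kappa cut-points are placed at 0.20, 0.40, 0.60, and 0.80 (an assumption about how values falling between the listed bands are handled), and the correlation is interpreted by its absolute value (also an assumption).

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa statistic to the agreement label used in this appendix."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:  # assumed cut-point between "slight" and "fair"
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def is_acceptable_reliability(coefficient: float) -> bool:
    """Cronbach's alpha or test-retest coefficient: 0.7 or greater is acceptable."""
    return coefficient >= 0.7


def interpret_correlation(r: float) -> str:
    """Label a between-scale correlation; the sign is ignored (assumption)."""
    strength = abs(r)
    if strength <= 0.3:
        return "weak"
    if strength <= 0.5:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    print(interpret_kappa(0.47))
    print(is_acceptable_reliability(0.70))
    print(interpret_correlation(0.71))
```

For example, the correlation of r = 0.71 reported between the DLQI and the EQ-5D index score would be labelled strong under these criteria.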

Table 55: Summary of Outcome Measures and Their Measurement Properties

Outcome measure: DLQI

Type: 10-item, dermatology-specific quality of life questionnaire to assess limitations related to the impact of skin disease. The response options range from 0 (not affected at all) to 3 (very much affected) and DLQI scores range from 0 to 30, with lower scores indicating better quality of life.

Conclusions about measurement properties:

  • Validity: Construct validity of the DLQI in the psoriasis population was based on correlation of the instrument with either generic, dermatologic, or disease-specific instruments over 37 separate studies;38 the DLQI correlated most strongly with the bodily pain (r = 0.61) and social functioning domains (r = 0.68) of the SF-36, as well as the overall EQ-5D index score (r = 0.71).34
  • Reliability: Reliability was assessed in the original validation study of the DLQI by Finlay and Khan in a population with various skin diseases;33 the test-retest reliability correlation coefficients were high for both the overall score (Spearman rank correlation = 0.99) and for individual questions (0.95 to 0.98).33 Slightly lower correlation coefficients (ranging from 0.56 to 0.99) were reported in a later systematic review by Basra et al.38
  • Responsiveness: Responsiveness to change was measured by comparing DLQI data with PASI and PGA scores.34 The DLQI demonstrated equal responsiveness to the PASI and PGA scores with correlation coefficients of r = 0.69 and r = 0.71, which was not achieved by the general tools, the EQ-5D (r = 0.44) and SF-36 (r = 0.44).34

MID:

  • The MID of the DLQI in patients with psoriasis was estimated using 3 anchor-based methods. Estimates ranged from 2.2 to 6.9 (Shikiar et al.34).
  • Another study in patients with psoriasis treated with adalimumab reported an MID of 3.2 (Melilli et al.47).
  • In the most recent systematic review of RCTs in psoriasis, the DLQI MID was reported to be a score change of 5 (Ali et al.48).

Outcome measure: IGA

Type: 5-point scale used to measure the severity of disease at a single point in time (static IGA); IGA scores range from 0 (clear) to 4 (severe).

Conclusions about measurement properties: There are no studies evaluating the validity, reliability, or responsiveness of the 5-point IGA scale.

MID: The MID has not been identified at this time.

Outcome measure: Psoriasis signs

Type: 5-point scale used to measure the severity of the signs of psoriasis (erythema, plaque elevation, and scaling); scores for each scale can range from 0 (none) to 4 (severe).

Conclusions about measurement properties: There is no information regarding the validity, reliability, or responsiveness to change of this scale.

MID:

  • There is no scientific literature regarding the MID of this scale.
  • The clinical expert consulted in this review noted that a 2-grade improvement within this scale for any of the signs of psoriasis (erythema, plaque elevation, and scaling) is a clinically meaningful difference.

Outcome measure: BSA

Type: Percentage BSA affected by psoriasis is estimated through the 1% rule, where the subject’s flat palm represents 1% of total BSA.

Conclusions about measurement properties:

  • Validity: This is not relevant to the evaluation of BSA.
  • Reliability: Inter-rater reliability was evaluated in two studies, which determined ICCs of 0.91 (Reolid et al.49) and 0.96 (Chandran et al.50) when dermatologists used the 1% rule in BSA determination. Inter-rater variability was determined to be high in 2 separate studies:
    • a systematic review conducted by Puzenat et al.51 determined CV > 30%
    • Bozek et al.52 found a CV = 57.1 when 10 dermatologists evaluated 9 patients.
    Test-retest reliability was evaluated in 2 separate studies; high test-retest reliability was found in both studies, with ICCs of 0.98 (Reolid et al.49) and 0.96 (Bozek et al.52).
  • Responsiveness: Currently, there is no evidence regarding the responsiveness to change of the use of the 1% rule in BSA determination.

MID:

  • There is no scientific literature regarding the MID of BSA affected by psoriasis.
  • The clinical expert consulted in this review noted that a 1/3 reduction in BSA affected by psoriasis is a clinically meaningful difference.

BSA = body surface area; CV = coefficient of variation; DLQI = Dermatology Life Quality Index; EQ-5D = EuroQol 5-Dimensions; ICC = intraclass correlation coefficient; IGA = Investigator’s Global Assessment; MID = minimal important difference; PASI = Psoriasis Area Severity Index; PGA = Physician’s Global Assessment; RCT = randomized controlled trial; SF-36 = Short Form (36) Health Survey.

Source: Reolid et al.,49 Chandran et al.,50 Puzenat et al.,51 Bozek et al.,52 Basra et al.,38 Shikiar et al.,34 Finlay et al.,33 Melilli et al.,47 Ali et al.48

Dermatology Life Quality Index

The DLQI is a widely used dermatology-specific HRQoL instrument that assesses the impact of skin disease.33 It is a 10-item questionnaire that covers six domains over a one-week recall period: symptoms and feeling, daily activities, leisure, work and school, personal relationships, and treatment. Each item is scored on a four-point Likert scale: 0 (not at all affected/not relevant), 1 (a little affected), 2 (a lot affected), and 3 (very much affected). The overall DLQI score is a numeric score from 0 to 30, with lower scores indicating better HRQoL. At least 80% of the questions must be answered for a score to be reported.33,34 The final numeric score translates to the effect of the patient’s disease on their quality of life, where 0 to 1 equals no effect, 2 to 5 equals small effect, 6 to 10 equals moderate effect, 11 to 20 equals very large effect, and 21 to 30 equals extremely large effect. The DLQI can be completed within a few minutes, making it a very time-efficient scoring system for use in clinical settings,53 although the clinical expert consulted by CADTH for this review stated that in clinical practice, HRQoL is assessed through a discussion with the patient.
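
The scoring rules described above can be illustrated with a minimal sketch. The function names are hypothetical, and one detail is an assumption not stated in this appendix: unanswered items are represented as None and contribute 0 to the total when at least 80% of the items (8 of 10) have been answered.

```python
from typing import Optional, Sequence


def dlqi_total(items: Sequence[Optional[int]]) -> Optional[int]:
    """Sum the 10 DLQI items (each 0 to 3); None marks an unanswered item.

    Returns None when fewer than 80% of items are answered, because no
    score is reported in that case. Treating unanswered items as 0 is an
    assumption made for this sketch.
    """
    if len(items) != 10:
        raise ValueError("the DLQI has exactly 10 items")
    answered = [score for score in items if score is not None]
    if any(score not in (0, 1, 2, 3) for score in answered):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    if len(answered) < 8:  # fewer than 80% of 10 items
        return None
    return sum(answered)


def dlqi_band(total: int) -> str:
    """Translate a total DLQI score into the effect on quality of life."""
    if not 0 <= total <= 30:
        raise ValueError("DLQI totals range from 0 to 30")
    if total <= 1:
        return "no effect"
    if total <= 5:
        return "small effect"
    if total <= 10:
        return "moderate effect"
    if total <= 20:
        return "very large effect"
    return "extremely large effect"


if __name__ == "__main__":
    responses = [3, 2, 1, 0, 2, 1, 3, 0, 1, 2]  # hypothetical patient
    total = dlqi_total(responses)
    print(total, dlqi_band(total))
```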

Validity

The DLQI was developed in 1994 and has since been validated in many studies.33,34,38,53-55 Construct validity of the DLQI was based on the correlation of the instrument with either generic, dermatologic, or disease-specific instruments in more than 37 separate studies.38 Shikiar et al. reported a good correlation (correlation coefficient [r] > 0.61) with three different itch measures in a study combining results from trials in moderate-to-severe plaque psoriasis (N = 1,095).54 A later study by Shikiar et al. demonstrated excellent correlation between the DLQI and generic HRQoL instruments in a population of 147 patients with moderate-to-severe plaque psoriasis randomized to adalimumab versus placebo; the DLQI correlated most strongly with the bodily pain (r = 0.61) and social functioning domains (r = 0.68) of the Short Form (36) Health Survey, as well as the overall EuroQol 5-Dimensions questionnaire index score (r = 0.71).34

Reliability

In the original validation study by Finlay and Khan, the reliability of the DLQI was assessed in 53 patients with a variety of skin diseases who completed the questionnaire twice, 7 to 10 days apart.33 The test-retest reliability correlation coefficients, obtained using the Spearman rank correlation test, were high for both the overall score (0.99) and individual questions (0.95 to 0.98).33 The good test-retest reliability of the DLQI was also confirmed in a systematic review by Basra et al., with eight of 12 international studies reporting correlation coefficients greater than 0.56 and up to 0.99.38 The same review reported good internal consistency reliability of the DLQI, based on 22 international studies with Cronbach alpha coefficients ranging from 0.75 to 0.92.38

Responsiveness

Responsiveness to change in the clinical status of a patient was measured by comparing DLQI data with PASI and PGA scores.34 The correlations between the DLQI and the two disease severity scores were r = 0.69 and r = 0.71, respectively, a level of responsiveness that was not achieved by the general tools, the EuroQol 5-Dimensions questionnaire (r = 0.44) and Short Form (36) Health Survey (r = 0.44).34 In a second study assessing responsiveness, Shikiar et al. contrasted change in DLQI scores in patients who were defined as clinical responders (achievement of PASI 75 response by week 12) with those characterized as nonresponders (< PASI 50). DLQI scores in responders improved by 12.17 points, compared with 1.77 points in the nonresponders subgroup. The difference was statistically significant (t = 9.0, effect size = 0.40, P < 0.0001).34 Additional studies demonstrating the responsiveness of the DLQI were also identified in the systematic review by Basra et al.38,55

MID

Shikiar et al. estimated the MID of the DLQI in patients with moderate-to-severe plaque psoriasis (N = 147) using three anchor-based methods; MID-1 was based on scores from near-responders (PASI improvement of 25% to 49%), MID-2 was based on partial responders (PASI improvement 50% to 74%), and MID-3 corresponded to the difference between nonresponders and minimal responders for the PGA score. The authors also estimated the MID using one-half of the SD of baseline scores.34 Estimates ranged from 2.2 to 6.9.34 It should be noted that these approaches lack patient-based anchors and therefore do not necessarily identify the minimal difference that a patient would consider important. Another study in patients with moderate-to-severe plaque psoriasis (N = 147) treated with adalimumab reported an MID of 3.2.47 In the most recent systematic review of RCTs in patients with chronic psoriasis, the DLQI MID was reported to be a score change of five.48
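
The distribution-based estimate mentioned above (one-half of the SD of baseline scores) can be sketched as follows. This is an illustrative sketch only: the function names and the baseline scores are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, because the appendix does not state which SD formula the authors applied.

```python
from statistics import stdev
from typing import Sequence


def half_sd_mid(baseline_scores: Sequence[float]) -> float:
    """Distribution-based MID: one-half of the SD of baseline scores.

    Uses the sample standard deviation, which is an assumption.
    """
    if len(baseline_scores) < 2:
        raise ValueError("at least two baseline scores are needed")
    return 0.5 * stdev(baseline_scores)


def meets_mid(change: float, mid: float) -> bool:
    """True when the absolute change in score is at least the MID."""
    return abs(change) >= mid


if __name__ == "__main__":
    baseline = [10, 14, 18]  # hypothetical baseline DLQI scores
    print(half_sd_mid(baseline))
    # A 5-point improvement judged against the MID of 3.2 reported above.
    print(meets_mid(-5, 3.2))
```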

Limitations

The DLQI was the first dermatology-specific tool to evaluate skin-related HRQoL and was originally developed for use in routine practice.33 While the tool focuses on the patient’s daily functioning, it has been criticized for not fully capturing emotional and mental states.56 Therefore, the DLQI may lack conceptual validity in the psychological consequences of living with psoriasis.

Investigator’s Global Assessment

The IGA is a subjective measurement of the clinical signs of psoriasis, where psoriatic lesions are graded for erythema, induration, and scaling. Various IGAs have been used in psoriasis with different descriptions and scores, with the most common IGA versions using five- to six-point scales.57,58 There are two types of IGAs: a static form, which records the investigator’s assessment of the disease at a given point in time, and a dynamic form, in which the investigator evaluates the level of improvement or deterioration from baseline.57,59 The static form of the IGA is preferred over the dynamic form given that it does not rely on the investigator’s recall of the patient’s disease severity observed at baseline or a previous visit. In the two studies under review, a five-point, static version of the IGA was used.10,11 To generate the IGA score, erythema, induration, and scaling are each graded on a scale of 0 to 4 (Table 56), with each grade reflecting an average across all lesions, to obtain a single estimate of the patient’s overall severity of disease at a given point in time. The three items are given equal weighting: the sum of the three item scores is divided by three for a final IGA score from 0 to 4.
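
The equal-weighting calculation described above amounts to a simple mean of the three item grades. The sketch below is illustrative only: the function name is hypothetical, and it returns the unrounded mean because this appendix does not state how a non-integer result is mapped to a final grade.

```python
def iga_score(erythema: int, induration: int, scaling: int) -> float:
    """Mean of the three equally weighted item grades (each 0 to 4).

    The unrounded mean is returned; any rounding rule would be an
    assumption, so none is applied here.
    """
    grades = (erythema, induration, scaling)
    if any(grade not in (0, 1, 2, 3, 4) for grade in grades):
        raise ValueError("each item must be graded 0, 1, 2, 3, or 4")
    return sum(grades) / 3


if __name__ == "__main__":
    # Hypothetical patient: moderate erythema and induration, mild scaling.
    print(iga_score(erythema=3, induration=3, scaling=2))
```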

Table 56: Investigator’s Global Assessment Scale

Grade | Score | Erythema | Induration | Scaling
Clear | 0 | No evidence of erythema | No evidence of plaque elevation above normal skin level | No evidence of scaling
Almost clear | 1 | Faint pink or light red erythema on most plaques | Slight or barely perceptible elevation of plaques above normal skin level | Some plaques with fine scales
Mild | 2 | Most to all plaques are pink/light red in colour | Some plaques have definite elevation above normal skin level, typically with edges that are indistinct and sloped on some of the plaques | Most to all plaques have some fine scales but are not fully covered; some plaques are completely covered with fine scale
Moderate | 3 | Most to all plaques are bright red, some plaques may be dark red in colour | Definite elevation of most to all plaques, rounded or sloped edges on most of the plaques | Some plaques are at least partially covered with a coarse scale, most to all plaques are nearly covered with fine or coarse scale
Severe | 4 | Most or all plaques are bright, dark, or dusky red | Almost all plaques are raised and well-demarcated; sharp edges on virtually all plaques | Most to all plaques are covered with coarse, thick scales

Source: Clinical Study Reports for Study 201,41 Study 301,10 Study 302,11 and Study 303.40

Although the five-point static IGA scale was used in the pivotal trials to evaluate treatment efficacy, there are no studies evaluating the validity, reliability, or responsiveness of this five-point scale. However, the six-point IGA scale has been evaluated in terms of its validity, reliability, responsiveness to change, and MID. Because the IGA is used to determine treatment efficacy, a primary outcome of the pivotal trials, the literature pertaining to the six-point scale is summarized in the following sections. It should be noted that any conclusions about the validity of outcomes are limited due to the use of a different scale.

The only difference between the five-point and the six-point IGA scale is the inclusion of a very severe category in the six-point IGA scale. In this category, the patient is given a score of 5 and is described as having “very severe thickening with hard edges; dark deep red coloration; very severe/very coarse scaling covering all lesions.”60 However, very few patients fell into the highest category of very severe in previous studies that used the six-point IGA scale.60

The PGA denotes scales used by clinicians, whereas the IGA is used by investigators in clinical trials; otherwise, the IGA and PGA scales are the same.60 The two terms are often used interchangeably in the literature validating the scales.60 There is an abundance of evidence validating the six-point PGA scale, and limited high-quality evidence validating the six-point IGA scale. Since the five-point IGA scale determines the primary outcomes of the pivotal trials, CADTH has summarized the data available for the six-point PGA scale that were gathered in a clinical trial setting.

Validity

The most recent study assessing the validity of the PGA evaluated data from four phase III clinical studies of tofacitinib in patients with psoriasis (N = 3,641).61 Confirmatory factor analysis used to test the fit of the PGA measurement model demonstrated that equal weighting of the three items (erythema, induration, and scaling) was appropriate, as indicated by Bentler’s Comparative Fit Index values greater than 0.98 (acceptable fit defined as > 0.9) and standardized path coefficients all above the threshold of 0.4. Construct validity was assessed using a known-group approach, measuring the relationship between PGA and PASI through a repeated measures model. A positive relationship between the PGA and PASI scores was observed which was stable and replicable across the four studies, indicating that the PGA could discriminate between different degrees of disease severity.61

Simpson et al. evaluated the construct and content validity of the PGA by its association with the DLQI.62 The correlation between PGA and DLQI was moderately positive (r = 0.29 to 0.43) at post-therapy time points. As with the PASI instrument, the authors found the scaling score to be minimally and inconsistently associated with DLQI score, while erythema and induration were positively correlated with the DLQI score. In contrast to Callis Duffin et al.,61 Simpson and colleagues concluded that the equal weighting of the three items would not accurately capture the varying degrees to which these factors affect the patient’s rating of quality of life.62

Convergent and divergent validity were assessed by determining the correlation of the PGA with three additional outcome measures: the PASI, patient global assessment, and DLQI.61 Pearson correlation coefficients between PGA and the three scales ranged from 0.4 to 0.79, with the strongest correlation found with PASI. These findings were consistent with a previous psychometric validation study of the PGA in a single phase III trial by Cappelleri et al.63 and in several other studies.51,58,64,65

Reliability

Callis Duffin et al. evaluated consistency of PGA measurements between screening and baseline visits, when no change in terms of disease severity was expected.61 The intraclass correlation coefficient (ICC) value for the pooled data was 0.70, suggesting an acceptable test-retest reliability over a stable period. The same study assessed internal consistency reliability demonstrating that the scoring items (erythema, induration, and scaling) were highly consistent with each other (Cronbach coefficient alpha ≥ 0.90) at the primary assessment points in all four trials. The internal consistency reliability was less convincing (Cronbach coefficient alpha 0.50 to 0.63) for the values observed at baseline, likely a result of the specific inclusion criteria of the trials.61

Responsiveness

No evidence regarding the responsiveness of the five-point or six-point IGA or PGA was identified from the literature at this time.

Clinical Relevance

There are no studies evaluating the MID of the PGA at this time. However, it is generally accepted that a clinically meaningful score in the PGA is a score of clear or minimal.12 Furthermore, some trials define efficacy as a two-point reduction in the total PGA score.59 The two trials under review, Study 301 and Study 302, have defined a score of clear or almost clear (score of 0 or 1, respectively) with a minimum of a two-point difference as a clinically important threshold.
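
The threshold used in Study 301 and Study 302 can be expressed as a simple rule. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the two-point difference is measured as an improvement from the baseline IGA score.

```python
def is_treatment_success(baseline_iga: int, followup_iga: int) -> bool:
    """Clear or almost clear (0 or 1) with at least a two-point change.

    Assumes the two-point difference is an improvement relative to the
    baseline score on the five-point (0 to 4) static IGA.
    """
    for score in (baseline_iga, followup_iga):
        if score not in (0, 1, 2, 3, 4):
            raise ValueError("IGA scores range from 0 to 4")
    clear_or_almost_clear = followup_iga <= 1
    improvement = baseline_iga - followup_iga
    return clear_or_almost_clear and improvement >= 2


if __name__ == "__main__":
    print(is_treatment_success(baseline_iga=3, followup_iga=1))
    print(is_treatment_success(baseline_iga=2, followup_iga=1))
```

Under this rule, a patient who moves from moderate (3) to almost clear (1) meets the threshold, whereas a patient who moves from mild (2) to almost clear (1) does not, because the change is only one point.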

Strengths and Limitations

The PGA has been shown to be reliable based on test-retest data and internal consistency; however, inter-rater reliability can be poor, especially among untrained observers.12 Within a study, however, the PGA correlated well with the PASI and HRQoL measures.66 Furthermore, a systematic review by Robinson et al. including 30 RCTs of biologic drugs in psoriasis from 2001 to 2010 found that the PGA (scores of 0 or 1) correlated very tightly with the PASI 75 (r = 0.9157).59 However, given that the PGA has many different scales and scoring variations, comparisons between studies are very difficult.12

Moreover, the PGA’s inability to measure the extent of psoriasis (i.e., amount of BSA affected), inability to discriminate small changes in severity, and lack of consideration for nonskin symptoms are further limitations.60 In addition, no MID has been established for psoriasis at this time.

Psoriasis Signs

The signs of psoriasis (erythema, plaque elevation, and scaling) were assessed for the selected target lesion using a subjective scale outlined in Table 57. According to the sponsor, the results from the psoriasis signs scale permit detection of changes specific to the patient’s selected target lesion. The sponsor states that many similar scales are widely used in the therapeutic area of psoriasis, as erythema, plaque elevation, and scaling are recognized as being basic characteristics of psoriasis lesions.

Table 57: Assessment of Psoriasis Signs Scale

Score | Grade | Description

Erythema
0 | None | No erythema
1 | Minimum | Pink discoloration, minimal erythema
2 | Mild | Most or all plaques are light red to red in colour
3 | Moderate | Most or all plaques are bright red or dark in colour
4 | Severe | Most plaques are dusky red with purple hue

Plaque elevation
0 | None | No evidence elevation above the normal skin level
1 | Minimum | Slight, just discernible elevation above normal skin level
2 | Mild | Some plaques show definite elevation with indistinct edges
3 | Moderate | Most plaques have definite elevation with distinct edges that are rounded or sloped
4 | Severe | Almost all plaques are raised above normal skin level with sharp edges

Scaling
0 | None | No scales on very few plaques
1 | Minimum | Occasional fine scales hardly noticeable
2 | Mild | Most plaques have fine scales
3 | Moderate | Some plaques have coarse scales while most plaques have fine scales
4 | Severe | Most plaques are covered by thick coarse scales

Source: Clinical Study Reports for Study 201,41 Study 301,10 and Study 302.11

No information pertaining to this scale was identified in the independent literature search conducted by CADTH, including evidence on the construct of the grading, validity, reliability, responsiveness to change, or clinical relevance. Although no MID was identified in the literature, it should be noted that the clinical expert consulted in this review indicated that a decrease of two points in any of the signs of psoriasis would be considered a clinically important outcome for patients.

Limitations

Limitations for use of this scale are the lack of evidence on its validity, reliability, responsiveness to change, and clinical relevance.

Body Surface Area

BSA affected by psoriasis is used to determine the extent of psoriasis coverage in a patient. BSA was calculated in the pivotal trials with the 1% rule, in which the subject’s flat palm represents approximately 1% of the total BSA. The subject or investigators then use the flat palm as the unit to estimate the percentage of BSA affected by psoriasis.35 The BSA calculation in the pivotal trials did not include areas of the face, scalp, palms, soles, axillae, and other intertriginous areas.10,11 It is generally accepted that BSA involvement of 3% or less is considered low, greater than 3% to 10% is considered medium, and greater than 10% is considered high.36
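
The 1% rule and the BSA categories described above can be sketched as follows. This is an illustrative sketch only: the function names and the percent_per_palm parameter are hypothetical, and the default of 1.0 simply encodes the 1% rule.

```python
def bsa_percent_from_palms(palm_count: float, percent_per_palm: float = 1.0) -> float:
    """Estimate percentage BSA affected as palms covered times percent per palm.

    percent_per_palm defaults to 1.0 (the 1% rule); it is exposed as a
    hypothetical parameter so that other palm sizes can be explored.
    """
    if palm_count < 0:
        raise ValueError("palm_count cannot be negative")
    return palm_count * percent_per_palm


def bsa_category(bsa_percent: float) -> str:
    """Classify BSA involvement as low (3% or less), medium, or high (> 10%)."""
    if not 0 <= bsa_percent <= 100:
        raise ValueError("BSA percentage must be between 0 and 100")
    if bsa_percent <= 3:
        return "low"
    if bsa_percent <= 10:
        return "medium"
    return "high"


if __name__ == "__main__":
    estimate = bsa_percent_from_palms(7.5)  # hypothetical: 7.5 palms of coverage
    print(estimate, bsa_category(estimate))
```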

Validity

Evaluation of validity for BSA as an outcome is not relevant, since BSA assessment is not performed to measure disease severity but is instead used as a quantitative measure of the BSA covered by psoriasis.

Reliability

The reliability of evaluating BSA affected by psoriasis has been assessed in several studies.36,49-52,67 Inter-rater reliability was evaluated in two separate studies. In the first, Reolid et al. found high inter-rater reliability, with an ICC of 0.91, when the BSA of 56 patients was evaluated by two dermatologists.49 In the second, Chandran et al. found moderate inter-rater agreement, with an ICC of 0.47, when percentage BSA affected by psoriasis was evaluated for 20 patients by 19 different doctors. When these data were analyzed excluding rheumatologists (who are less trained to determine BSA), the inter-rater reliability ICC for dermatologists only was 0.96.50

Test-retest reliability for use of the 1% rule to determine BSA affected by psoriasis was evaluated in two separate studies. In the first, Reolid et al. demonstrated high test-retest reliability for BSA determination, with an ICC of 0.98, when 56 patients were evaluated two days apart.49 In the second, Bozek et al. found very good test-retest reliability of BSA, with an ICC of 0.96.52

A systematic review conducted by Puzenat et al. found that the BSA displayed an acceptable amount of intra-rater variability (coefficient of variation < 10%); however, the inter-rater variability was high (coefficient of variation > 30%).51 High inter-rater variability (coefficient of variation = 57.1) was also found in a study conducted by Bozek et al.52
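
For readers unfamiliar with the coefficient of variation, it expresses the SD of a set of estimates as a percentage of their mean. The sketch below is illustrative only: the function name and the rater estimates are hypothetical, and the use of the sample SD is an assumption, because the cited studies' exact formula is not given in this appendix.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(estimates: Sequence[float]) -> float:
    """Return the coefficient of variation as a percentage (SD / mean x 100).

    Uses the sample standard deviation, which is an assumption.
    """
    if len(estimates) < 2:
        raise ValueError("at least two estimates are needed")
    average = mean(estimates)
    if average == 0:
        raise ValueError("the mean must be non-zero")
    return 100 * stdev(estimates) / average


if __name__ == "__main__":
    # Hypothetical BSA estimates (%) for one patient from three raters.
    print(coefficient_of_variation([8, 10, 12]))
```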

Clinical Relevance

There is no literature pertaining to the MID of BSA affected by psoriasis. The clinical expert consulted in this review indicated that a one-third reduction in BSA affected by psoriasis after treatment is a clinically important difference for patients.

Limitations

Limitations of using the 1% rule to estimate BSA affected by psoriasis are the high inter-rater variability, data showing that the 1% rule is inaccurate, the lack of MID data, and the inability to correlate BSA with disease severity. The high inter-rater variability can be explained by data from a meta-analysis performed by Rhodes et al., which found that the palm represents 0.9% of BSA in adult men and 0.85% of BSA in adult women. Moreover, they found that body mass index and ethnic origin influence these values.68

The clinical expert consulted by CADTH for this review noted that it is inaccurate to use BSA as the sole way to determine disease severity. If the location of psoriasis significantly affects HRQoL (e.g., the centre of the face, or the soles of the feet such that the patient cannot walk), a patient may be classified as having more severe psoriasis even if the total BSA affected is very low. Moreover, if a patient’s psoriatic lesions improve through reduced thickness, redness, or scaling, this may not necessarily be reflected in a change in BSA.

Copyright © 2020 Canadian Agency for Drugs and Technologies in Health.

The copyright and other intellectual property rights in this document are owned by CADTH and its licensors. These rights are protected by the Canadian Copyright Act and other national and international laws and agreements. Users are permitted to make copies of this document for non-commercial purposes only, provided it is not modified when reproduced and appropriate credit is given to CADTH and its licensors.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK567523
