The validity, reliability, and responsiveness of each outcome measure were summarized and evaluated. Interpretation of the reliability and validity metrics was based on the following criteria:
Dermatology Life Quality Index
The DLQI is a widely used dermatology-specific HRQoL instrument that assesses the impact of skin disease.33 It is a 10-item questionnaire that covers six domains over a one-week recall period: symptoms and feelings, daily activities, leisure, work and school, personal relationships, and treatment. Each item is scored on a four-point Likert scale: 0 (not at all affected/not relevant), 1 (a little affected), 2 (a lot affected), and 3 (very much affected). The overall DLQI score is a numeric score from 0 to 30, with lower scores indicating better HRQoL. At least 80% of the questions must be answered for a score to be reported.33,34 The final numeric score translates to the effect of the patient’s disease on their quality of life, where 0 to 1 equals no effect, 2 to 5 equals small effect, 6 to 10 equals moderate effect, 11 to 20 equals very large effect, and 21 to 30 equals extremely large effect. The DLQI can be completed within a few minutes, making it a very time-efficient scoring system for use in clinical settings,53 although the clinical expert consulted by CADTH for this review stated that in clinical practice, HRQoL is assessed through a discussion with the patient.
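The scoring and banding rules described above can be illustrated with a short Python sketch. This is a minimal illustration rather than the official scoring algorithm: the names `dlqi_total` and `dlqi_band` are hypothetical, unanswered items are represented as `None`, and the treatment of unanswered items as contributing 0 to the total is an assumption that is not stated in the text.

```python
from typing import Optional, Sequence

# Score bands as described in the text: (lower bound, upper bound, label).
DLQI_BANDS = [
    (0, 1, "no effect"),
    (2, 5, "small effect"),
    (6, 10, "moderate effect"),
    (11, 20, "very large effect"),
    (21, 30, "extremely large effect"),
]


def dlqi_total(items: Sequence[Optional[int]]) -> Optional[int]:
    """Sum the ten item scores (0 to 3 each).

    Returns None when fewer than 80% of the items were answered,
    because a score is not reported in that case.
    """
    if len(items) != 10:
        raise ValueError("the DLQI has exactly 10 items")
    answered = [score for score in items if score is not None]
    if any(score not in (0, 1, 2, 3) for score in answered):
        raise ValueError("each answered item must be scored 0, 1, 2, or 3")
    if len(answered) < 8:  # fewer than 80% of the 10 items
        return None
    # Assumption: unanswered items contribute 0 to the total.
    return sum(answered)


def dlqi_band(total: int) -> str:
    """Map a total DLQI score (0 to 30) to its descriptive band."""
    for low, high, label in DLQI_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total must be between 0 and 30")
```

For example, a patient who scores 2 on six items and 0 on the remaining four has a total of 12, which falls in the "very large effect" band.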
Validity
The DLQI was developed in 1994 and has since been validated in many studies.33,34,38,53–55 Construct validity of the DLQI was based on the correlation of the instrument with generic, dermatologic, or disease-specific instruments in more than 37 separate studies.38 Shikiar et al. reported a good correlation (correlation coefficient [r] > 0.61) with three different itch measures in a study combining results from trials in moderate-to-severe plaque psoriasis (N = 1,095).54 A later study by Shikiar et al. demonstrated excellent correlation between the DLQI and generic HRQoL instruments in a population of 147 patients with moderate-to-severe plaque psoriasis randomized to adalimumab versus placebo; the DLQI correlated most strongly with the bodily pain (r = 0.61) and social functioning (r = 0.68) domains of the Short Form (36) Health Survey, as well as with the overall EuroQol 5-Dimensions questionnaire index score (r = 0.71).34
Reliability
In the original validation study by Finlay and Khan, the reliability of the DLQI was assessed in 53 patients with a variety of skin diseases who completed the questionnaire twice, 7 to 10 days apart.33 The test-retest reliability correlation coefficients, obtained using the Spearman rank correlation test, were high for both the overall score (0.99) and individual questions (0.95 to 0.98).33 The good test-retest reliability of the DLQI was also confirmed in a systematic review by Basra et al., with eight of 12 international studies reporting correlation coefficients greater than 0.56 and up to 0.99.38 The same review reported good internal consistency reliability of the DLQI, based on 22 international studies with Cronbach alpha coefficients ranging from 0.75 to 0.92.38
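For readers unfamiliar with the internal consistency statistic cited above, the following Python sketch computes Cronbach's alpha from item-level responses using the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the example data are hypothetical and are not drawn from any of the cited studies; the use of the sample variance (n - 1 denominator) is an assumption.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores.

    Each inner sequence holds one respondent's scores on the k items.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must have a score for every item")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three respondents answering three items identically,
# which gives the maximum possible internal consistency.
example = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together across respondents, which is the property summarized by the reported range of 0.75 to 0.92.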
Responsiveness
Responsiveness to change in the clinical status of a patient was measured by comparing DLQI data with PASI and PGA scores.34 The DLQI demonstrated equal responsiveness to the PASI and PGA scores, with correlation coefficients of r = 0.69 and r = 0.71, respectively, which was not achieved by the generic tools, the EuroQol 5-Dimensions questionnaire (r = 0.44) and the Short Form (36) Health Survey (r = 0.44).34 In a second study assessing responsiveness, Shikiar et al. contrasted change in DLQI scores in patients who were defined as clinical responders (achievement of PASI 75 response by week 12) with those characterized as nonresponders (< PASI 50). DLQI scores in responders improved by 12.17 points, compared with 1.77 points in the nonresponders subgroup. The difference was statistically significant (t = 9.0, effect size = 0.40, P < 0.0001).34 Additional studies demonstrating the responsiveness of the DLQI were also identified in the systematic review by Basra et al.38,55
MID
Shikiar et al. estimated the MID of the DLQI in patients with moderate-to-severe plaque psoriasis (N = 147) using three anchor-based methods; MID-1 was based on scores from near-responders (PASI improvement of 25% to 49%), MID-2 was based on partial responders (PASI improvement 50% to 74%), and MID-3 corresponded to the difference between nonresponders and minimal responders for the PGA score. The authors also estimated the MID using one-half of the SD of baseline scores.34 Estimates ranged from 2.2 to 6.9.34 It should be noted that these approaches lack patient-based anchors and therefore do not necessarily identify the minimal difference that a patient would consider important. Another study in patients with moderate-to-severe plaque psoriasis (N = 147) treated with adalimumab reported an MID of 3.2.47 In the most recent systematic review of RCTs in patients with chronic psoriasis, the DLQI MID was reported to be a score change of five.48
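The distribution-based estimate mentioned above (one-half of the SD of baseline scores) can be sketched in Python as follows. The function name `half_sd_mid` and the baseline scores are hypothetical and are not data from the cited study; the use of the sample SD (n - 1 denominator) is an assumption, because the text does not state which SD estimator was used.

```python
from statistics import stdev
from typing import Sequence


def half_sd_mid(baseline_scores: Sequence[float]) -> float:
    """Distribution-based MID estimate: one-half of the baseline SD.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(baseline_scores) < 2:
        raise ValueError("at least two baseline scores are required")
    return 0.5 * stdev(baseline_scores)


# Hypothetical baseline DLQI scores for five patients.
hypothetical_baseline = [4, 8, 12, 16, 20]
mid_estimate = half_sd_mid(hypothetical_baseline)  # roughly 3.16 points
```

Like the anchor-based estimates, this approach does not draw on any patient-reported anchor, so it reflects the spread of scores rather than what patients consider important.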
Limitations
The DLQI was the first dermatology-specific tool to evaluate skin-related HRQoL and was originally developed for use in routine practice.33 While the tool focuses on the patient’s daily functioning, it has been criticized for not fully capturing emotional and mental states.56 Therefore, the DLQI may lack conceptual validity with respect to the psychological consequences of living with psoriasis.
Investigator’s Global Assessment
The IGA is a subjective measurement of the clinical signs of psoriasis, in which psoriatic lesions are graded for erythema, induration, and scaling. Various IGAs have been used in psoriasis with different descriptions and scores, with the most common IGA versions using five- or six-point scales.57,58 There are two types of IGA: a static form, which captures the investigator’s assessment of the disease at a given point in time, and a dynamic form, in which the investigator evaluates the level of improvement or deterioration from baseline.57,59 The static form of the IGA is preferred over the dynamic form because it does not rely on the investigator’s recall of the patient’s disease severity observed at baseline or a previous visit. In the two studies under review, a five-point, static version of the IGA was used.10,11 To generate the IGA score, psoriatic lesions are graded for erythema, induration, and scaling on a scale of 0 to 4 (Table 56); the grades are then averaged across all lesions to obtain a single estimate of the patient’s overall severity of disease at a given point in time. The three items are given equal weighting: the sum of the three item scores is divided by three to give a final IGA score from 0 to 4.
Table 56: Investigator’s Global Assessment Scale
Grade | Score | Erythema | Induration | Scaling |
---|---|---|---|---|
Clear | 0 | No evidence of erythema | No evidence of plaque elevation above normal skin level | No evidence of scaling |
Almost clear | 1 | Faint pink or light red erythema on most plaques | Slight or barely perceptible elevation of plaques above normal skin level | Some plaques with fine scales |
Mild | 2 | Most to all plaques are pink/light red in colour | Some plaques have definite elevation above normal skin level, typically with edges that are indistinct and sloped on some of the plaques | Most to all plaques have some fine scales but are not fully covered; some plaques are completely covered with fine scale |
Moderate | 3 | Most to all plaques are bright red, some plaques may be dark red in colour | Definite elevation of most to all plaques, rounded or sloped edges on most of the plaques | Some plaques are at least partially covered with a coarse scale, most to all plaques are nearly covered with fine or coarse scale |
Severe | 4 | Most or all plaques are bright, dark, or dusky red | Almost all plaques are raised and well-demarcated; sharp edges on virtually all plaques | Most to all plaques are covered with coarse, thick scales |
Source: Clinical Study Reports for Study 201,41 Study 301,10 Study 302,11 and Study 303.40
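The equal-weighting calculation described for the IGA (the sum of the erythema, induration, and scaling scores divided by three) can be sketched in Python. The function name `iga_score` is hypothetical. The text does not state how a fractional mean is converted to a whole-number grade, so this sketch returns the unrounded mean and makes no assumption about rounding.

```python
def iga_score(erythema: int, induration: int, scaling: int) -> float:
    """Equal-weighted mean of the three sign scores, each graded 0 to 4.

    Returns the unrounded mean; the rounding convention used to assign a
    final grade is not specified in the text.
    """
    signs = {"erythema": erythema, "induration": induration, "scaling": scaling}
    for name, value in signs.items():
        if value not in range(5):
            raise ValueError(f"{name} must be an integer from 0 to 4")
    return (erythema + induration + scaling) / 3


# Hypothetical patient: moderate erythema and induration, mild scaling.
score = iga_score(erythema=3, induration=3, scaling=2)  # about 2.67
```

Because the three signs are weighted equally, a one-grade change in any single sign shifts the mean by one-third of a point.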
Although the five-point static IGA scale was used in the pivotal trials to evaluate treatment efficacy, no studies evaluating the validity, reliability, or responsiveness of this five-point scale were identified. However, the six-point IGA scale has been evaluated in terms of its validity, reliability, responsiveness to change, and MID. Because the IGA is used to determine treatment efficacy, a primary outcome of the pivotal trials, the literature pertaining to the six-point scale is summarized in the following sections. However, any conclusions about the validity of outcomes are limited because a different scale was used in the trials.
The only difference between the five-point and the six-point IGA scale is the inclusion of a very severe category in the six-point IGA scale. In this category, the patient is given a score of 5 and is described as having “very severe thickening with hard edges; dark deep red coloration; very severe/very coarse scaling covering all lesions.”60 However, very few patients fell into the very severe category in previous studies that used the six-point IGA scale.60
The PGA and the IGA are the same scales; the only difference is that the term IGA is used by investigators in clinical trials, whereas the term PGA is more commonly used by clinicians in practice.60 The two terms are often used interchangeably in the literature validating the scales.60 There is an abundance of evidence validating the six-point PGA scale, and limited high-quality evidence validating the six-point IGA scale. Because the five-point IGA scale is important in determining the primary outcomes of the pivotal trials, CADTH has summarized the available data for the six-point PGA scale that were gathered in a clinical trial setting.
Validity
The most recent study assessing the validity of the PGA evaluated data from four phase III clinical studies of tofacitinib in patients with psoriasis (N = 3,641).61 Confirmatory factor analysis used to test the fit of the PGA measurement model demonstrated that equal weighting of the three items (erythema, induration, and scaling) was appropriate, as indicated by Bentler’s Comparative Fit Index values greater than 0.98 (acceptable fit defined as > 0.9) and standardized path coefficients all above the threshold of 0.4. Construct validity was assessed using a known-group approach, measuring the relationship between PGA and PASI through a repeated measures model. A positive relationship between the PGA and PASI scores was observed which was stable and replicable across the four studies, indicating that the PGA could discriminate between different degrees of disease severity.61
Simpson et al. evaluated the construct and content validity of the PGA by its association with the DLQI.62 The correlation between PGA and DLQI was moderately positive (r = 0.29 to 0.43) at post-therapy time points. As with the PASI instrument, the authors found the scaling score to be minimally and inconsistently associated with the DLQI score, while erythema and induration were positively correlated with the DLQI score. In contrast to Callis Duffin et al.,61 Simpson and colleagues concluded that equal weighting of the three items would not accurately capture the varying degrees to which these factors affect the patient’s rating of quality of life.62
Convergent and divergent validity were assessed by determining the correlation of the PGA with three additional outcome measures: the PASI, patient global assessment, and DLQI.61 Pearson correlation coefficients between PGA and the three scales ranged from 0.4 to 0.79, with the strongest correlation found with PASI. These findings were consistent with a previous psychometric validation study of the PGA in a single phase III trial by Cappelleri et al.63 and in several other studies.51,58,64,65
Reliability
Callis Duffin et al. evaluated consistency of PGA measurements between screening and baseline visits, when no change in terms of disease severity was expected.61 The intraclass correlation coefficient (ICC) value for the pooled data was 0.70, suggesting an acceptable test-retest reliability over a stable period. The same study assessed internal consistency reliability demonstrating that the scoring items (erythema, induration, and scaling) were highly consistent with each other (Cronbach coefficient alpha ≥ 0.90) at the primary assessment points in all four trials. The internal consistency reliability was less convincing (Cronbach coefficient alpha 0.50 to 0.63) for the values observed at baseline, likely a result of the specific inclusion criteria of the trials.61
Responsiveness
No evidence regarding the responsiveness of the five-point or six-point IGA or PGA was identified from the literature at this time.
Clinical Relevance
There are no studies evaluating the MID of the PGA at this time. However, it is generally accepted that a clinically meaningful score in the PGA is a score of clear or minimal.12 Furthermore, some trials define efficacy as a two-point reduction in the total PGA score.59 The two trials under review, Study 301 and Study 302, have defined a score of clear or almost clear (score of 0 or 1, respectively) with a minimum of a two-point difference as a clinically important threshold.
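The threshold used in Study 301 and Study 302 can be expressed as a simple rule, sketched here in Python. The function name `iga_treatment_success` is hypothetical, and the sketch assumes that the "two-point difference" refers to an improvement of at least two points from the baseline score, which is an interpretation of the text rather than a definition taken from the study protocols.

```python
def iga_treatment_success(baseline: int, follow_up: int) -> bool:
    """True when the follow-up score is clear (0) or almost clear (1)
    and has improved by at least two points from baseline.

    Assumption: "two-point difference" means baseline minus follow-up >= 2.
    """
    for value in (baseline, follow_up):
        if value not in range(5):
            raise ValueError("scores must be integers from 0 to 4")
    return follow_up in (0, 1) and (baseline - follow_up) >= 2
```

Under this reading, a patient moving from moderate (3) to almost clear (1) meets the threshold, whereas a patient moving from mild (2) to almost clear (1) does not, because the improvement is only one point.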
Strengths and Limitations
The PGA has been shown to be reliable based on test-retest data and internal consistency; however, inter-rater reliability can be poor because of variability between observers, especially untrained observers.12 Within a study, however, the PGA correlated well with the PASI and HRQoL measures.66 Furthermore, a systematic review by Robinson et al. including 30 RCTs of biologic drugs in psoriasis from 2001 to 2010 found that the PGA (scores of 0 or 1) correlated very tightly with the PASI 75 (r = 0.9157).59 However, because the PGA has many different scales and scoring variations, comparisons between studies are very difficult.12
Moreover, the PGA’s inability to measure the extent of psoriasis (i.e., amount of BSA affected), inability to discriminate small changes in severity, and lack of consideration for nonskin symptoms are further limitations.60 In addition, no MID has been established for psoriasis at this time.
Psoriasis Signs
The signs of psoriasis (erythema, plaque elevation, and scaling) were assessed for the selected target lesion using a subjective scale outlined in Table 57. According to the sponsor, the results from the psoriasis signs scale permit detection of changes specific to the patient’s selected target lesion. The sponsor states that many similar scales are widely used in the therapeutic area of psoriasis because erythema, plaque elevation, and scaling are recognized as basic characteristics of psoriasis lesions.
Table 57: Assessment of Psoriasis Signs Scale
Sign | Score | Grade | Description |
---|---|---|---|
Erythema | 0 | None | No erythema |
Erythema | 1 | Minimum | Pink discoloration, minimal erythema |
Erythema | 2 | Mild | Most or all plaques are light red to red in colour |
Erythema | 3 | Moderate | Most or all plaques are bright red or dark in colour |
Erythema | 4 | Severe | Most plaques are dusky red with purple hue |
Plaque elevation | 0 | None | No evidence of elevation above the normal skin level |
Plaque elevation | 1 | Minimum | Slight, just discernible elevation above normal skin level |
Plaque elevation | 2 | Mild | Some plaques show definite elevation with indistinct edges |
Plaque elevation | 3 | Moderate | Most plaques have definite elevation with distinct edges that are rounded or sloped |
Plaque elevation | 4 | Severe | Almost all plaques are raised above normal skin level with sharp edges |
Scaling | 0 | None | No scales on very few plaques |
Scaling | 1 | Minimum | Occasional fine scales hardly noticeable |
Scaling | 2 | Mild | Most plaques have fine scales |
Scaling | 3 | Moderate | Some plaques have coarse scales while most plaques have fine scales |
Scaling | 4 | Severe | Most plaques are covered by thick coarse scales |
Source: Clinical Study Reports for Study 201,41 Study 301,10 and Study 302.11
No information on this scale was identified in the independent literature search conducted by CADTH, including evidence on the construct of the grading or on its validity, reliability, responsiveness to change, or clinical relevance. Although no MID was identified in the literature, the clinical expert consulted for this review indicated that a decrease of two points in any of the signs of psoriasis would be considered a clinically important outcome for patients.
Limitations
Limitations for use of this scale are the lack of evidence on its validity, reliability, responsiveness to change, and clinical relevance.
Body Surface Area
BSA affected by psoriasis is used to determine the extent of psoriasis coverage in a patient. BSA was calculated in the pivotal trials with the 1% rule, in which the subject’s flat palm represents approximately 1% of the total BSA. The subject or investigators then use their flat palm to estimate the percentage of BSA affected by psoriasis.35 The BSA calculation in the pivotal trials did not include areas of the face, scalp, palms, soles, axillae, and other intertriginous areas.10,11 It is generally accepted that an affected BSA of 3% or less is considered low, an affected BSA of more than 3% to 10% is considered medium, and an affected BSA of greater than 10% is considered high.36
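The BSA categories described above can be sketched in Python as a simple classification. The function name `bsa_category` is hypothetical, and the handling of the boundaries (exactly 3% classed as low and exactly 10% classed as medium) is one reading of the thresholds rather than a definition confirmed by the cited source.

```python
def bsa_category(percent_affected: float) -> str:
    """Classify the percentage of BSA affected as low, medium, or high.

    Assumption: 3% or less is low, more than 3% up to 10% is medium,
    and more than 10% is high.
    """
    if not 0 <= percent_affected <= 100:
        raise ValueError("percent_affected must be between 0 and 100")
    if percent_affected <= 3:
        return "low"
    if percent_affected <= 10:
        return "medium"
    return "high"
```

Under the 1% rule, a patient whose lesions cover an area equal to about five of their own palms would have an estimated 5% of BSA affected, which this sketch classifies as medium.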
Validity
Evaluation of validity for BSA as an outcome is not relevant, because BSA assessment is not performed to measure disease severity but is instead used as a quantitative measure of the BSA covered by psoriasis.
Reliability
The reliability of evaluating BSA affected by psoriasis has been assessed in several studies.36,49–52,67 Inter-rater reliability was evaluated in two separate studies. In the first, Reolid et al. found high inter-rater reliability, with an ICC of 0.91, when the BSA of 56 patients was evaluated by two dermatologists.49 In the second, moderate inter-rater agreement was found when percentage BSA affected by psoriasis was evaluated for 20 patients by 19 different doctors, with an ICC of 0.47. When these data were analyzed excluding rheumatologists (who have less training in determining BSA), the inter-rater reliability ICC for dermatologists only was 0.96.50
Test-retest reliability for use of the 1% rule to determine BSA affected by psoriasis was evaluated in two separate studies. In the first, Reolid et al. demonstrated high test-retest reliability for BSA determination when 56 patients were evaluated two days apart, with an ICC of 0.98.49 In the second, Bozek et al. found very good test-retest reliability of BSA, with an ICC of 0.96.52
A systematic review conducted by Puzenat et al. found that the BSA displayed an acceptable amount of intra-rater variability (coefficient of variation < 10%); however, the inter-rater variability was high (coefficient of variation > 30%).51 High inter-rater variability (coefficient of variation = 57.1) was also found in a study conducted by Bozek et al.52
Clinical Relevance
There is no literature pertaining to the MID of BSA affected by psoriasis. The clinical expert consulted in this review indicated that a one-third reduction in BSA affected by psoriasis after treatment is a clinically important difference for patients.
Limitations
Limitations of using the 1% rule to estimate BSA affected by psoriasis are the high inter-rater variability, data showing that the 1% rule is inaccurate, the lack of MID data, and the inability to correlate BSA with disease severity. The high inter-rater variability can be explained by data from a meta-analysis performed by Rhodes et al., which found that the palm represents 0.9% of BSA in adult men and 0.85% of BSA in adult women. Moreover, they found that body mass index and ethnic origin influence these values.68
The clinical expert consulted by CADTH for this review indicated that it is inaccurate to use BSA as the sole means of determining disease severity. If the location of psoriasis significantly affects HRQoL (e.g., the centre of the face, or the soles of the feet such that the patient cannot walk), a patient may be classified as having more severe psoriasis even if the total BSA affected is very low. Moreover, if a patient’s psoriatic lesions improve through reduced thickness, redness, or scaling, this may not necessarily be reflected in a change in BSA.
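The size of the discrepancy between the 1% rule and the palm percentages reported by Rhodes et al. can be illustrated with a short Python sketch. The names `PALM_PERCENT` and `bsa_from_palms` and the palm count in the example are hypothetical; only the three percentages are taken from the text.

```python
# Percentage of total BSA represented by one palm: the 1% rule versus
# the meta-analysis estimates for adult men and adult women.
PALM_PERCENT = {
    "one_percent_rule": 1.0,
    "adult_men": 0.9,
    "adult_women": 0.85,
}


def bsa_from_palms(palm_count: float, palm_percent: float) -> float:
    """Estimated percentage of BSA affected, given the number of
    palm-sized areas involved and the percentage one palm represents."""
    if palm_count < 0:
        raise ValueError("palm_count cannot be negative")
    return palm_count * palm_percent


# Hypothetical patient with lesions covering 12 palm-sized areas.
estimates = {
    group: bsa_from_palms(12, percent)
    for group, percent in PALM_PERCENT.items()
}
```

For 12 palm-sized areas, the 1% rule gives 12% of BSA, compared with approximately 10.8% using the estimate for adult men and 10.2% using the estimate for adult women, so the 1% rule overestimates the affected area in both groups.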