Assessment

National Clinical Guideline Centre (UK)

6. Assessment

When a new drug is started a patient may experience adverse symptoms for a variety of reasons. These may be related to the underlying disorder for which the patient was being treated, may be incidental and unrelated to the drug or disease, or they may be caused by the drug itself.

For a known non-immunologically mediated adverse reaction (for example, nausea or abdominal discomfort), the decision on whether to continue the drug is taken after discussion with the patient, taking into account the severity of the reaction and the length of the remaining prescription course. If the patient has suffered a hypersensitivity reaction, however, the drug will almost invariably be stopped and if necessary an alternative drug sought. There can be considerable overlap between symptoms recognised from the adverse reaction profile of the drug and those resulting from hypersensitivity reaction. Each drug has a specific pattern of expected non-allergic symptoms and even immunologically mediated symptoms can follow a familiar pattern seen in previous patients. A correct diagnosis differentiating an allergic from a non-allergic reaction at the time of presentation should therefore allow safe future prescription and avoidance of the specific drug or drug class. Detailed documentation of the adverse reaction will also allow a more accurate specialist assessment if the patient requires the same or similar drug in future.

6.1. Review question: What is the clinical and cost effectiveness of clinical probability scores or algorithms in identifying or excluding drug allergies?

For full details see review protocol in Appendix C.

Table 6. Characteristics of review question

6.2. Clinical evidence

6.2.1. Algorithms

We searched the literature for systematic reviews or any other study design that aimed to identify a set of signs and symptoms, usually in the form of a questionnaire or checklist (that is, an algorithm) to ascertain whether a person has a drug allergy. One systematic review (Agbabiaka et al. 2008³) was identified, as were 7 additional algorithm studies: Bousquet et al. 2009,¹⁸ Caimmi et al. 2012,²³ Du et al. 2013,³⁸ Gallagher et al. 2011⁵² (also known as the Liverpool algorithm), Gonzalez et al. 1992⁵⁶ (which was missing from Agbabiaka's systematic review), Son et al. 2011¹⁵⁴ and Trewin et al. 1991¹⁶¹ (also missing from Agbabiaka's systematic review). Each of these studies describes the development of an algorithm in order to evaluate drug allergies. A further study was identified which updated 1 of the included algorithms (Arimone et al. 2013⁵ updating the French Begaud et al. 1985¹² algorithm). This is added to the reference in Table 7.

The systematic review of algorithms by Agbabiaka et al. 2008³ is considered to be at a moderate risk of bias according to the NICE systematic review checklist (since the quality of the included algorithms and probability scores was reported in a narrative manner and criteria for quality assessment were not explicitly described), but it considered algorithms for both adverse drug reactions and drug allergies. The authors included 26 algorithms in the systematic review. Six of these algorithms⁶⁴,⁷⁷,⁸⁰,¹⁰⁶,¹⁵⁸,¹⁷⁴ were excluded from this review on the basis that they focused on adverse drug reactions (ADR) alone without the drug allergy being recorded as a subset of ADR.

The working definition of ‘algorithm’ from the identified systematic review was, “…a set of specific questions with associated scores for calculating the likelihood of a cause–effect relationship”. The authors extracted criteria in the assessment of adverse drug reactions for 26 algorithms and probability scores and these are shown in Table 7 below for each of the included algorithms. The 12 categories for assessment provide a starting point for this review but were not explained fully. Therefore it was necessary in some cases to impute the meaning of individual categories.

The following categories were used (with brief explanations of how we interpreted them):

Time to onset or temporal sequence: the time elapsed between taking the medication and the development of a reaction.
Previous experience or information on drug: a previous experience with the drug or a previous reaction to the drug.
Alternative aetiological candidates: ruling out other reasons for the reaction to the drug.
Drug level or evidence of overdose: whether the correct dose was used.
Challenge: assessment of what happens when the drug is introduced.
Dechallenge: assessment of what happens when the person is taken off the drug.
Rechallenge: assessment of what happens when the drug is reintroduced.
Response pattern to drug (symptoms): this point was unclear in the systematic review. We interpreted it to mean the clinical manifestation of the signs and symptoms that would be specific to the drug under investigation.
Confirmed by laboratory evidence: whether laboratory tests have already been carried out.
Concomitant drugs: whether there could be a potential drug interaction.
Background epidemiological or clinical information: for this category we focused on background epidemiology since clinical information was not clearly defined in the review.
Characteristics or mechanisms of adverse drug reaction: how this reaction is related to the drug under investigation and whether the reaction is plausible in light of the drug's mechanisms.
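The working definition of an algorithm quoted earlier, a set of specific questions with associated scores, can be illustrated with a minimal Python sketch. The questions, weights and likelihood bands below are hypothetical and are not taken from any of the reviewed algorithms; only the criterion names loosely follow the categories listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """One scored question; every weight here is hypothetical."""
    criterion: str
    yes: int
    no: int
    unknown: int = 0


# Hypothetical weights, for illustration only (not from any published algorithm).
QUESTIONS = (
    Question("time_to_onset_plausible", yes=2, no=-1),
    Question("alternative_aetiology_present", yes=-1, no=1),
    Question("improved_on_dechallenge", yes=1, no=0),
    Question("recurred_on_rechallenge", yes=2, no=-1),
)


def total_score(answers: dict[str, str]) -> int:
    """Sum the scores for answers of 'yes', 'no' or 'unknown' (the default)."""
    total = 0
    for q in QUESTIONS:
        answer = answers.get(q.criterion, "unknown")
        if answer not in ("yes", "no", "unknown"):
            raise ValueError(f"unexpected answer {answer!r} for {q.criterion}")
        total += getattr(q, answer)
    return total


def likelihood_band(score: int) -> str:
    """Map a total score to a band; the cut-offs are assumptions."""
    if score >= 5:
        return "probable"
    if score >= 2:
        return "possible"
    return "unlikely"


# Hypothetical case: rechallenge was not attempted, so it scores as 'unknown'.
case = {
    "time_to_onset_plausible": "yes",
    "alternative_aetiology_present": "no",
    "improved_on_dechallenge": "yes",
}
score = total_score(case)
print(score, likelihood_band(score))
```

Because each algorithm chooses its own questions, weights and cut-offs, two such tools can place the same case in different bands, which is the disagreement the comparative studies measure.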

We also searched the literature for systematic reviews or any other study design that aimed to identify a set of signs and symptoms in the form of a probability score to ascertain whether a person has a drug allergy. The systematic review by Agbabiaka et al.³ reviewed 4 probabilistic or Bayesian approaches to assessment of drug allergy.⁷⁰,⁹²,⁹⁴,¹⁰⁹ One further study was identified (Theophile et al. 2013¹⁶⁰). This additional study also included a comparison with other algorithms.

Furthermore, Agbabiaka et al. 2008³ reviewed comparisons of algorithms. These are studies in which people with suspected drug allergies are assessed with more than 1 algorithm and the level of agreement (that is, congruency) between the assessments is then calculated. Table 10 summarises results of 6 comparative studies.¹³,²⁰,⁷⁶,¹¹³,¹³⁴,¹⁶⁰ A further comparison study was added in the update of the systematic review.¹⁶⁰

Agbabiaka et al. 2008³ included a narrative analysis of 26 algorithms, but there was no explicit quality assessment of individual algorithms (they were appraised narratively). In the current review an explicit list of criteria was drawn up to assess the quality of the 6 additional algorithms that were identified from the search. The criteria in this checklist are given in section 3.3.6.4.

Using the Agbabiaka et al. 2008³ format shown in Table 7, the 12 criteria were extracted for each of the 7 additional studies¹⁸,²³,³⁸,⁵²,⁵⁶,¹⁵⁴,¹⁶¹ included in the current review.

Table 7 below reproduces an amended version of the summary that is provided in Agbabiaka et al. 2008.³ Studies which did not include drug allergy in the adverse drug reaction algorithms were excluded. Table 8 uses the same criteria to assess the additional algorithms identified in our search (with comments and quality assessment according to our checklist in the final 2 columns).

Table 11 summarises the frequency of the criteria across algorithms. Please also see the study selection flow chart in Appendix E, study evidence tables in Appendix H and exclusion list in Appendix K.

Table 7. Criteria to assess the association between a reaction and a drug: studies included in the systematic review (adapted from Agbabiaka et al. 2008)

Table 8. Criteria to assess the association between a reaction and a drug: studies not included in the systematic review (adapted from Agbabiaka et al. 2008 with additional notes and quality ratings in the final 2 columns)

6.2.2. Probability scores

Bayesian methods have been proposed to provide a formal inferential framework for causality in the assessment of drug allergy and adverse drug reactions. The approach is based on calculating a ratio (the posterior odds) between 2 probabilities, both of which are conditional on the same background and case information: the probability that a given drug caused an adverse event versus the probability that an alternative cause is responsible.
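The posterior odds described above can be written in the odds form of Bayes' theorem: prior odds multiplied by one likelihood ratio per piece of case information. The Python sketch below illustrates that arithmetic only. The function names and numerical inputs are hypothetical, and treating the case findings as independent likelihood ratios is a simplifying assumption rather than the method of any reviewed study.

```python
from math import prod


def posterior_odds(prior_odds: float, likelihood_ratios: list[float]) -> float:
    """Odds form of Bayes' theorem: posterior odds = prior odds x product of LRs.

    prior_odds: odds of drug causation versus an alternative cause, based on
        background information alone (hypothetical input).
    likelihood_ratios: one ratio per case finding, assumed independent here.
    """
    if prior_odds < 0 or any(lr < 0 for lr in likelihood_ratios):
        raise ValueError("odds and likelihood ratios must be non-negative")
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of drug causation into a probability."""
    return odds / (1.0 + odds)


# Hypothetical example: prior odds of 1:4, then two findings that each
# favour drug causation (likelihood ratios of 4 and 2).
odds = posterior_odds(0.25, [4.0, 2.0])
print(odds, round(odds_to_probability(odds), 2))
```

In practice the difficulty lies in estimating each input from case and epidemiological data, which is where the time and resource costs of the method arise.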

Despite the benefits of repeatability, transparency, explicitness, completeness, balancing of case data and no arbitrary limiting of information on the assessment, this method of causation analysis can be time consuming and may require significant use of resources and complex calculations.

The same categories were used as those described for the algorithms.

Agbabiaka et al. 2008³ included a narrative analysis of the probabilistic and Bayesian approaches, but there was no explicit quality assessment of individual algorithms.

Table 9 below is adapted from the summary that is provided in Agbabiaka et al.³

Table 9. Probabilistic or Bayesian approaches to causation used

6.2.3. Comparative studies

The conclusion of the systematic review by Agbabiaka et al. 2008³ was that “…no single algorithm is accepted as the ‘gold standard,’ because of the shortcomings and disagreements that exist between them.” We have reviewed 6 studies¹³,²⁰,⁷⁶,¹¹³,¹³⁴,¹⁶⁰ which compare the most commonly used algorithms for drug allergy and provide kappa statistics as a measure of congruency. A summary of the statistical conclusions of the comparative studies is provided in Table 10 below.

Table 10. Studies comparing algorithms

6.2.4. Most commonly used algorithm criteria

For the current review we used the Agbabiaka et al. 2008³ findings for 20 algorithms which included drug allergy as part of the evaluation of ADR, the 5 probabilistic or Bayesian studies in Agbabiaka et al. 2008,³ and the 7 additional algorithms added into this review, to assess how frequently different causality criteria appeared across all of the algorithms (see Table 11). The assessment criteria were ranked as follows:

Table 11. Frequency causality criteria were used across 32 algorithms and probability scores (25 from the systematic review and 7 added in the current review)

The evidence shows that none of the criteria are used consistently in all of the algorithms. This includes ‘time to onset or temporal sequence’ and ‘response pattern to drug (clinical response)’ which were only used as assessment criteria in 24 (75%) of the 32 algorithms. Questions about drug challenge and ADR characteristics or mechanisms featured least frequently across algorithms, only occurring in 10 and 8 of 32 algorithms (31% and 25%), respectively.
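Each percentage in Table 11 is a criterion's count divided by the 32 algorithms and probability scores. The short Python sketch below reproduces that arithmetic from the counts reported in Table 11; the variable and function names are illustrative only.

```python
TOTAL = 32  # algorithms and probability scores in Table 11

# Counts as reported in Table 11.
COUNTS = {
    "Time to onset or temporal sequence": 24,
    "Response pattern to drug (clinical response)": 24,
    "Rechallenge": 22,
    "Alternative aetiological candidates": 17,
    "Confirmed by laboratory evidence": 16,
    "Drug level or evidence of overdose": 15,
    "Dechallenge": 14,
    "Background epidemiological or clinical information": 12,
    "Previous exposure or drug information": 12,
    "Concomitant drugs": 12,
    "Challenge": 10,
    "ADR characteristics or mechanism": 8,
}


def percent(n: int, total: int = TOTAL) -> int:
    """Whole-number percentage using integer arithmetic, rounding halves up."""
    return (200 * n + total) // (2 * total)


# Sort by descending count; sorted() is stable, so ties keep the table order.
for name, n in sorted(COUNTS.items(), key=lambda item: -item[1]):
    print(f"{name}: {n}/{TOTAL} ({percent(n)}%)")
```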

Agbabiaka et al. 2008³ also reviewed comparisons of algorithms which were updated here. These are studies in which people with suspected drug allergies are assessed with more than one algorithm and the level of agreement (that is, congruency) between the assessments is then calculated.

Congruency spanned the whole range from 0% to 100% agreement, with no agreement between the Begaud and Kramer or Jones algorithms in one study and 100% agreement between Kramer and Jones in the same study. Even the same comparison sometimes showed very different levels of agreement across studies (for example, comparisons of Kramer and Jones showed perfect agreement in one study and only moderate agreement, 67%, in another).
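The comparative studies report agreement both as a raw percentage and as a kappa statistic, which corrects the raw figure for agreement expected by chance. The Python sketch below shows how the two are computed for a pair of algorithms, using Cohen's kappa; the case ratings are hypothetical and the function names are illustrative.

```python
from collections import Counter


def percent_agreement(a: list[str], b: list[str]) -> float:
    """Proportion of cases given the same rating by both algorithms."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each algorithm's marginal rating frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical causality ratings of six cases by two algorithms.
algorithm_a = ["probable", "possible", "possible", "unlikely", "probable", "possible"]
algorithm_b = ["probable", "possible", "unlikely", "unlikely", "possible", "possible"]

print(round(percent_agreement(algorithm_a, algorithm_b), 2))
print(round(cohens_kappa(algorithm_a, algorithm_b), 2))
```

Because kappa discounts chance agreement, a pair of algorithms that agree on about two thirds of cases can still have a kappa below 0.5, so the two measures should be read together.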

6.3. Economic evidence

Published literature

No relevant economic evaluations were identified.

See also the economic article selection flow chart in Appendix F.

6.4. Evidence statements

Clinical

Assessment criteria: moderate quality evidence from 32 algorithms and probability scores (according to quality of the included systematic review and the quality of the additional algorithms) indicated no clear criteria that were used consistently to assess whether a person has a drug allergy. The most frequently used criteria were ‘time to onset or temporal sequence’ and ‘response pattern to drug’.
Assessment comparisons: there were highly variable levels of agreement between algorithms ranging from no agreement (0%) to a perfect level of agreement (100%) with some inconsistencies in results for the same comparisons in different studies. In all comparisons the Naranjo algorithm was used as one of the comparators or the only reference standard. The second most frequent comparator was the Kramer algorithm.

Economic

No relevant economic evaluations were identified.

6.5. Recommendations and link to evidence

Table

When assessing a person presenting with possible drug allergy, take a history and undertake a clinical examination. Use the following boxes as a guide when deciding whether to suspect drug allergy.

Publication Details

Publisher

National Institute for Health and Care Excellence (NICE), London

NLM Citation

National Clinical Guideline Centre (UK). Drug Allergy: Diagnosis and Management of Drug Allergy in Adults, Children and Young People. London: National Institute for Health and Care Excellence (NICE); 2014 Sep. (NICE Clinical Guidelines, No. 183.) 6, Assessment.

Table 6. Characteristics of review question

Population	Patients presenting with signs and symptoms of suspected drug allergy; patients with a record of a suspected drug allergy
Intervention	Clinical algorithms or prediction rules that assess likelihood or classify patients by likelihood of having a drug allergy
Aim	To identify any signs and symptoms that are consistently used to assess the likelihood of a person having a drug allergy across algorithms currently in use
Study design	In the absence of RCTs, cohort studies will be considered, particularly any multivariate studies used to derive the algorithms

Table 7. Criteria to assess the association between a reaction and a drug: studies included in the systematic review (adapted from Agbabiaka et al. 2008³)

Author	TTO or temp seq	Prev exp or drug info	Alter aetiologies	Drug level or evidence of overdose	Challenge	Dechallenge	Rechallenge	Response pattern to drug	Confirmed lab evidence	Concomitant drugs	Background epi or clin info	ADR char or mech	Other
Begaud et al. 1985,¹² updated by Arimone et al. 2013⁵	✓	✗	✓	✓	✗	✓	✓	✓	✓	✓	✗	✓	✗
Benichou and Danan 1992¹⁴	✓	✗	✗	✗	✗	✗	✗	✓	✓	✗	✓	✗	✗
Blanc et al. 1979¹⁶	✓	✗	✗	✗	✓	✗	✗	✗	✓	✗	✗	✗	✗
Castle 1984²⁴	✓	✗	✓	✓	✗	✗	✗	✓	✗	✓	✓	✗	✗
Cornelli 1984³⁰	✓	✗	✗	✓	✗	✗	✓	✓	✗	✓	✗	✗	✗
Danan and Benichou 1993³¹	✓	✗	✓	✗	✗	✗	✗	✓	✓	✗	✓	✗	✗
Dangoumau et al. 1978³²	✗	✗	✓	✓	✗	✓	✓	✓	✗	✗	✗	✗	✗
Emanueli and Sacchetti 1980⁴⁰	✓	✓	✓	✗	✓	✗	✓	✓	✗	✗	✗	✗	✗
Evreux et al. 1982⁴⁵	✓	✓	✓	✗	✗	✓	✗	✗	✗	✗	✗	✗	✗
Hoskins and Mannino 1992⁶⁵	✓	✗	✗	✗	✗	✓	✓	✗	✗	✗	✓	✓	✗
Hsu and Stoll 1993⁶⁷	✗	✓	✗	✓	✗	✗	✓	✓	✗	✗	✗	✗	✗
Irey 1976⁷³	✓	✗	✗	✓	✗	✓	✓	✓	✓	✗	✗	✗	✗
Jones 1982⁷⁵	✗	✓	✗	✓	✗	✗	✓	✓	✗	✗	✗	✗	✗
Koh and Shu 2005⁸²	✓	✗	✓	✓	✓	✗	✓	✗	✗	✗	✗	✗	✗
Kramer et al. 1979⁸⁷	✓	✗	✓	✓	✓	✗	✓	✓	✗	✗	✗	✗	✗
Lagier et al. 1983⁹¹	✗	✗	✗	✓	✓	✓	✓	✓	✓	✗	✓	✓	✓
Naranjo et al. 1981¹²¹	✗	✓	✓	✗	✓	✗	✓	✓	✓	✗	✗	✗	✗
Stephens 1984¹⁵⁷	✗	✗	✓	✓	✗	✗	✓	✓	✓	✓	✗	✗	✗
Turner 1984¹⁶²	✓	✗	✗	✗	✓	✓	✗	✗	✗	✗	✓	✗	✗
Venulet et al. 1980¹⁶⁷	✗	✓	✗	✓	✗	✓	✓	✗	✗	✓	✗	✗	✓

Abbreviations:

TTO or temp seq: time to onset or temporal sequence
Prev exp or drug info: previous experience or information on drug
Alter aetiologies: alternative aetiological candidates (underlying illnesses, new illnesses, non-drug therapies and diagnostic tests and procedures)
Dechallenge: drug discontinued or reduced in dosage
Response pattern to drug: clinical manifestation – symptoms improve with treatment
Background epi or clin info: background epidemiological or clinical information
ADR char or mech: characteristics or mechanisms of adverse drug reaction
Other: other factors

Table 8. Criteria to assess the association between a reaction and a drug: studies not included in the systematic review (adapted from Agbabiaka et al. 2008³ with additional notes and quality ratings in the final 2 columns)

Author, population, setting	TTO or temp seq	Prev exp or drug info	Alter aetiologies	Drug level or evidence of OD	Challenge	Dechallenge	Rechallenge	Response pattern to drug	Confirmed lab evidence	Concomitant drugs	Background epi or clin info	ADR char or mech	Other	Quality
Bousquet et al. 2009¹⁸ – ENDA classification Setting: ENDA (European Network for Drug Allergy) collaboration	✓	✓	✓	✗	✗	✗	✗	✓	✓	✓	✗	✓	Acute (up to 24 hours) versus delayed reactions (more than 24 hours)	High
Caimmi et al. 2012²³ Population & setting: Consecutive patients referred to Allergy Department University Hospital, Montpellier, France	✓	✓	✗	✗	✗	✗	✗	✓	✗	✗	✓	✓	Immediate (up to 6 hours) or non-immediate (more than 6 hours)	High
Du et al. 2013³⁸; Population: neonatal patients Setting: USA & Canada	✓	✗	✓	✓	✗	✓	✓	✓	✗	✓	✓	✗	Algorithm validated and performed better than Naranjo scale with Kappa and ICC scores of 0.76 and 0.62 compared to 0.31 and 0.43	Moderate. Not appropriate for GP practice; no precise definition of DA
Gallagher et al. 2011⁵² – Liverpool algorithm Based on case reports in children's hospitals in the UK	✓	✓	✓	✗	✓	✗	✓	✓	✓	✓	✗	✗	In comparison with the Naranjo algorithm more patients can be classified as ‘definite’ causal relationship	High
Gonzalez et al. 1992⁵⁶ Population: patients with suspected beta-lactam reaction Setting: Allergy Department, Hospital Universitario Reina Sofia, Cordoba, Spain	✗	✗	✗	✗	✓	✗	✗	✓	✓	✗	✗	✗	Scores based on 3 parameters only: clinical symptoms, aetiology and lab tests	Moderate. No validation; all factors not considered; no precise definition of DA
Son et al. 2011¹⁵⁴ Population: patients with cutaneous ADRs Setting: South Korea	✓	✓	✓	✓	✗	✓	✓	✓	✓	✓	✓	✗	To evaluate the accuracy of a Korean algorithm which was developed because, ‘…algorithms used in foreign countries with different genetic backgrounds, investigation and level of awareness for ADRS and as such they might not be suitable for use in Korea.’ This algorithm correlated well with Naranjo.	High
Trewin 1991¹⁶¹ Population: elderly patients with suspected drug allergy Setting: Pharmacy Department, Royal Devon & Exeter Hospital	✗	✗	✓	✗	✗	✓	✗	✓	✓	✗	✓	✗	This algorithm for the elderly also included ‘medication compliance’ and ‘source of medication’ details. Among the ADRs identified, only 2 were ‘rash’ and there was no reference to drug allergy.	Moderate. Single author of algorithm; no validation; no precise definition of DA; all factors not considered.

Abbreviations:

TTO or temp seq: time to onset or temporal sequence
Prev exp or drug info: previous experience or information on drug
Alter aetiologies: alternative aetiological candidates (underlying illnesses, new illnesses, non-drug therapies and diagnostic tests and procedures)
Dechallenge: drug discontinued or reduced in dosage
Response pattern to drug: clinical manifestation – symptoms improve with treatment
Background epi or clin info: background epidemiological or clinical information
ADR char or mech: characteristics or mechanisms of adverse drug reaction
Other: other factors
Quality Assessment: 9 criteria for quality assessment. Scoring of positive responses: 7–9 high quality; 4–6 moderate quality; 1–3 low quality
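The quality bands in the note above translate directly into a small lookup. A minimal Python sketch follows; the function name is illustrative, and because the source does not say how a score of 0 is rated, the sketch treats it as outside the scale.

```python
def quality_rating(positive_responses: int) -> str:
    """Band a count of positive responses to the 9 quality criteria.

    7-9: high quality; 4-6: moderate quality; 1-3: low quality.
    A score of 0 is not covered by the stated bands, so it is rejected
    here (an assumption of this sketch).
    """
    if not 1 <= positive_responses <= 9:
        raise ValueError("expected a score between 1 and 9")
    if positive_responses >= 7:
        return "high"
    if positive_responses >= 4:
        return "moderate"
    return "low"


for score in (9, 6, 3):
    print(score, quality_rating(score))
```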

Table 9. Probabilistic or Bayesian approaches to causation used

Author	TTO or temp seq	Prev exp or drug info	Alter aetiologies	Drug level or evidence of overdose	Challenge	Dechallenge	Rechallenge	Response pattern to drug	Confirmed lab evidence	Concomitant drugs	Background epi or clin info	ADR char or mech	Other
Hutchinson et al. 1991⁷⁰	✓	✗	✗	✗	✗	✗	✓	✓	✓	✓	✗	✗	✗
Lanctot et al. 1995⁹³	✓	✓	✗	✗	✗	✓	✓	✗	✗	✗	✓	✓	✗
Lane et al. 1987⁹⁴	✓	✗	✗	✗	✗	✗	✓	✓	✓	✗	✗	✓	✗
Mashford 1984	✓	✗	✗	✗	✗	✗	✗	✗	✓	✗	✗	✗	✓
Theophile et al. 2013¹⁶⁰	✓	✗	✓	✗	✗	✓	✓	✓	✓	✓	✗	✗	✓

Abbreviations:

TTO or temp seq: time to onset or temporal sequence
Prev exp or drug info: previous experience or information on drug
Alter aetiologies: alternative aetiological candidates (underlying illnesses, new illnesses, non-drug therapies and diagnostic tests and procedures)
Dechallenge: drug discontinued or reduced in dosage
Response pattern to drug: clinical manifestation – symptoms improve with treatment
Background epi or clin info: background epidemiological or clinical information
ADR char or mech: characteristics or mechanisms of adverse drug reaction
Other: other factors

Table 10. Studies comparing algorithms

Reference	Algorithms compared	Sensitivity	Specificity	Positive (negative) predictive values	Concordance with allergy diagnosis	Concordance with other algorithms
Benahmed et al. 2005¹³	Begaud Jones Naranjo	Begaud: 8.3% Jones: 50% Naranjo: 0%	Begaud: 98.3% Jones: 53.3% Naranjo: 100%	Begaud: 50.9% (83.5%) Jones: 18.5% (83.4%) Naranjo: 0% (100%)	Begaud: No concordance, k=0.12 Jones: No concordance, k=0.14 Naranjo: No concordance, k=0.14	Jones and Naranjo: perfect concordance (k=1) but the Jones method showed a substantial trend in favour of higher scores for the cases. Begaud: No concordance (k=0)
Busto et al. 1982²⁰	Kramer (ASS) Naranjo (APS)					High inter-rater reliability when both methods were used: Scores obtained with APS were highly correlated with those obtained with ASS by both raters: r=0.86 and r=0.81 respectively. Time spent using the ASS was slightly but significantly longer than that using the APS (9.52 (±3.02) minutes versus 8.94 (±3.51) minutes)
Kane-Gill et al. 2012⁷⁶	Jones Kramer Naranjo					Kappa values for agreement between individual instruments were all >0.7, with the Naranjo criteria versus the Kramer algorithm having the highest kappa score, which is considered excellent agreement.
Michel & Knodel 1986¹¹³	Kramer Naranjo					Agreement between Kramer and Naranjo was 67% with kappa=0.43; Kramer versus Jones was 67% agreement with k=0.48; Naranjo versus Jones was 64% agreement with k=0.28.
Pere et al. 1986¹³⁴	Begaud Emanueli Kramer Naranjo	Weightings of criteria: Criteria are not highly sensitive (0.41<Sens<0.70)	Weightings of criteria: Criteria are not highly specific (0.18<Spec<0.63)			Concordance between methods is better than chance but never more than moderate (0.40<kappa<0.60). Kramer versus Naranjo (k=0.51).
Theophile et al. 2013¹⁶⁰	Probabilistic method Liverpool Naranjo	Probabilistic method: 0.96 Naranjo and Liverpool were identical with 2 scores calculated depending on whether ‘possible’ was considered in favour or disfavour of drug causation: 1 or 0.42	Probabilistic method: 0.56 Naranjo and Liverpool: 0.11 or 0.89	Probabilistic method: 0.92 (0.71) Naranjo and Liverpool: 0.86 or 0.95 (1 or 0.22)	Logistic method gave results closer to expert opinion and the Liverpool and Naranjo algorithms depended on the interpretation of the ‘possible’ category of cases.	Naranjo and Liverpool performed similarly with more cases of ‘definites’ in the latter.
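Table 10 reports sensitivity, specificity and positive (negative) predictive values. As a reminder of how these are derived when an algorithm's verdict is compared with a reference allergy diagnosis, the Python sketch below computes them from a 2×2 table. The counts are hypothetical and are not taken from any of the studies above.

```python
from typing import Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a 2x2 table.

    tp/fn: reference-positive cases the algorithm did / did not flag.
    fp/tn: reference-negative cases the algorithm did / did not flag.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Hypothetical counts for 100 assessed cases.
result = diagnostic_accuracy(tp=45, fp=10, fn=5, tn=40)
for name, value in result.items():
    print(name, round(value, 2))
```

A measure is returned as None when its denominator is zero, for example the positive predictive value of an algorithm that flags no cases at all.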

Table 11. Frequency causality criteria were used across 32 algorithms and probability scores (25 from the systematic review and 7 added in the current review)

Assessment criteria	Included in algorithms, n/total (%)
1. Time to onset or temporal sequence	24/32 (75%)
2. Response pattern to drug (clinical response)	24/32 (75%)
3. Rechallenge	22/32 (69%)
4. Alternative aetiological candidates	17/32 (53%)
5. Confirmed by laboratory evidence	16/32 (50%)
6. Drug level or evidence of overdose	15/32 (47%)
7. Dechallenge	14/32 (44%)
8. Background epidemiological or clinical information	12/32 (38%)
9. Previous exposure or drug information	12/32 (38%)
10. Concomitant drugs	12/32 (38%)
11. Challenge	10/32 (31%)
12. ADR characteristics or mechanism	8/32 (25%)

Recommendation	When assessing a person presenting with possible drug allergy, take a history and undertake a clinical examination. Use the following boxes as a guide when deciding whether to suspect drug allergy. Boxes 1–3 Signs and allergic patterns of suspected drug allergy with timing of onset^c Be aware that the reaction is more likely to be caused by drug allergy if it occurred during or after use of the drug and: the drug is known to cause that type of reaction or the person has previously had a similar reaction to that drug or drug class. Be aware that the reaction is less likely to be caused by drug allergy if: there is a possible non-drug cause for the person's symptoms (for example, they have had similar symptoms when not taking the drug) or the person has gastrointestinal symptoms only.
Relative values of different outcomes	The following outcomes were identified by the GDG as important for decision-making: mortality, number of repeat drug allergic reactions, length of hospital stay, acute admission or readmission into secondary care, number of contacts with healthcare professionals, inappropriate avoidance of drugs, health-related quality of life. The group noted that no evidence was identified that directly addressed the effectiveness of algorithms in terms of the clinical outcomes specified, but the evidence instead focused on causality criteria with associated scores in developing an algorithm.
Trade-off between clinical benefits and harms	The group agreed that the benefit of an algorithm for the assessment of signs and symptoms is that it can help in identifying whether the reaction observed is likely to be caused by a drug. However, in the group's opinion, the key potential harm of recommending the use of an algorithm to people with a suspected drug allergy is the poor predictive value provided by algorithms. Specifically, the lack of absolute prediction of whether the person presenting with a suspected drug allergy is experiencing an allergic reaction or not and the risk of clinicians providing false reassurance was a key concern. The GDG noted that signs and symptoms of drug allergy in children may differ from those in adults, and typical patterns suggesting an allergic reaction to a drug may not apply in a child's case. For example, non-specific rashes are more common in children and these are usually not due to drug allergy, whilst severe cutaneous reactions are less common in children. The GDG also recognised that people of certain ethnicities and those with certain comorbidities such as cystic fibrosis or HIV are at higher risk of allergic reaction to specific drugs or drug classes.
Economic considerations	No relevant economic evidence was identified. The GDG did not prioritise this question for original economic analysis. The GDG agreed that the proposed assessment would most likely be carried out as part of an initial GP (or other non-drug allergy specialist) assessment, but could take longer than current practice (which generally involves noting an adverse reaction, rather than assessing the reaction and investigating the possibility of an allergy). Therefore, there may be a small increase in initial cost. However, the GDG felt that appropriate assessment would be of great clinical benefit to the person with a suspected drug allergy, as it would be likely to improve the accuracy of diagnosis. Accurate diagnosis will improve quality of life, and reduce the later costs associated with incorrect labelling of drug allergy (such as those incurred by patients who are unnecessarily given alternative second-line drugs, which are often more expensive and less effective than the first-line option). Appropriate assessment using the recommendations above will therefore assist selection of the appropriate treatment strategy for each person with a suspected drug allergy, and therefore promote economic efficiency of the clinical pathway. The GDG agreed that carrying out the assessment when the patient first presents with a potential allergic reaction would lead to the best clinical outcomes, as details of the reaction are likely to be documented more accurately than if left to a later stage. Overall the GDG agreed that the benefits (improvements in quality of life and reduced future costs) of the signs and symptoms checklist would outweigh the small upfront cost of a longer initial consultation.
Quality of evidence	The aim of the review of algorithms was to identify common signs and symptoms that indicate whether a person may have a drug allergy. The evidence showed that a number of the algorithms did not specify such patterns but focused on the types of questions that physicians need to consider when trying to identify whether the drug caused the reaction. The NICE quality assessment tool for systematic reviews was applied to the published systematic review. A further tool was designed to assess the quality of algorithm studies added to this review. The studies included in the review were assessed as good to moderate quality. However, since the algorithms that were reviewed did not always address signs and symptoms directly, the evidence was given less value in drawing up the recommendations. The GDG advised that not all the algorithms reviewed were applicable to primary care as they required too much time for a GP to use during standard consultations, required challenge testing, or did not result in a final clinical decision for managing a patient. The GDG noted that the algorithms included in the review looked at adverse drug reactions and not at drug allergy specifically and were not assessed for effectiveness in clinical settings.
Other considerations	The GDG concurred with the conclusion of the Agbabiaka³ systematic review that clinical judgement is still required when using an algorithm as a decision-making tool, and that no single algorithm is accepted as a gold standard. The GDG noted that the Naranjo¹²¹ and Kramer⁸⁶,⁸⁷ studies were the most commonly referred to within the literature, and the study by Jones which compared the 2 favoured the Naranjo algorithm.¹²¹ The European Network for Drug Allergy questionnaire (Bousquet 2009¹⁸) was a large study designed for use by GPs and was assessed as being of high quality. However, no study had addressed how effective these tools were within a clinical practice setting, and the GDG thought that none of the algorithms were practical for use in general practice or other non-specialist settings. Most of the studies did not assess the clinical effectiveness (that is, directly leading to improved patient outcomes) of algorithms against each other or against other methods of diagnosis. This evidence would have been included but no further studies were identified. The GDG noted the difficulty of capturing the wide range of drugs and reactions to drugs in a single decision-making tool, and that as drug allergy is a subset of adverse drug reaction, it was difficult to identify drug allergy using an adverse drug reaction questionnaire such as the tools produced by Kramer⁸⁶,⁸⁷ or the European Network for Drug Allergy (ENDA).¹⁸ The GDG suggested alternatives which may be more effective, such as checklists, pathways or flow charts. The GDG questioned the helpfulness of a probability score as used in the ENDA questionnaire¹⁸ because it does not lead to a decision for the clinician. However, the group did think a checklist of common symptoms may be helpful and further agreed that any decision tool should ideally be short, easy to use and include a score that would determine the action to be considered by the clinician.
The group cited the use of the CHADS2 system¹²⁰ (congestive heart failure, hypertension, age ≥75 years, type 2 diabetes and previous stroke or transient ischemic attack), which is used as a prediction rule for stroke risk in atrial fibrillation. Although no suitable scoring system was identified from the review, the development of a validated algorithm or decision rule including a scoring system for use within non-specialist settings would be a helpful guide in assessing and managing people who have had a suspected allergic reaction to a drug. The GDG agreed the common signs and symptoms listed in the ENDA study¹⁸ could be adapted and used as a basis for the recommendations. The GDG acknowledged that the questions used within the Naranjo paper are for use within a specialist setting;¹²¹ however, they believed some of these were also relevant for use within a non-specialist setting and would be a helpful addition to the recommendations as a part of the initial assessment and decision-making process undertaken by the clinician. Providing timings of when signs and symptoms are likely to occur after exposure to a drug was thought to be helpful when making an assessment. The group arrived at the timings given in the recommendations through informal consensus based on their clinical experience and knowledge of the literature in this area. The GDG noted that currently, adverse reactions are listed in the information provided with most drugs and these reactions are categorised from common reactions, to less common reactions and rare reactions.

c: Note that these boxes describe common and important presenting features of drug allergy but other presentations are also recognised
