Methods

Raashid Luqmani; Ellen Lee; Surjeet Singh; Mike Gillett; Wolfgang A Schmidt; Mike Bradburn; Bhaskar Dasgupta; Andreas P Diamantopoulos; Wulf Forrester-Barker; William Hamilton; Shauna Masters; Brendan McDonald; Eugene McNally; Colin Pease; Jennifer Piper; John Salmon; Allan Wailoo; Konrad Wolfe; Andrew Hutchings

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Luqmani R, Lee E, Singh S, et al. The Role of Ultrasound Compared to Biopsy of Temporal Arteries in the Diagnosis and Treatment of Giant Cell Arteritis (TABUL): a diagnostic accuracy and cost-effectiveness study. Southampton (UK): NIHR Journals Library; 2016 Nov. (Health Technology Assessment, No. 20.90.)


Chapter 2 Methods

Summary of study design

The study used a prospective cohort design and recruited patients with suspected GCA who were undergoing a TAB, the standard diagnostic test, as part of their routine care in order to assist with establishing the diagnosis. Patients were recruited following referral from their primary care physician or a secondary care physician and consented to have an additional diagnostic test, namely an ultrasound investigation of their temporal and axillary arteries, before having their biopsy. The clinician treating the patient, as well as the patient, was blinded to the results of the ultrasound. Patients were assessed at presentation, at 2 weeks and after 6 months. The performances of TAB and ultrasound were evaluated against a reference diagnosis derived from the clinician’s final diagnosis, which included any changes to the diagnosis during the follow-up period, such as the emergence of any GCA-related complications. The reference diagnosis confirmed the clinician’s final diagnosis using an algorithm based on the ACR classification criteria; any unconfirmed cases (and all cases in which the ultrasound result was unblinded and seen by the clinician) were independently reviewed by a panel of experts.
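As a rough illustration of how a test can be evaluated against a reference diagnosis, the sketch below computes sensitivity and specificity from paired results. The function name and the toy data are hypothetical; this is not the study's analysis code and the values are not study results.

```python
def sensitivity_specificity(test_positive, reference_positive):
    """Return (sensitivity, specificity) of a test against a reference diagnosis.

    Both arguments are equal-length sequences of booleans, one entry per patient.
    """
    if len(test_positive) != len(reference_positive):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(test_positive, reference_positive))
    tp = sum(1 for t, r in pairs if t and r)          # test positive, reference positive
    fn = sum(1 for t, r in pairs if not t and r)      # test negative, reference positive
    tn = sum(1 for t, r in pairs if not t and not r)  # test negative, reference negative
    fp = sum(1 for t, r in pairs if t and not r)      # test positive, reference negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative patient")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only (not study results): six hypothetical patients.
reference = [True, True, True, True, False, False]
ultrasound = [True, True, True, False, False, True]
sens, spec = sensitivity_specificity(ultrasound, reference)
```

The same function could be applied to biopsy results, or to a combination of tests, provided each is expressed as one positive or negative result per patient.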

Agreement between sonographers and between pathologists in their interpretation of videos and images was assessed in an inter-rater agreement exercise for a sample of recruited patients. Clinical vignettes for these patients were constructed and assessed by clinicians to see what decisions about diagnosis and treatment might have been made if ultrasound results were provided instead of biopsy results. The cost-effectiveness of the different tests and combinations of tests was assessed in an economic evaluation.

Patient and public involvement

Advice on study design was sought and obtained from patients through the registered charity Polymyalgia Rheumatica & Giant Cell Arteritis UK. Patient representatives on the Trial Steering Committee and the Data Monitoring Committee provided valuable advice and input during the study (see Acknowledgements).

Recruitment of sites

Sites were eligible to take part in the study if they were responsible for seeing patients with suspected GCA and used TAB as a routine test for its diagnosis. Sites were not eligible if they used ultrasound for diagnosing GCA as part of their routine practice.

Prior to study commencement, 19 hospitals in England indicated their interest in becoming study sites for potential recruitment. Sites were eligible to take part if a site principal investigator, typically a clinician (e.g. a rheumatologist or ophthalmologist) involved in the management of patients with GCA, could be identified who would have overall responsibility for the site’s involvement in the study. Sites also needed to be able to identify at least one pathologist who would have responsibility for assessing TABs and at least one sonographer with responsibility for performing and assessing ultrasound. Study sonographers needed to have some previous experience in the use of ultrasound but did not need to have specific experience in ultrasound of the temporal or axillary arteries for GCA. Sonographers could come from a variety of clinical disciplines and included rheumatologists, radiologists and radiographers. Sites also needed to provide assurance that, for any individual patient, the roles of the sonographer and the clinician managing the patient were separate. This was to prevent the managing clinician from knowing the results of the ultrasound scan, except when specifically allowed in the study protocol. It did not preclude a clinician (e.g. a rheumatologist who carries out ultrasound) from performing either role in different patients provided that the separation of responsibilities was maintained for each participant.

All sites needed to obtain the relevant local approvals before training could be commenced. Site participation required sonographers to successfully complete a training package in ultrasound for GCA. No training was provided to the site surgeons, who were asked to perform the biopsies as part of routine care, or to pathologists, given that TAB specimen assessment is part of standard care. At some sites, additional clinicians were involved in the management of study patients and this was a requirement if the site’s principal investigator was designated as the study sonographer to ensure that the ultrasound result was blinded for all patients. Research nurses at each site were responsible for co-ordinating recruitment and arranging tests to ensure that both ultrasound and biopsy procedures could be performed within 7 days of commencing high-dose glucocorticoid therapy. All these site personnel comprised the local TABUL team with responsibility for co-ordinating the study locally and completing the clinical, pathology and ultrasound data collection. The process for ultrasound training is described in the next section.

Each site was provided with study training during an initiation visit from the central TABUL study team which consisted of advice on data collection (including completion of study forms) and the process for submitting data. Specific training was provided on the completion of two measures used to assess patients: the Birmingham Vasculitis Activity Score (BVAS) and the Vasculitis Damage Index (VDI). Clinicians and research nurses were required to achieve test scores of 85% for the BVAS and 75% for the VDI (and at least 50% of all individual cases had to be correct) before they were approved for scoring the two measures. Monitoring visits were conducted as per the study standard operating procedures to ensure that the correct procedures were being followed.
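The approval rule for scoring the two measures can be written as a simple check. The sketch below is illustrative only: the function and parameter names are hypothetical, the thresholds are taken from the text, and treating each threshold as inclusive is an assumption.

```python
# Thresholds quoted in the text, in per cent.
BVAS_PASS_MARK = 85.0
VDI_PASS_MARK = 75.0
MIN_CASES_CORRECT = 50.0


def approved_for_scoring(bvas_score, vdi_score, pct_cases_correct):
    """Return True if an assessor meets all three training thresholds.

    Assumption: a score exactly equal to a threshold counts as a pass.
    """
    return (
        bvas_score >= BVAS_PASS_MARK
        and vdi_score >= VDI_PASS_MARK
        and pct_cases_correct >= MIN_CASES_CORRECT
    )
```

For example, an assessor scoring 90% on the BVAS and 70% on the VDI would not be approved under this reading, because the VDI score falls below 75%.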

Training in ultrasound for giant cell arteritis

Ultrasound assessment of temporal arteries is an established technique for the diagnosis of GCA but there is no standardised protocol in widespread use. We therefore developed a training package for performing and analysing ultrasound scans for the TABUL study. The purpose of the training package was to provide assurance that the sonographers in the study had achieved competence in scanning the temporal and axillary arteries and interpreting the results before recruiting patients to the study.

The training package included a standardised protocol for performing ultrasound in the TABUL study and an accompanying presentation. Sonographers’ competence in ultrasound for GCA was assessed in three ways: (1) undertaking ultrasound assessment of 10 patients or volunteers without GCA; (2) passing an examination that tested each sonographer’s competence in interpreting ultrasound videos; and (3) successfully completing a ‘hot case’ ultrasound assessment of a patient with active GCA. Sonographers were encouraged to attend the TABUL training day for sonographers in Oxford and/or participate in site visits from the TABUL study team. After successful completion of training, sonographers were required to submit recorded scans of recruited patients for ongoing assessment of competency in scanning and interpretation.

Sonographers were required to complete all components of the training before they were deemed eligible to assess patients recruited to the main study. An exception was made for sonographers who were already performing routine assessment for GCA; these sonographers were required to undergo part of the training protocol by scanning 10 control cases and completing their online assessment. These sonographers were exempt from completing the ‘hot case’ assessment on the merit of their curriculum vitae, which was assessed by the ultrasound experts for the study.

Ultrasound protocol and training requirements

The standard protocol for ultrasound and training was set out in the standard operating procedure for ultrasound and is available via the NIHR Journals Library website (www.journalslibrary.nihr.ac.uk).

The study required the use of a linear probe with a grey-scale frequency of 10 MHz or greater and a colour Doppler frequency of at least 6 MHz, using a vascular pre-set and applying colour Doppler mode as opposed to power Doppler mode. It was important to ensure that the focus was positioned around 5 mm below the skin surface for temporal artery ultrasound, in order to detect the artery. The pulse repetition frequency was set at approximately 2–3 kHz; this depended on the machine and the vessel and needed to be altered according to the velocity of flow, which differs from artery to artery. The colour box required angle correction of at least 60° to avoid poor colour Doppler signals and inaccurate readings. The gain had to be adjusted so that colour just filled the lumen, because under- or over-filling could create a spurious halo or colour ‘bleeding’ over the vessel wall, which might give a false reading. We did not routinely employ a compression test to occlude the artery completely to eliminate flow; however, this is a useful test and was described to all sonographers to facilitate distinction between a true halo sign and a false one.⁶³
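The minimum equipment settings lend themselves to a simple checklist. The following is a minimal sketch, assuming a hypothetical `ScannerSetup` record; only the probe type, the two frequency minima and the Doppler mode stated in the text are checked, and none of the names come from the study.

```python
from dataclasses import dataclass


@dataclass
class ScannerSetup:
    """Hypothetical record of one site's ultrasound configuration."""
    probe_type: str            # e.g. "linear"
    grey_scale_mhz: float      # grey-scale frequency in MHz
    colour_doppler_mhz: float  # colour Doppler frequency in MHz
    doppler_mode: str          # "colour" or "power"


def setup_problems(setup):
    """Return a list of ways in which a setup falls short of the stated minimum."""
    problems = []
    if setup.probe_type != "linear":
        problems.append("probe must be linear")
    if setup.grey_scale_mhz < 10:
        problems.append("grey-scale frequency must be 10 MHz or greater")
    if setup.colour_doppler_mhz < 6:
        problems.append("colour Doppler frequency must be at least 6 MHz")
    if setup.doppler_mode != "colour":
        problems.append("colour Doppler mode is required, not power Doppler")
    return problems
```

An empty list means that the setup meets these particular minima; settings that depend on the machine and vessel, such as pulse repetition frequency and gain, are deliberately left out of the check.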

Each site sonographer was required to register the model number and manufacturer of his or her ultrasound machine with the TABUL office to ensure that it was of sufficiently high resolution for the purposes of the study; this was also reported for the subsequent economic analysis. If the sonographer changed the machine, he or she was required to inform the central TABUL office of the change, and the TABUL office had to confirm that the machine that had been substituted was of sufficiently high quality for the study.

The protocol required each patient to lie in a recumbent or semirecumbent position on their side and pull back their hair behind their ears. Gel was applied to the area of the temporal artery and the probe was placed over the middle of the common superficial temporal artery at the level of the tragus, and the position of the probe was adjusted if necessary to locate the artery. The probe was applied in the transverse and subsequently the longitudinal plane or vice versa. After completing a sweep of the artery in one plane, the probe was rotated by 90° and a further sweep was performed in the other plane. The level of the bifurcation between frontal and parietal branches of temporal arteries serves as the marker point to define the start of the frontal and parietal branches, respectively. The patient was then asked to turn over to the other side so that the opposite temporal artery could be scanned. The axillary artery was examined by asking the patient to remove outer clothing to expose the axilla. Gel was applied to the inner aspect of the upper arm and the ultrasound probe was placed over the midaxillary line, and swept along the expected course of the artery. The probe was applied in either the longitudinal or the transverse plane and swept along until the brachial artery branch was identified. The sweep was then repeated with the probe rotated by 90°, so that both longitudinal and transverse scans were performed. A longitudinal static image was obtained for normal cases and a transverse and longitudinal static image was obtained for abnormal cases.

The sonographers were required to sequentially scan the complete length of common superficial temporal arteries with their frontal and parietal branches in transverse and longitudinal views. The axillary arteries were also assessed in transverse and longitudinal views. The assessors were required to provide video and static images in both transverse and longitudinal planes as evidence that they had adequately scanned arteries. Each video or still image had to be labelled with the patient’s study identification number, and the location of the image was defined using the standard formatting abbreviation listed in Table 1; for example, a video sweep image of the transverse view of the left temporal artery was labelled LTSN.

TABLE 1

Abbreviations used to define ultrasound arterial sites and abnormalities found in the TABUL protocol

The minimum recordings consisted of a 10-second transverse sweep along the length of each of the temporal arteries up to and beyond the bifurcation of the frontal and parietal branches and a still image of each axillary artery. All images had to be acquired using colour Doppler so that complete filling of the vessel, stenosis and aliasing of colour within the vessel could be assessed accurately. Doppler pulse wave was used to further characterise any areas of stenosis. The sonographers were asked to report the presence or absence of any abnormalities for each of the temporal and axillary arteries on the ultrasound case report form (see Appendix 1) while they were scanning and to indicate the relevant section(s) for abnormalities in the temporal arteries.

If any abnormality was detected, then additional information by artery and section was collected in the case report form and recordings of the abnormalities were required. For a halo, the sonographer reported the maximum thickness and length and whether or not it ran along the entire length of the section. A 3-second transverse and longitudinal video was recorded to support evidence of any reported halo, stenosis or occlusion in sections of the temporal artery. A transverse and longitudinal still image was recorded to demonstrate halo or occlusion in either axillary artery. If stenosis was reported then the velocity in and out of the stenosis (and the minimum and maximum luminal diameter for axillary arteries) was reported and a longitudinal still image and Doppler pulse wave were recorded. The presence of arteriosclerosis was reported separately as an abnormality but no images of this were required. On completion of the scanning, the sonographer was required to document whether or not the ultrasound results were consistent with a diagnosis of GCA. The completed case report forms and recordings (on compact disc) were submitted to the TABUL office.

We expected the scanning protocol to take between 20 and 45 minutes for each patient. The start time, end time and total scanning time were collected for each training case or patient. The protocol also required the sonographer to ensure that the results of the ultrasound, the case report form and the recordings were not given to, or discussed with, the clinical staff involved in treating the patient. Each site was supplied with guidance on how to perform the scans (see Appendix 2).

Ultrasound training programme

Although the biopsy of temporal arteries has been an established test in widespread use all over the world for decades, the use of ultrasound as a diagnostic test is much more limited. Very few of the sites involved in the study had sufficient expertise to undertake proficient vascular ultrasound scanning for GCA. We therefore developed a pragmatic training programme consisting of attendance at a training day or a site visit with hands-on training. Competence in ultrasound was assessed using a video examination to correctly identify normal or abnormal scan appearances, evidence of successfully performed scans of 10 healthy control subjects, and evidence of a successfully performed scan of at least one patient with scan findings of active GCA. Sonographers were allowed to take part in the study only once all elements had been successfully completed. In addition, we required sonographers to submit recordings of scans from all patients recruited into the study for ongoing quality control.

Ultrasound protocol training was provided during a training day in Oxford at the start of the study or at site visits by the TABUL study team. The protocol and training emphasised the importance of keeping the ultrasound result blinded from the clinician treating the patient. Sonographers were also provided with a presentation on how to scan temporal and axillary arteries to look for evidence of GCA and how to document the site and nature of the findings using standardised abbreviations (see Table 1). The presentation was developed with the supervision of one of the authors (WAS) who had extensive expertise in GCA ultrasound. The presentation provided information on recommended techniques and described the minimum equipment required to perform optimal scanning.

Video examination

An online assessment was developed specifically for the study and consisted of groups of ultrasound images of 20 cases representing patients with or without active GCA. The cases comprised still images and videos of approximately 10 seconds’ duration from consenting patients (not part of the TABUL study), supplied by two of the authors (WAS and BD). Sonographers could view the images by accessing a secure password-protected online site designed for the study. For each case, the sonographer was required to indicate the presence or absence of hypoechoic vessel wall oedema (the ‘halo’). Sonographers submitted their responses to the online system for marking; they had to achieve a minimum of 75% correct answers to pass the evaluation. Sonographers who failed to pass the test at their first attempt were required to repeat the entire test or specific questions, depending on how many errors they had made.
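The pass mark for the video examination can be worked through in a few lines. This is a sketch under stated assumptions: one scored response per case (halo present or absent), and hypothetical function names that are not part of the study's online system.

```python
import math

N_CASES = 20       # cases in the online assessment, from the text
PASS_PERCENT = 75  # minimum percentage of correct answers, from the text


def min_correct_to_pass(n_cases=N_CASES, pass_percent=PASS_PERCENT):
    """Smallest number of correct answers that reaches the pass mark."""
    return math.ceil(n_cases * pass_percent / 100)


def passed_exam(responses, answer_key, pass_percent=PASS_PERCENT):
    """Return True if enough responses match the answer key.

    Assumption: each case has a single halo present/absent answer.
    """
    if len(responses) != len(answer_key):
        raise ValueError("one response is needed for every case")
    n_correct = sum(1 for given, truth in zip(responses, answer_key) if given == truth)
    return n_correct >= min_correct_to_pass(len(answer_key), pass_percent)
```

Under these assumptions a sonographer would need at least 15 correct answers out of 20 to pass.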

Scanning training cases

Sonographers’ competence in performing ultrasound was assessed by their provision of satisfactory scans from 10 healthy or non-GCA training cases. All training case participants were screened and consented prior to the ultrasound scan. Training cases had to be at least 50 years old and willing to attend for an ultrasound scan of their temporal and axillary arteries. Anyone with suspected GCA or a history of diagnosed or suspected GCA was ineligible, as were patients with any inflammatory condition or anyone who had taken systemic steroids or immunosuppressants in the previous 3 months.

Scanning followed the process described in the protocol. Briefly, the sonographer was required to provide correctly labelled (and anonymised) video images of both temporal and axillary arteries from 10 individual training cases, with documentation of the findings in the case report form. The case report forms and recordings were reviewed by four expert sonographers (WAS, BD, EM, APD), who assessed the sonographers’ competence and provided feedback. Sonographers were required to assess additional cases as specified by the reviewer if there were concerns over their scanning. If any of the control patients showed any evidence of an abnormality consistent with GCA then the general practitioner (GP) of the individual would be informed of the result.

Assessment of a patient with active giant cell arteritis (‘hot case’)

All sonographers were required to scan at least one patient who had active GCA as part of their training assessment in order to demonstrate competence in detecting and reporting the abnormal findings. The ‘hot case’ patient was consented to the study using NHS or local hospital consent but could not be a patient recruited to the main TABUL study. The sonographer scanned the patient, completed the case report form and submitted recordings following the ultrasound protocol. The expert reviewers assessed the submitted recordings and case report form to ensure that (1) the ultrasound features were consistent with GCA and (2) that the appropriate images had been recorded, were of suitable quality and were consistent with the case report form. If the reviewers were not satisfied then the sonographer was required to complete another ‘hot case’ and resubmit.

Monitoring ultrasound during the study: quality control by expert review

Once a sonographer had successfully completed and passed all three components of the training assessment, they were approved to scan patients with suspected GCA who were recruited to the study. In order to ensure that the quality of scanning was maintained, a process of ongoing quality control was developed and implemented. The ultrasound case report forms and recordings for each patient were submitted and reviewed by at least one of the four expert reviewers. Recordings were uploaded to a central ultrasound database which allowed remote access for reviewers. Reviewers assessed the quality of images collected and their agreement or otherwise with the sonographer’s interpretation of the recordings. If the expert reviewers had concerns about the performance of a sonographer, then the sonographer was required to undergo additional training before being approved for scanning patients in the study.

All recruited patients had their scans reviewed unless no images were uploaded. For each scan, at least one expert sonographer reported agreement, disagreement or uncertainty with the assessment made by the site sonographer and, if uncertain, indicated whether or not this was attributable to concerns over the quality of the submitted images.

Study population, recruitment and sampling

The study aimed to recruit all eligible patients who were undergoing a TAB for suspected GCA. Patients were eligible if there was a clinical suspicion of a new diagnosis of GCA and the treating clinician had decided that the patient required an urgent TAB to help determine whether or not the diagnosis was GCA. No particular symptoms were specified, although it was expected that patients would have typical symptoms of GCA such as a new onset of headache, scalp tenderness, elevated CRP level or ESR, jaw or tongue claudication or visual loss. Patients had to be at least 18 years of age and be willing to attend for an ultrasound scan of their temporal and axillary arteries.

Patients were not eligible for the study if they had had a previous diagnosis of GCA or if it was not possible to arrange for their ultrasound and biopsy to be performed within 7 days of starting higher doses of glucocorticoids (defined as > 20 mg of oral prednisolone or equivalent daily). Patients were also ineligible if they had prolonged use (> 1 month) of higher dose glucocorticoids (> 20 mg of prednisolone or equivalent per day at any time) within the previous 3 months for any condition other than PMR. A current or previous diagnosis of PMR or presenting symptoms of PMR were not exclusion criteria, because this group of patients would be likely to require investigations for possible associated GCA, if they presented with new features suggesting the diagnosis. No other selection criteria were used for the recruitment of patients.
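The inclusion and exclusion criteria above can be summarised as a screening check. The sketch below is illustrative only; the `ScreenedPatient` fields and the function name are hypothetical and do not correspond to items on the study's screening form.

```python
from dataclasses import dataclass


@dataclass
class ScreenedPatient:
    """Hypothetical summary of one screened patient."""
    age_years: int
    suspected_new_gca_needing_tab: bool
    willing_to_attend_ultrasound: bool
    previous_gca_diagnosis: bool
    # Can ultrasound and biopsy both be done within 7 days of starting
    # higher-dose glucocorticoids (> 20 mg prednisolone or equivalent daily)?
    tests_possible_within_7_days: bool
    # Higher-dose glucocorticoids for > 1 month within the previous 3 months?
    prolonged_high_dose_steroids: bool
    prolonged_steroids_were_for_pmr: bool = False


def ineligibility_reasons(p):
    """Return the reasons a patient is ineligible (empty list if eligible)."""
    reasons = []
    if not p.suspected_new_gca_needing_tab:
        reasons.append("no clinical suspicion of new GCA requiring TAB")
    if p.age_years < 18:
        reasons.append("under 18 years of age")
    if not p.willing_to_attend_ultrasound:
        reasons.append("not willing to attend for ultrasound")
    if p.previous_gca_diagnosis:
        reasons.append("previous diagnosis of GCA")
    if not p.tests_possible_within_7_days:
        reasons.append("tests not possible within 7 days of starting glucocorticoids")
    if p.prolonged_high_dose_steroids and not p.prolonged_steroids_were_for_pmr:
        reasons.append("prolonged higher-dose glucocorticoids for a condition other than PMR")
    return reasons
```

Returning the list of reasons, rather than a single yes or no, makes it easier to see which criterion excluded a patient; PMR itself never appears as a reason, in line with the text.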

All patients were required to give written informed consent. Additional consent was required to allow serum, plasma and deoxyribonucleic acid samples to be taken at the first assessment and serum and plasma to be taken at the second and third assessments for future, currently undefined studies. Patients were also invited to consent to allow their remaining tissue biopsy samples (not required for diagnosis) to be stored centrally in the Oxford Musculoskeletal Biobank for future, currently undefined studies. All slides that were originally required for diagnostic purposes were stored in the Oxford Musculoskeletal Biobank or returned to the site pathologists, after they had been photographed. All screened patients were allocated a unique screening number and a screening case record form (CRF) was completed for each case (see Appendix 3). All eligible patients who consented were allocated a unique study identification number.

It was expected that the majority of patients would be recruited from referrals from general practice to secondary care (to rheumatology and/or ophthalmology on-call teams). The clinician responsible for the patient’s care obtained verbal consent from the potential patient and passed on their contact details to the local TABUL team. Following an initial telephone call the TABUL team provided the potential patients with the study invitation letter and participant/patient information sheet (see Appendix 4) and discussed the study with them. Alternatively, if a patient was attending the hospital, the study documents were given directly to them by the clinician or study team. The potential patient would then have sufficient time to read and understand the information and to ask any questions before providing written informed consent (see Appendix 5).

Study recruitment at sites was encouraged by providing study information flyers in non-patient areas of sites as an aide-memoire for research teams and clinicians. Awareness of the study was raised with rheumatologists at local, regional, national and international meetings, such as those of the BSR, and at local meetings with GPs, ophthalmologists, vascular surgeons, rheumatologists and clinicians treating other forms of vasculitis. Guidance on recruitment was provided to all sites (see Appendix 6).

Sample size calculation

The sample size of 402 patients was calculated to provide 90% power at a 5% type I error rate to test the joint hypotheses that:

ultrasound has greater sensitivity than TAB (based on an assumed sensitivity of 76% for TAB and 87% for ultrasound)
the specificity of ultrasound is no less than 83% using the reference diagnosis.

The postulated sensitivity and specificity figures were based on a previous meta-analysis.¹⁶ The sample size would allow estimation of a one-sided rectangular confidence region for ultrasound false- and true-positive fractions, assuming 80% prevalence of GCA in patients having a biopsy for suspected GCA, with the sample size inflated (gamma 0.1) because of uncertainty in the ratio of cases to controls in a cohort design.⁶⁴

In order to allow for losses to follow-up (failure to have either test done, lack of a follow-up assessment or patient withdrawal) the plan was to recruit 430 participants to the study. After monitoring actual recruitment and withdrawals during the course of the study, the target recruitment was increased to the range 435–445.
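The arithmetic behind these recruitment figures can be checked in a few lines. This sketch does not reproduce the power calculation itself; it only uses the numbers quoted above (402, 430 and the assumed 80% prevalence), and the variable names are hypothetical.

```python
required_sample = 402        # patients needed for the analysis, from the text
planned_recruitment = 430    # initial recruitment target, from the text
assumed_prevalence = 0.80    # assumed prevalence of GCA, from the text

# Number of recruited patients who could be lost while still reaching 402.
loss_allowance = planned_recruitment - required_sample
loss_fraction = loss_allowance / planned_recruitment

# Expected split of the required sample under the assumed prevalence.
expected_cases = required_sample * assumed_prevalence
expected_non_cases = required_sample - expected_cases

print(f"Allowance for losses: {loss_allowance} patients ({loss_fraction:.1%})")
print(f"Expected GCA cases: {expected_cases:.1f}; non-cases: {expected_non_cases:.1f}")
```

On these figures, the original target of 430 allowed for the loss of 28 patients, about 6.5% of those recruited.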

Clinical data collection

Patients who were referred with suspected GCA were screened to check their eligibility for recruitment into the study. Patients who were eligible and gave informed consent had a full clinical assessment at presentation. Appointments for ultrasound scans and then biopsy were arranged and patients returned for a follow-up clinical assessment after 2 weeks (Figure 1). After the 2-week assessment and after seeing the biopsy report, the clinician (who remained blinded to the ultrasound results) decided whether or not the patient had features consistent with a diagnosis of GCA.

FIGURE 1

Flow of patients in the study. US, ultrasound; V, visit.

The result of the ultrasound was unblinded only if the clinician concluded that the patient did not have features consistent with GCA and was therefore planning to withdraw steroid therapy rapidly. The procedure for doing so is described below (see Ultrasound test results: procedure for revealing test results). Clinicians were allowed to alter their decision to withdraw steroids rapidly following unblinding of the ultrasound result. Patients attended a final clinical assessment after 6 months.
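The unblinding rule described above amounts to a two-part condition, which the following minimal sketch makes explicit. The function and argument names are hypothetical and are not taken from the study protocol.

```python
def ultrasound_may_be_unblinded(features_consistent_with_gca, plans_rapid_steroid_withdrawal):
    """Return True only when the clinician has concluded that the patient does
    not have features consistent with GCA and is therefore planning to
    withdraw steroid therapy rapidly."""
    return (not features_consistent_with_gca) and plans_rapid_steroid_withdrawal
```

In every other situation the ultrasound result stays blinded; the clinician's freedom to change the treatment decision after unblinding is a separate step and is not modelled here.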

Patient assessment at presentation

The first clinical assessment at presentation collected data on demographic information, relevant conditions and past medical history, symptoms, physical examination findings, laboratory test results and medication. Clinicians were also asked how certain they were of the diagnosis of GCA (definite, probable or possible). Patient data included the patient’s age, sex, ethnicity, weight, blood pressure and smoking history. Comorbidity was assessed by reporting relevant current and previous medical history, and the assessment included specific questions on diabetes mellitus, hypertension, angina, myocardial infarction, heart failure, low trauma fractures and neoplasia.

Information on symptoms was collected separately for symptoms that the patient had experienced prior to commencing higher-dose glucocorticoid therapy, as well as symptoms present at the first assessment (if the patient had already started on glucocorticoid treatment). This allowed us to separately report whether or not the presenting symptoms had changed as a result of glucocorticoid therapy. The presence of the following symptoms (typically seen in GCA) was reported: anorexia, fatigue, fever/night sweats, localised pain in the head, scalp tenderness, swelling over the temporal artery, pain over the temporal artery, jaw claudication, tongue claudication, reduced or lost vision, double vision and amaurosis fugax. Symptoms of PMR (early-morning stiffness lasting longer than 1 hour, bilateral shoulder pain and bilateral hip pain) were also collected. In addition, any other symptoms that the clinicians thought were relevant could be reported manually.

Physical examination of the patient required an assessment of both temporal arteries for evidence of thickening, tenderness and reduced or absent pulsation, and of both axillary arteries for tenderness. Examination also included, if assessed, evidence of anterior or posterior ischaemic optic neuropathy, relative afferent pupillary defect, III/IV/VI nerve palsy or bruits on either side and evidence of stroke, aneurysm or other features such as scalp or tongue necrosis.

The results of laboratory tests that were required for the study protocol before starting steroids and at presentation comprised ESR, CRP level and/or plasma viscosity. Additional tests included measurement of full blood count, haemoglobin, biochemistry, ANCA and urine dipstick testing if there was a clinical indication. Data were also collected on whether or not, and when, treatment with high-dose glucocorticoids for suspected GCA had been started, the route and dose and any treatment with an immunosuppressant agent. The patient was asked to complete a EuroQol-5 Dimensions (EQ-5D) 3-levels questionnaire at the assessment.⁶⁵ EQ-5D is a generic measure of health-related quality of life, necessary for the calculation of the cost-effectiveness of the two main diagnostic tests.

Patient assessment at 2 weeks and 6 months

The biopsy and ultrasound tests were completed prior to the patient assessment at 2 weeks. The results of the biopsy were provided to the clinician before the 2-week assessment but the ultrasound results were not shown. The 2-week assessment included the clinician’s assessment of the biopsy report and whether or not the biopsy was consistent with GCA. It was therefore possible for the pathologist and clinician to have different opinions on whether or not the biopsy result was consistent with GCA. The patient assessments at 2 weeks and 6 months comprised changes in current conditions and symptoms, a repeat of the physical examination performed at presentation and the results of laboratory tests.

Data for two measures of disease activity and damage were also collected at 2 weeks and 6 months. The BVAS is a validated assessment questionnaire reported by the clinician in the evaluation of disease activity in systemic vasculitis.⁶⁶,⁶⁷ It consists of a list of clinical features that commonly occur in patients with vasculitis together with a weighted score to provide a measure of severity of disease activity; it is widely used for clinical studies and is increasingly used in the clinical management of patients with small vessel vasculitis. It can be used to define how active disease is, to measure response to therapy or to define relapsing disease⁶⁶,⁶⁸ for the purpose of clinical trials. The most current validated version of the BVAS was used.⁶⁷ The VDI is a structured assessment to evaluate damage occurring in patients diagnosed with systemic vasculitis.⁶⁹ It is a record of irreversible consequences of having a diagnosis of vasculitis. Items are reported in the VDI if they have been present for at least 3 months and have occurred since the onset of vasculitis. There is no attribution to cause and it has been used in large cohorts of patients with primary systemic small vessel vasculitis.⁷⁰ Data from the BVAS and the VDI can also be used to examine the possible presence of an alternative form of vasculitis. Data were also collected on weight, blood pressure, treatment with steroids and immunosuppressive drugs, and quality of life using the EQ-5D.

At the 2-week assessment, the clinician was required to state whether or not the patient had features consistent with GCA and, if responding yes, to indicate which of the following influenced the response: symptoms, signs, blood abnormalities, biopsy or other (to be specified). If the patient’s features were not consistent with GCA then the clinician was required to give at least one alternative diagnosis. After providing the clinical diagnosis at 2 weeks, in the event that the clinician did not plan to continue high-dose glucocorticoid therapy because they did not think that the patient had GCA, they were required to contact the TABUL office for the ultrasound result. At the 6-month assessment the clinician was required to indicate if the diagnosis had changed and to indicate the influences for any patients in whom the decision was made to alter the diagnosis to GCA. At least one alternative diagnosis was required for any decision to alter the diagnosis away from GCA. The clinical CRF is shown in Appendix 7 and guidance on completion of the CRF is shown in Appendix 8.

Adverse events (AEs) and any attribution to either of the diagnostic test procedures were reported on AE CRFs (see Appendix 9). Guidance on completion of the AE CRFs is shown in Appendix 10.

The standard test: temporal artery biopsy

The standard test for GCA is TAB. This normally involves a minor surgical procedure to remove a small sample of temporal artery (the BSR recommends a minimum length of 1 cm⁵) which is examined for abnormalities by a pathologist. Guidance on the collection, processing and storage of biopsy samples is shown in Appendix 11. Sites followed their usual practice for obtaining and processing TABs. The only changes to routine practice required by TABUL were that sites were instructed to send the actual pathological slides used to make their diagnosis to the TABUL office and that, in addition to their standard reporting of biopsy results, pathologists were required to complete a study-specific CRF (see Appendices 12 and 13) to report their pathological findings. We did not require any specific information from any of the surgeons undertaking the biopsy but they were all informed that the patient had been recruited to the TABUL study.

The pathologist was required to report which side or sides the biopsy had been taken from as well as the length of the biopsy (after freezing or fixation), and a note was made of whether or not it was bifurcated. They were able to add other comments on the macroscopic appearance of the sample. For each biopsy, the staining protocol was reported. The macroscopic appearance was described and a note was made of whether or not the biopsy was from the temporal artery and which sections were cut. The presence of abnormalities in the intima (arteriosclerosis or intimal hyperplasia) and the internal elastic lamina (fragmentation or reduplication) were reported. Pathologists were required to indicate if there was an inflammatory infiltrate in the sample (and the predominant site of any inflammation) and indicate if any of the following features were present: normal areas, giant cells, calcification or any other unusual features. Data were also captured on presence and causes of complete occlusion of the vessel or presence of thrombus or evidence of recanalisation in at least one section of the vessel.

The pathological diagnosis was reported as either normal or any of the following: compatible with a diagnosis of GCA, compatible with another vasculitis, compatible with arteriosclerosis or compatible with any other diagnosis as specified by the pathologist. The actual pathological slides were sent to the TABUL office for image acquisition. Digital image acquisition was achieved using an Aperio Scanscope Turbo AT (Leica Biosystems, Buffalo Grove, IL, USA). Slides were loaded onto the machine’s autoloader and pre-snapped to obtain a macroscopic image before proceeding with digital scanning. The macroscopic image was used to set the tissue area, focal plane, focus points, white balance, scan/slide settings and labelling description. Once the settings had been optimised the slides were scanned in fragments and digitally stitched together to form one high-resolution virtual representation of the pathology slide. These virtual slides were stored on an external physical server and a web-based database (Aperio eSlideManager V1.0, Leica Biosystems) was used to archive and store the eSlides. Slides could be viewed remotely using Aperio’s web-based viewing systems (Leica Biosystems).

The biopsy result, which was the primary standard test, was defined as positive by the pathologist if the pathological diagnosis was compatible with a diagnosis of GCA. This included patients whose biopsy samples did not contain temporal artery (e.g. vein, fat, muscle or other tissue) or for whom no sample was obtained from surgery. An alternative standard test result was defined as the clinician’s interpretation of the biopsy result as reported on the clinical CRF at the 2-week assessment. This was reported because we expected that the clinician might reach a different conclusion from the pathologist, based on the biopsy report.

The main analyses included patients who had no sample from surgery or a biopsy sample that did not include temporal artery; these were categorised as not compatible with a diagnosis of GCA. Additional analyses excluded the indeterminate biopsy results.

The index test: ultrasound of the temporal and axillary arteries

The index test, an ultrasound of both temporal and both axillary arteries, was performed following the protocol described earlier, which is available on the NIHR Evaluation, Trials and Studies Coordinating Centre website (www.nets.nihr.ac.uk), and was subject to ongoing monitoring for quality assurance. The presence of ultrasound abnormalities (halo, occlusion, stenosis and arteriosclerosis) in different segments of the temporal arteries and in the axillary arteries (as defined in Table 1) was captured in the ultrasound case report form (see Appendix 1). The primary test result for ultrasound, which was used for the main analyses, was defined as positive if the sonographer responded ‘yes’ to the question ‘In your opinion are the results consistent with a diagnosis of GCA?’. Additional analyses used alternative definitions of a positive result based on the presence or absence of a bilateral halo and on the interpretation of the ultrasound images from the expert review.

Ultrasound test results: procedure for revealing test results

The clinician treating the patient was provided with the biopsy result but did not have access to the results of ultrasound at the 2-week assessment. Study sonographers were required to keep the results of each patient’s scans blinded from the managing clinician for the duration of the study. The only exception was if the managing clinician had completed the 2-week assessment and was planning to withdraw steroid treatment rapidly. In these circumstances the clinician was required to contact the TABUL office and was provided with the scan results as reported by the sonographer. The clinician then had an opportunity to reconsider their decision to withdraw steroids and alter their diagnosis. Thus, the 2-week assessment included a report of the clinician’s original assessment of the diagnosis and any revision following the revealing of the ultrasound result.

The reference diagnosis

The ideal reference diagnosis for evaluating diagnostic tests is one that is independent of the tests being evaluated. No such reference diagnosis exists for GCA for evaluating the performance of biopsy and ultrasound. Criteria for classifying GCA and usual clinical practice for reaching a GCA diagnosis incorporate the results of the biopsy; therefore, they cannot be truly independent methods for defining a reference diagnosis. Furthermore, the ACR classification criteria were not intended to be used as diagnostic criteria.³⁴ For the purposes of the study, a partially independent approach was used, which combined elements of a clinician’s final diagnosis, the ACR classification criteria (incorporating the biopsy result), the emergence of complications consistent with GCA during follow-up, the emergence of alternative vasculitis diagnoses during follow-up and expert review to determine the reference diagnosis. The process started with the clinician’s final diagnosis for the patient as reported on the 6-month (or in its absence, 2-week) assessment. An algorithm was devised to determine if evidence from the biopsy and the presence or absence of symptoms and emerging complications and diagnoses on follow-up supported the clinician’s diagnosis or if expert review was required to determine the reference diagnosis.

If the clinician’s final diagnosis was GCA, then a reference diagnosis of GCA was given if any of the following criteria were met:

a stricter version of the ACR classification criteria using either the standard or tree method was met based on the patient’s symptoms and physical examination from their baseline assessment (Table 2)
the emergence of PMR during follow-up in patients with no previous history of PMR and no symptoms of PMR at presentation
the emergence of new or worsening jaw claudication, tongue claudication, abnormal anterior optic neuropathy, abnormal posterior optic neuropathy, or relative afferent pupillary defect during follow-up.

TABLE 2

Definitions and sources of items in the ACR classification criteria

If the clinician’s final diagnosis was not GCA, then a reference diagnosis of ‘not GCA’ was given if an alternative vasculitis diagnosis was made. Alternative vasculitis diagnoses included Takayasu’s arteritis, large vessel vasculitis, polyarteritis nodosa, GPA, microscopic polyangiitis, EGPA, cryoglobulinemic vasculitis, IgA vasculitis (Henoch–Schönlein purpura) or any other vasculitis to be specified. A reference diagnosis of ‘not GCA’ was also given if all of the following criteria were met.

The patient failed to meet the ACR classification criteria using either the standard or tree methods (see Table 2).
No new-onset PMR occurred during follow-up.
No new or worsening jaw claudication, tongue claudication, abnormal anterior optic neuropathy, abnormal posterior optic neuropathy or relative afferent pupillary defect occurred during follow-up.
No symptom of reduced or lost vision in either eye occurred or worsened during follow-up.
No evidence of abnormal III/IV/VI nerve palsy or stroke on clinical examination was observed at 2 weeks or 6 months.
No sudden visual loss, cerebrovascular accident or cranial nerve palsy reported on the BVAS occurred during follow-up.
No retinal change, optic atrophy, visual impairment/diplopia, blindness in one eye, blindness in the second eye or cerebrovascular accident reported on the VDI occurred during follow-up.

Any patient who was not given a confirmed reference diagnosis based on the above was referred for expert review. Furthermore, any patient who had their diagnosis altered during follow-up (typically for a diagnosis altered to GCA from not GCA following unblinding of the ultrasound report) was automatically referred for expert review regardless of any confirmed reference diagnosis given above.
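The decision rules for the reference diagnosis can be summarised as a small function. This is a minimal sketch rather than the study's own implementation: every field name is hypothetical, each flag is assumed to have been derived beforehand from the CRF data, and the sketch assumes that an alternative vasculitis diagnosis is by itself sufficient to confirm 'not GCA' and that the 'stricter' ACR criteria may differ from the criteria used to confirm 'not GCA'.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names; each flag is assumed to be pre-derived from CRF data.
    clinician_final_gca: bool             # final diagnosis at 6 months (else 2 weeks)
    meets_strict_acr: bool = False        # stricter ACR criteria, standard or tree method
    meets_acr: bool = False               # ACR criteria, standard or tree method
    new_pmr: bool = False                 # PMR emerging during follow-up
    new_ischaemic_feature: bool = False   # new/worsening claudication, optic neuropathy, RAPD
    other_gca_complication: bool = False  # vision loss, nerve palsy, stroke, BVAS/VDI items
    alternative_vasculitis: bool = False  # e.g. Takayasu's arteritis, GPA
    diagnosis_altered: bool = False       # diagnosis changed during follow-up


def reference_diagnosis(p: Patient) -> str:
    """Return 'GCA', 'not GCA' or 'expert review'."""
    # An altered diagnosis always goes to expert review.
    if p.diagnosis_altered:
        return "expert review"
    if p.clinician_final_gca:
        # Any one supporting criterion confirms GCA.
        if p.meets_strict_acr or p.new_pmr or p.new_ischaemic_feature:
            return "GCA"
        return "expert review"
    if p.alternative_vasculitis:
        return "not GCA"
    # Otherwise every exclusion criterion must hold to confirm 'not GCA'.
    any_support = (
        p.meets_acr
        or p.new_pmr
        or p.new_ischaemic_feature
        or p.other_gca_complication
    )
    return "expert review" if any_support else "not GCA"
```

For example, `reference_diagnosis(Patient(True, new_pmr=True))` returns `'GCA'`, whereas a patient with a clinician diagnosis of GCA and no supporting criterion is referred for expert review.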

The expert review group comprised five rheumatologists involved in the study. Each case requiring expert review was independently assessed by three of the five rheumatologists, and no rheumatologist could review cases from their own site. A summary report for each patient was extracted from the clinical data and included information on symptoms, GCA-related complications, items from the ACR classification criteria and the clinician’s diagnosis. Access to the clinical database was also given so that expert reviewers could examine all data collected as part of the study with the exception of the ultrasound results. Each expert reviewer independently reported their agreement or disagreement with the clinician’s final diagnosis. The clinician’s final diagnosis was supported if at least two of the experts agreed with the diagnosis. The clinician’s diagnosis was altered if all three experts disagreed with the diagnosis. If two experts disagreed with the clinician’s diagnosis then the patient was discussed by the relevant experts during a moderated teleconference until the three experts reached a consensus.
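The voting rule for the three expert reviewers can be written out explicitly. The sketch below is illustrative only; the function name and return labels are hypothetical.

```python
from typing import Sequence


def expert_review_outcome(agreements: Sequence[bool]) -> str:
    """Apply the three-reviewer rule to agree/disagree votes.

    Each element is True if that expert agreed with the clinician's
    final diagnosis.
    """
    if len(agreements) != 3:
        raise ValueError("exactly three expert votes are expected")
    n_agree = sum(bool(a) for a in agreements)
    if n_agree >= 2:
        return "supported"   # at least two experts agree
    if n_agree == 0:
        return "altered"     # all three experts disagree
    # Exactly two experts disagree: resolved by moderated teleconference.
    return "consensus teleconference"
```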

Inter-rater agreement data collection and analysis

The aim of the inter-rater agreement component of the study was to assess the extent of agreement between trained sonographers in their interpretation of ultrasound videos, and between experienced pathologists in their interpretation of biopsy images, for a sample of cases using data, videos and images from patients recruited to TABUL. Sonographers and pathologists assessed the same cases using a web-based exercise. Intrarater agreement was also assessed by repeating cases in the exercise. The impact of providing additional information about the patient was examined by including a brief vignette.

All pathologists and sonographers who assessed patients in the main TABUL study were asked to complete a web-based exercise. The exceptions were pathologists and sonographers who were involved in the management of TABUL or in the expert review of ultrasound for quality control (two pathologists and four sonographers). Pathologists and sonographers who agreed were sent instructions for completing the exercise and a password to access the exercise.

The overall design involved a web exercise with 44 cases. Each case represented a patient recruited to TABUL and comprised ultrasound videos of both temporal arteries, scanned images of the biopsy slide and a brief clinical vignette describing the patient. The first five cases were defined as training/practice cases that allowed raters to familiarise themselves with the exercise. The remaining cases, the rating cases, consisted of 30 unique cases, six repeats of unique cases (for intrarater assessment) and three reserve cases. The reserve cases were available to replace any of the 30 cases that were subsequently found to be ineligible once the exercise had started. The overall number of cases was chosen to keep the task manageable, and the aim was to have at least 10 pathologists and 10 sonographers complete the exercise. This was to allow results to be generalised to the wider populations of pathologists and sonographers.

The criteria for including a patient’s videos and biopsy images as rating cases in the exercise were ultrasound videos of adequate quality of the right and left temporal arteries, biopsy slides received and scanned, inclusion of the patient in the main TABUL analyses and patient consent for the use of the images. Cases were ineligible if the biopsy specimen did not consist of artery or if the ultrasound was abnormal owing to axillary artery involvement without temporal artery abnormalities. Cases were also ineligible if the biopsy images or ultrasound videos included information that identified the patient or clinician involved or included markings indicating abnormality and this information could not be removed or hidden. Finally, cases were excluded if the quality of the ultrasound images was judged to be poor by expert review during quality control. Disagreement with the original sonographer’s interpretation by expert review or difficulty in interpreting the ultrasound by expert review despite adequate quality videos were not criteria for exclusion.

Identification of cases was performed in three stages before the start of the exercise because the main TABUL database and the ultrasound and biopsy databases had not been locked at the time of initial selection and because of the work involved in ensuring that videos and images were eligible. The first stage involved identifying potentially eligible cases from the list of patients recruited to the main study who had had their ultrasound videos uploaded. This list of potentially eligible cases was ordered using random numbers generated using Stata version 13 (StataCorp LP, College Station, TX, USA). The second stage involved populating the 33 rating cases from the top of the list. Any case found to be ineligible was replaced with the next available case from the list. This process was repeated until all 33 rating cases were deemed eligible. The third stage involved pilot testing of the exercise and review of all videos and images by two pathologists (BM, KW) and two sonographers (WAS, JP) to ensure that the criteria relating to the videos and images were met. The five training cases were selected purposively starting at the bottom of the ordered list. These were selected to ensure that there were at least two abnormal and two normal cases for the biopsy images and for the ultrasound videos. A final post-exercise stage involved a further eligibility check of the rating cases against the locked database. Any of the 30 rating cases subsequently found to be ineligible were replaced with one of the three reserve rating cases for inclusion in the analyses.

A web-based exercise was designed to allow remote access to the videos and images and to capture data from the assessments made by the sonographers and pathologists. Each case in the exercise began by giving access to two video images showing left and right temporal arteries (for sonographers) or one biopsy slide image for each stain available (for pathologists). Videos could be replayed as often as required and biopsy images allowed zooming for magnification at the equivalent of up to 40 times in high resolution. Raters were asked to answer yes or no to the question ‘In your opinion, do the ultrasound (or pathology) images show features of GCA?’ and to answer certain or uncertain in response to the question ‘How certain are you?’. They then gave their answers and confirmed that they were confident to submit their answers.

All cases were rated before and after seeing a brief clinical vignette describing the patient. The vignette was added to reflect a more realistic scenario for interpreting the videos or images. For example, the sonographer would see the patient in front of them when conducting temporal and axillary artery ultrasound. The pathologist might receive a brief description of the patient on the biopsy request form. The vignettes provided basic information on age, sex, glucocorticoid treatment, comorbidity, presenting symptoms and laboratory test results, for example ‘79 year old male started glucocorticoid therapy 2 days ago for suspected GCA. Patient has hypertension. Presented with new localised pain in head, jaw claudication and reduced vision. Elevated ESR and CRP’. The vignette was identical for the ultrasound and biopsy versions of each case except for the duration of glucocorticoid therapy (which varied depending on when the test was done). For repeat cases, the core information was identical to the original case but the order of wording was altered.

Cases had to be completed in order and rating cases could not be started until all five training cases had been completed. Once a rating case had been completed it was not possible to return to that case to view the videos or images or to look at the answers given. This was because six of the cases were repeated. It was possible to return to the training cases for reference. The locations of repeated cases in the 36 rating cases were assigned before the random ordering of eligible cases. Repeated cases all made their first appearance in the first 18 cases and all made their second appearance in the final 18 cases. For each of the six repeated cases there was a minimum gap of 16 cases between its first and second appearances.
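The placement constraints on repeated cases can be checked mechanically. The following sketch is hypothetical and is not the procedure used to build the exercise; in particular it assumes that a 'gap of 16 cases' means at least 16 other cases lie between the two appearances.

```python
from collections import defaultdict


def check_repeat_positions(order, n_repeats=6, min_gap=16):
    """Check a sequence of rating-case IDs against the placement rules.

    Assumption: the gap is counted as the number of intervening cases.
    """
    positions = defaultdict(list)
    for idx, case_id in enumerate(order):
        positions[case_id].append(idx)

    half = len(order) // 2
    repeated = [p for p in positions.values() if len(p) > 1]
    if len(repeated) != n_repeats:
        return False
    for first, *rest in repeated:
        if len(rest) != 1:
            return False      # a case may appear at most twice
        second = rest[0]
        if first >= half or second < half:
            return False      # first appearance in first half, second in final half
        if second - first - 1 < min_gap:
            return False      # too few cases between the two appearances
    return True
```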

Clinical vignettes data collection and analysis

The aim of the assessment of the clinical vignettes was to determine what decision about a patient’s diagnosis and treatment would have been made if there was no biopsy performed, leaving the clinician to rely on the results of the ultrasound. Two overlapping samples of cases were selected from patients recruited to the study. The first sample was the same random sample used for the assessment of inter-rater agreement. The second sample comprised all patients in the main study who had a positive biopsy and a negative ultrasound.

Clinical vignettes were structured to provide data on the patients at the times when two key decisions are made. The first is on initial presentation, when the possibility of a diagnosis of GCA is considered and a decision is taken to recommend a TAB. The second is after 2 weeks, when a decision to continue or withdraw high-dose steroids for GCA is made. Vignettes were populated with data collected during the study. Information provided at presentation comprised the patient’s age, sex, relevant current conditions and medical history, symptoms, symptom onset and any laboratory test results (ESR, CRP level or ANCA) prior to starting steroids, duration and dose of steroids, new symptoms and symptoms still present at presentation, results of the physical examination at presentation and any laboratory test results (ESR, CRP level or ANCA) at presentation. Clinicians were then asked to give their indication of the likelihood of the patient having GCA (definite, probable, possible or not GCA) and indicate whether or not, in the absence of alternative tests such as ultrasound, they would recommend this patient for a TAB.

The information at 2 weeks was presented once responses to the questions had been confirmed. Information on the vignettes comprised the results of the ultrasound test and information about the patient’s health after 2 weeks. The ultrasound test was reported as either consistent or not consistent with a diagnosis of GCA and included additional information on any abnormalities identified on ultrasound, for example ‘consistent with GCA; halo on right temporal artery; normal left temporal artery; normal axillary arteries; no occlusion or stenosis’. Other information comprised symptoms present at 2 weeks (categorised by new, worse, no change, better and resolved), results of the physical examination at 2 weeks, results from laboratory tests and any changes in current conditions. Clinicians were then asked to give their indication of the likelihood of the patient having GCA (definite, probable, possible or not GCA) and to indicate the appropriateness of continuing to treat the patient with high-dose steroids for GCA on a nine-point scale (1, extremely inappropriate; 5, uncertain; 9, extremely appropriate).

Data on the appropriateness of continuing treatment with high-dose steroids were categorised as appropriate, inappropriate or uncertain using the method outlined in The Rand/UCLA Appropriateness Method User’s Manual.⁷¹ A panel median of 7–9 without disagreement is considered appropriate, a panel median of 4–6 or any median with disagreement is categorised as uncertain, and a panel median of 1–3 is categorised as inappropriate. Disagreement was determined using the interpercentile range adjusted for symmetry and the common approach of rounding up medians of 3.5 and 6.5 was applied.⁷¹
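The categorisation of panel ratings can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the disagreement flag is assumed to have been computed separately using the interpercentile range adjusted for symmetry, which is not reproduced here.

```python
import math
from statistics import median


def classify_appropriateness(ratings, disagreement):
    """Categorise a panel's 1-9 ratings as appropriate, uncertain or inappropriate.

    `disagreement` is assumed to be determined elsewhere.
    """
    # Round half up, so that medians of 3.5 and 6.5 become 4 and 7.
    panel_median = math.floor(median(ratings) + 0.5)
    if disagreement:
        return "uncertain"
    if panel_median >= 7:
        return "appropriate"
    if panel_median <= 3:
        return "inappropriate"
    return "uncertain"
```

With this rounding, a panel median of 6.5 without disagreement is categorised as appropriate and a panel median of 3.5 as uncertain.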

Statistical analysis

The statistical analyses of the diagnostic accuracy of TAB and ultrasound were specified in the statistical analysis plan (see Appendix 14). Sensitivities and specificities were calculated for TAB and ultrasound in comparison with the gold standard reference diagnosis. The kappa statistic was used to assess agreement between TAB and ultrasound, and McNemar’s test was used to detect systematic discordance.
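The quantities named above can be computed from 2 × 2 tables of counts. The sketch below uses plain Python with hypothetical function names; the analyses themselves were run in Stata, and the McNemar statistic shown is the uncorrected chi-squared form, which is an assumption because the text does not state which variant was used.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity of a test against the reference diagnosis."""
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(a, b, c, d):
    """Kappa for two binary tests.

    a = both positive, b = first positive only,
    c = second positive only, d = both negative.
    """
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


def mcnemar_chi2(b, c):
    """Uncorrected McNemar statistic from the two discordant counts (1 df)."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    return (b - c) ** 2 / (b + c)
```

Kappa describes overall agreement between the two tests, whereas the McNemar statistic uses only the discordant pairs and so indicates whether one test is positive systematically more often than the other.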

The inter-rater agreement between sonographers and between pathologists was evaluated using a two-way random-effects analysis of variance to estimate the intraclass correlation coefficients for agreement with 95% CIs. Both cases and raters were treated as random effects in order to generalise findings to all cases (from the sample selected) and to the potential population of trained sonographers (from the sample of sonographers doing the exercise). Intrarater agreement was evaluated by estimating kappa statistics for agreement and by examining agreement for the individual repeated cases and raters.
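A two-way random-effects intraclass correlation coefficient for agreement can be obtained from the analysis-of-variance mean squares. The sketch below computes the single-rater, absolute-agreement form (often written ICC(2,1)); the choice of the single-rater form is an assumption, confidence intervals are omitted, and the function name is hypothetical.

```python
def icc_agreement(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per case, each holding one rating per
    rater (for example 1 = features of GCA, 0 = no features of GCA).
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-case mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because the between-rater mean square appears in the denominator, systematic differences between raters lower this coefficient, which is what distinguishes agreement from consistency.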

Statistical analysis was performed in Stata versions 12 and 13.

Pre-test probability of giant cell arteritis: definition of risk categories

The availability of data from the DCVAS study provided an opportunity to define categories of pre-test risk of a GCA diagnosis from an independent sample of patients and was used in preference to obtaining expert opinion elicited from clinical vignettes.⁴¹ Data on 585 patients recruited to centres not participating in TABUL, and who had had a TAB, were used to derive definitions for high-, medium- and low-risk groups. The high-risk group was defined as patients with (1) claudication of the jaw or tongue and (2) elevated ESR or CRP level (ESR of at least 60 mm/hour or CRP level of at least 40 mg/l) either at pre-steroids or at presentation assessments. The low-risk group was defined as patients (1) without jaw or tongue claudication and (2) without elevated ESR or CRP level at both the pre-steroids and presentation assessments of symptoms and laboratory tests. The remaining patients were categorised as medium risk.
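The three risk groups can be expressed as a short classification function. This is a sketch only: the function and argument names are hypothetical, claudication is assumed to be recorded as present if reported at either assessment, and missing laboratory values are assumed to count as not elevated.

```python
ESR_THRESHOLD = 60.0  # mm/hour
CRP_THRESHOLD = 40.0  # mg/l


def risk_category(claudication, esr_values, crp_values):
    """Assign the pre-test risk group.

    claudication: jaw or tongue claudication at either assessment.
    esr_values, crp_values: pre-steroid and presentation results,
    with None for a missing value (treated as not elevated: an assumption).
    """
    elevated = any(
        v is not None and v >= ESR_THRESHOLD for v in esr_values
    ) or any(
        v is not None and v >= CRP_THRESHOLD for v in crp_values
    )
    if claudication and elevated:
        return "high"
    if not claudication and not elevated:
        return "low"
    return "medium"
```

For example, a patient without claudication whose only elevated marker is a CRP level of 40 mg/l falls into the medium-risk group.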

Changes to the study protocol

There were two substantial amendments to the study protocol. The first amendment was made in February 2011 and comprised the following key changes.

To alter the decision always to offer each potential participant 24 hours to consider their participation in the study. This amendment was made because there were some circumstances in which treatment may be delayed while waiting for consent, for example in an emergency (to minimise delay in normal care such as performing the biopsy) or when sites are able to provide a fast turnaround time for performing the biopsy. In these circumstances we offered the opportunity for participants to provide full written informed consent in < 24 hours from receiving information about the study.
To provide further clarification on the collection of additional blood and biopsy samples during the course of the study.

The second amendment was made in February 2013 and comprised the following key changes.

To increase the target sample size for recruitment from 430 to 435–445 (with 402 completing the primary end point).
To extend the recruitment period by 12 months.
To clarify the recruitment strategy (including the production of a poster summarising the study for use in non-patient areas).
To clarify the process for the managing clinician to contact the TABUL office in order to be given the ultrasound result (i.e. to unblind the ultrasound result).
To allow inclusion of patients in whom the biopsies were performed more than 7 days after starting high-dose glucocorticoids because of safety concerns about when the biopsy could be performed, for example to allow discontinuation of warfarin so that it was safe to perform the biopsy. This would be part of standard care for any patient who required a biopsy but was receiving warfarin.

Copyright © Queen’s Printer and Controller of HMSO 2016. This work was produced by Luqmani et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK401237
