Methods

Suzanne Belinson; Ryan Chopra; Yoojung Yang; Veena Shankaran; Naomi Aronson

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Belinson S, Chopra R, Yang Y, et al. Local Hepatic Therapies for Metastases to the Liver From Unresectable Colorectal Cancer [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Dec. (Comparative Effectiveness Reviews, No. 93.)

This publication is provided for historical reference only and the information may be out of date.

In this chapter, we document the procedures that the Blue Cross and Blue Shield EPC used to produce a CER on the effectiveness and comparative effectiveness of local hepatic therapies for CRC metastases to the liver. The methods for this CER follow the methods suggested in the AHRQ Methods Guide for Effectiveness and Comparative Effectiveness Reviews (available at www.effectivehealthcare.ahrq.gov/methodsguide.cfm).

The main sections in this chapter reflect the elements of the protocol established for the CER; certain methods map to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist.⁵³ We first describe the topic refinement process and the construction of the review protocol. We then present our strategy for identifying articles relevant to our KQs, our inclusion and exclusion criteria, and the process we used to extract information from the included articles and generate our evidence tables. In addition, we discuss our method for grading the quality of individual articles, rating the strength of the evidence, and assessing the applicability of individual studies and the body of evidence for each KQ. Finally, we describe the peer review process. All methods and analyses were determined a priori and documented in a research protocol that was publicly posted by AHRQ for comments.

Given the clinical complexity of this topic and the evolution of the scope and the KQs, we sought input from the TEP throughout the process. In some cases, this was done through joint teleconferences; in other cases, we contacted TEP members individually to draw on each member’s particular expertise.

Topic Refinement and Review Protocol

The topic for this report was nominated in a public process. With input from technical experts, the EPC drafted the initial KQs and, after approval from AHRQ, posted them to a public Web site. The KQs were posted for 4 weeks for public comment. We modified the KQs and the key elements of PICOTS based on these comments and discussion with the TEP.

When the KQs were first written, both the questions and the interventions were stratified by intent of treatment (palliative or curative). However, this stratification seemed clinically inappropriate and potentially confusing because some interventions could be applied both to palliate symptoms and to eliminate (i.e., cure) the liver metastases. Thus, the final KQs are distinguished by the population receiving local hepatic therapy. KQs 1 and 2 apply to patients whose CRC is refractory to systemic chemotherapy (i.e., their disease had progressed), and KQs 3 and 4 apply to patients who are receiving local hepatic therapy and systemic chemotherapy. To be consistent with clinical practice, we modified KQs 1 and 2 to include patients with minimal extrahepatic disease. In addition, we categorized the 12 interventions to apply to all KQs, we removed some interventions, and we added SBRT. Finally, we expanded the list of harms to be considered to include elevated alkaline phosphatase, elevated bilirubin, elevated transaminases, liver failure, and rare adverse events that had not been considered originally.

Literature Search Strategy

Search Strategy

We searched MEDLINE, Embase, and the Cochrane Library. Our search strategy used the National Library of Medicine’s Medical Subject Headings (MeSH®) keyword nomenclature developed for MEDLINE and adapted for use in other databases. We limited the searches to the English language⁵⁴ but did not limit the search by geographic location of the study. Evidence suggests that language restrictions do not change the results of systematic reviews for conventional medical interventions.⁵⁵ We also restricted the searches to articles reporting on patients treated between January 1, 2000, and June 27, 2012, primarily to ensure the applicability of the interventions and outcomes data to current clinical practice. Before 2000, some interventions were in their infancy and, by current standards, used outdated regimens.⁵⁶⁻⁵⁸ Thermal therapies were not used significantly until the late 1990s, and major changes in proton beam and stereotactic therapy occurred during the same period.⁵⁹ Chemoembolization drugs and embolic mixtures have also changed a great deal in the last ten years and are more standardized now. For these reasons, which were strongly supported by the TEP, we excluded studies in which patient treatment preceded 2000.

We searched for the following publication types: RCTs, nonrandomized comparative studies, and case series. We used the following search terms for the diseases in question: CRC, metastases, and unresectable liver tumors. Appendix A gives the major search strings, including all the terms used for the interventions of interest.

We searched the gray literature for clinical trials, material published on the U.S. Food and Drug Administration Web site, and relevant conference abstracts identified by TEP members (from the American Society of Clinical Oncology, American Society of Clinical Oncology Gastrointestinal Cancers, Society of Surgical Oncology, and Radiosurgery Society). We also reviewed scientific information packets that the Scientific Resource Center had requested and obtained from relevant pharmaceutical or device firms.

Originally, we had intended to contact study authors only if the EPC staff believed that the evidence could meaningfully affect results (i.e., alter eventual grades of the strength of evidence). However, because of the limited number of studies included in this report, we elected to contact authors for any article lacking complete information on patient characteristics, interventions, or outcomes. A listing of the contacted authors is included in Appendix B.

Inclusion and Exclusion Criteria

Table 3 lists the inclusion/exclusion criteria we selected based on our understanding of the literature, key informant and public comments gathered during the topic refinement phase, input from the TEP, and established principles of systematic review methods.

Study Selection

Search results were transferred to EndNote® and subsequently into DistillerSR (Evidence Partners Inc., Ottawa, Canada) for selection. Using the study selection criteria for screening titles and abstracts, each citation was marked as: (1) eligible for review as full-text articles or (2) ineligible for full-text review. Reasons for article exclusions at this level were not noted. The first-level title-only screening was performed in duplicate. To be excluded, a study needed to be independently excluded by both team members. In cases where there was disagreement, second-level abstract screening was completed by two independent reviewers.

Discrepancies were decided by consensus opinion and a third reviewer was consulted when necessary. All team members were trained using a set of 50 abstracts to ensure uniform application of screening criteria. Full-text review was performed if it was unclear whether the abstract met article-selection criteria.
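The title-level decision rule described above can be sketched in a few lines of Python. This is only an illustration, not the DistillerSR workflow itself: the function name and the returned labels are hypothetical, and the handling of citations that both reviewers retained (routing them straight to full-text review) is an assumption, because the text specifies only the exclusion and disagreement cases.

```python
def title_screen_decision(reviewer_a_excludes: bool, reviewer_b_excludes: bool) -> str:
    """Return the next step for one citation after duplicate title screening.

    Labels are hypothetical; they are not DistillerSR field names.
    """
    # A citation is dropped only when both reviewers independently exclude it.
    if reviewer_a_excludes and reviewer_b_excludes:
        return "excluded"
    # Any disagreement sends the citation to second-level abstract screening.
    if reviewer_a_excludes != reviewer_b_excludes:
        return "abstract_screening"
    # Assumption: a citation retained by both reviewers is eligible for
    # full-text review.
    return "full_text_review"


if __name__ == "__main__":
    for a in (True, False):
        for b in (True, False):
            print(a, b, title_screen_decision(a, b))
```

The rule is deliberately asymmetric: a single reviewer can keep a citation in play, but cannot remove it alone, which favors sensitivity over specificity at the earliest screening level.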

Full-text articles were reviewed in the same fashion to determine their inclusion in the systematic review. Records of the reason for exclusion for each paper retrieved in full-text, but excluded from the review, were maintained in the DistillerSR database. Although an article may have been excluded for multiple reasons, only the first reason identified was recorded.

Development of Evidence Tables and Data Extraction

Evidence tables were constructed by clinical content experts and staff at the EPC. Tables were designed to provide sufficient information to enable readers to understand the studies and determine their quality. Emphasis was given to data elements essential to our KQs. Evidence table templates were identical for KQ1 and KQ3, and for KQ2 and KQ4. The format of our evidence tables was based on examples from prior systematic reviews.

Data extraction was performed directly into tables created in DistillerSR, with elements defined in an accompanying data dictionary. All team members extracted a training set of five articles into evidence tables to ensure uniform extraction procedures and test the utility of the table design. All data extractions were performed in duplicate, with discrepancies resolved by consensus. The full research team met regularly during the period of article extraction to discuss any issues related to the extraction process. Extracted data included patient and treatment characteristics, outcomes related to intervention effectiveness, and information on harms. Harms included specific negative effects, including the narrower definition of adverse effects. Data extraction forms used during this review are presented in Appendix C.

The final evidence tables are presented in their entirety in Appendix D. Studies are presented in the evidence tables by study design, then by year of publication, and then alphabetically by the last name of the first author. Abbreviations and acronyms used in the tables are listed as table notes and are presented in Appendix E.

Risk of Bias Assessment of Individual Studies

For the assessment of risk of bias in individual studies, we followed the Methods Guide³⁸ where applicable. Our assessment of risk of bias in the included case-series intervention studies was based on a set of study characteristics proposed by Carey and Boden.⁶⁰ These characteristics include: clearly defined study questions, well-described study population, well-described intervention, use of validated outcome measures, appropriate statistical analyses, well-described results, discussion and conclusion supported by data, and acknowledgement of the funding source. The Carey and Boden assessment tool does not conclude with an overall score of the individual study. We created thresholds for converting the Carey and Boden⁶⁰ risk assessment tool into AHRQ standard quality ratings (good, fair, and poor) to differentiate case-series studies of varied quality. These distinctions are to be used for differentiation within the group of case-series studies, but not for the overall body of evidence described below. The classification into these categories (i.e., good, fair, poor) is distinct for a specific study design. Other study designs are evaluated according to their own strengths and weaknesses.

For a study to be ranked as good quality, each of the Carey and Boden⁶⁰ criteria must have been met. For a fair quality rank, one criterion was not met, and a rank of poor quality was given to studies with more than one criterion not met. These quality ranking forms can be found in Appendix D.
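As a minimal sketch, the threshold rule for case series can be written as a short Python function. The criterion keys below are hypothetical labels for the eight Carey and Boden characteristics listed in this section; they are not field names from the actual quality ranking forms.

```python
from typing import Mapping

# Hypothetical keys for the eight characteristics named in the text.
CAREY_BODEN_CRITERIA = (
    "clearly_defined_study_question",
    "well_described_study_population",
    "well_described_intervention",
    "validated_outcome_measures",
    "appropriate_statistical_analyses",
    "well_described_results",
    "discussion_supported_by_data",
    "funding_source_acknowledged",
)


def quality_rating(criteria_met: Mapping[str, bool]) -> str:
    """Convert per-criterion judgments for one case series into good/fair/poor."""
    missing = [c for c in CAREY_BODEN_CRITERIA if c not in criteria_met]
    if missing:
        raise ValueError(f"no judgment recorded for: {missing}")
    unmet = sum(1 for c in CAREY_BODEN_CRITERIA if not criteria_met[c])
    if unmet == 0:
        return "good"   # every criterion met
    if unmet == 1:
        return "fair"   # exactly one criterion not met
    return "poor"       # more than one criterion not met


if __name__ == "__main__":
    all_met = {c: True for c in CAREY_BODEN_CRITERIA}
    print(quality_rating(all_met))
    print(quality_rating({**all_met, "funding_source_acknowledged": False}))
```

Raising an error when a judgment is missing, rather than counting it as unmet, keeps an incomplete form from being silently downgraded; that choice is an assumption of this sketch.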

Data Synthesis

Evidence tables were completed for all included studies, and data are presented in summary tables. Evidence is also presented in text organized by outcome and intervention. No direct comparisons are made. We considered whether formal data synthesis (e.g., meta-analysis) would be possible from the set of included studies. Because the literature was so heterogeneous in terms of the populations (e.g., prior treatments, reason for unresectability and number and size of lesions) and interventions (e.g., drugs and dose) studied, we concluded that pooling data would be inappropriate for this review. Thus, all data synthesis is based on qualitative summaries and analyses.

Strength of the Body of Evidence

We graded the strength of the overall body of evidence for overall survival, quality of life, and harms for the four KQs. We used the EPC approach (developed for the EPC program and referenced in the Methods Guide³⁸,⁶¹), which is based on a system developed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group.⁶² This system explicitly addresses four required domains: risk of bias, consistency, directness, and precision.

The overall strength of evidence could be graded as “high” (indicating high confidence that the evidence reflects the true effect, and that further research is very unlikely to change our confidence in the estimate of effect); “moderate” (indicating moderate confidence that the evidence reflects the true effect, and that further research may change our confidence in the estimate of effect and may change the estimate); “low” (indicating low confidence that the evidence reflects the true effect, and that further research is likely to change our confidence in the estimate of effect and is likely to change the estimate); or “insufficient” (indicating that evidence is either unavailable or does not permit estimation of an effect).

Two independent reviewers rated all studies on domain scores and resolved disagreements by consensus discussion; the same reviewers also used the domain scores to assign an overall strength of evidence grade. When evidence was available but the effects could not be estimated from the body of evidence, the overall strength of evidence was rated as “insufficient.” If we could estimate comparative effects, we graded the evidence as “low,” indicating our low level of confidence in the estimates. This decision was based in large part on the biases inherent in a literature base comprising case-series studies. In this review, consistency of the body of literature was graded as “not applicable.” The direction of effect cannot be assessed in noncomparative studies; therefore, consistency in the direction of effect across case series cannot be discerned. In the absence of a comparator, we do not know if the observed estimate is better or worse; therefore, we concluded that consistency was not applicable. Directness pertains to whether the evidence links the interventions directly to a health outcome. Because of the absence of direct comparisons, precision was rated as imprecise.

Assessing Applicability

Applicability of the results presented in this review was assessed in a systematic manner using the PICOTS framework. Assessment included both the design and execution of the studies, as well as their relevance to the target populations, interventions, and outcomes of interest.

Tables

Table 3. Inclusion and exclusion criteria

Category	Criteria
Study population	Patients with primary CRC and unresectable liver metastases due to lesion characteristics or underlying comorbidity. For KQ1 and KQ2: patients refractory to systemic chemotherapy. For KQ3 and KQ4: patients receiving local hepatic therapy as an adjunct to systemic chemotherapy.
Time period	Studies with treatment dates after 2000 to represent current interventional approaches to local hepatic therapies
Publication languages	English only
Admissible evidence	Study designs: all study designs; case reports that report on a rare adverse event. Other criteria: extrahepatic disease permitted only if it is liver dominant; studies must involve one or more of the interventions listed in the PICOTS; studies must include at least one outcome measure listed in the PICOTS, and the outcome must be extractable from data presented in the articles; to allow for the inclusion of all potentially relevant evidence, studies that deviated from our inclusion criteria by less than 10% were included (e.g., 5% of patients were HCC, or 9% of patients had documented extrahepatic disease).

CRC = colorectal cancer; KQ = Key Question; PICOTS = population, intervention, comparator, outcome, timing, setting
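The 10% deviation allowance in Table 3 can be illustrated with a small Python check. This is a sketch under stated assumptions: the function name is hypothetical, the deviation is taken to be the proportion of a study's patients who fall outside a criterion (as in the table's examples), and "less than 10%" is read as a strict inequality.

```python
def within_deviation_allowance(n_deviating: int, n_total: int,
                               threshold: float = 0.10) -> bool:
    """Return True if the share of patients outside a criterion is below the threshold."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_deviating <= n_total:
        raise ValueError("n_deviating must be between 0 and n_total")
    # Strict comparison: a study with exactly 10% deviating patients is not admitted.
    return n_deviating / n_total < threshold


if __name__ == "__main__":
    print(within_deviation_allowance(5, 100))   # 5% of patients deviate
    print(within_deviation_allowance(9, 100))   # 9% of patients deviate
    print(within_deviation_allowance(10, 100))  # exactly 10% deviate
```

How the reviewers treated a study sitting exactly at 10%, or one deviating on more than one criterion at once, is not stated in the text, so both cases are left to the caller here.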

Bookshelf ID: NBK115727
