Yank V, Tuohy CV, Logan AC, et al. Comparative Effectiveness of In-Hospital Use of Recombinant Factor VIIa for Off-Label Indications vs. Usual Care [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2010 May. (Comparative Effectiveness Reviews, No. 21.)

Methods

Topic Development

The topic for this comparative effectiveness review (CER) was nominated in a public process that solicited input from professional societies, health systems, employers, insurers, providers, consumer groups, and manufacturers, amongst others. The draft Key Questions were developed by the Scientific Resource Center (SRC) on behalf of the AHRQ Effective Health Care Program and, after approval from AHRQ, were posted on a public website for public commentary. After reviewing the public commentary, as well as input from experts and stakeholders, the SRC made further revisions to the Key Questions. The Key Questions were then presented to the Stanford-UCSF Evidence-based Practice Center (EPC), and minor revisions were made on the basis of joint discussions between the Stanford-UCSF EPC, Technical Expert Panel (TEP), AHRQ, and the SRC.

Framework for Analyzing Outcomes for rFVIIa Use

Our analytic framework for evaluating the off-label use of rFVIIa is shown in Figure 1. The figure represents the trajectory of a patient who receives rFVIIa at some point during inpatient medical care. The first possible time of drug administration is in the case of prophylactic use (to limit blood loss) during a potentially bloody surgery, such as liver transplantation or cardiac surgery. The second possible time of drug administration is in the case of treatment use, which occurs as an attempt to arrest ongoing bleeding and is employed in numerous clinical scenarios, including intracranial hemorrhage and trauma. The final possible time of drug administration is in the case of end-stage use, as a last-ditch effort to salvage a patient who is dying from massive hemorrhage and for whom other interventions have failed. Repeat doses of rFVIIa are possible during any of the above applications, for example during a long surgery or for ongoing hemorrhage.

Figure 1. Framework for analyzing outcomes for rFVIIa use.

Thick horizontal arrows near the top of the figure represent the overlap between the Key Questions (KQs) addressed by this report and the different types of rFVIIa use described above. For example, the bar representing KQ1 (Overall use of rFVIIa) spans the entire range of potential uses—prophylaxis, treatment, and end-stage—whereas the bar representing KQ2 (intracranial hemorrhage) encompasses only treatment use.

At the right side of the figure are examples of potential outcomes of rFVIIa use. These encompass a range, from indirect outcomes (process/resource use or intermediate/surrogate outcomes, which are perhaps the easiest to measure but are not always closely connected to patient status) to direct clinical endpoints such as death, adverse events, or functional outcome (which are the most relevant to patient well-being but are often more difficult to measure or occur less frequently than the other outcomes). An important point to note in relation to studies of rFVIIa is that some of them presuppose the plausible, but as yet unproven, assumption that cessation of bleeding, as measured via intermediate endpoints such as blood loss or transfusion requirements, is associated with improvements in direct outcomes, such as mortality. One of the goals of this analytic framework, and indeed of the effectiveness reviews, is to evaluate whether this assumption (that improvements in intermediate outcomes are linked to improvements in direct outcomes) is substantiated by the evidence. Ideally, the report would focus primarily on direct clinical outcomes for each of the Key Questions, but this is not always possible, given that the studies and other data sources may report only indirect outcome measures or may include few direct outcome events.

Search Strategy

Premier Database on Hospital Use of rFVIIa

We analyzed nationally representative data on patients who received rFVIIa during a hospitalization. Our analytic goals were:

  1. To provide an overview of trends in in-hospital rFVIIa use, particularly for off-label uses.
  2. To portray the range of clinical conditions for which rFVIIa has been administered in hospital.
  3. To examine the clinical and demographic characteristics of in-hospital rFVIIa users in relation to the populations studied in comparative studies.
  4. To validate the relevance of the five indications selected by AHRQ for in-depth systematic review.

We used 2000 through 2008 data from the Perspective Comparative Database of Premier, Inc. (Charlotte, NC) (subsequently referred to as the “Premier database”). The Premier database is the largest hospital-based, service-level comparative database in the country. On an annual basis, the Premier database includes information on 40 million hospitalizations occurring in 615 U.S. hospitals. The Premier database excludes federally funded (e.g., Veterans Affairs) hospitals. Otherwise, included hospitals are nationally representative based on bed size, geographic location, designation as urban versus rural, and teaching status (academic versus non-academic). The Premier database provides detailed information on the demographics, diagnoses, and resource utilization of de-identified hospitalized patients, as well as hospital and billing information (Table 1). We received data on all hospital discharges from January 2000 through December 2008 where rFVIIa use was reported. There were a total of 12,644 hospitalizations involving rFVIIa use (“cases”) that occurred in 235 hospitals within the Premier database hospital sample. In addition, we included 78 reported uses at these hospitals that did not result in a hospitalization (i.e., patients who received hospital-based outpatient treatment); these comprised 0.6 percent of total cases.

Table 1. Data variables available from the Premier database.

Each hospitalization encounter has an associated statistical weight that allows extrapolation to the volume of hospitalizations estimated for the U.S. as a whole. These weights are based on the inverse of the sampling probabilities associated with each hospital in relationship to the universe of non-federal acute care hospitals, stratified by hospital characteristics, so that the aggregate of hospitalizations approximates the number and distribution of discharges from acute care, non-federal hospitals.

Data Sources for Included Studies

At the broadest level, we sought to identify all comparative studies evaluating off-label clinical applications of rFVIIa. Delineation of this evidence base is the foundation of the overview of studies of off-label rFVIIa use. The sub-sets of identified studies that were directed at the five selected indications of intracranial hemorrhage, trauma, liver transplantation, cardiac surgery, and prostatectomy could then also be used in the comparative effectiveness reviews of each of these indications. Because of the importance of evaluating the potential for harm caused by rFVIIa, we also searched for non-comparative studies that reported on harms for the selected indications. Finally, we also sought to identify relevant systematic reviews or meta-analyses.

We searched the following databases using search strings described in detail in Appendix A: PubMed, EMBASE, Cochrane Database of Systematic Reviews, ACP Journal Club, DARE, CCTR, CMR, HTA, NHSEED, and BIOSIS through August 4, 2009. In addition, a librarian at the Scientific Resource Center, who is an expert at searching the “grey literature” (sources other than published materials indexed in bibliographic databases such as Medline), searched regulatory sites (FDA, Health Canada, Authorized Medicines for EU), clinical trial registries (ClinicalTrials.gov, Current Controlled Trials, Clinical Study Results, and WHO Clinical Trials), abstract and conference proceedings (Conference Papers Index and Scopus), grant and federally funded research sites (NIH RePORTER and HSRPROJ), and other miscellaneous sources (Hayes, Inc. Health Technology Assessment and NY Academy of Medicine’s Grey Literature Index) and also contacted the authors of abstracts regarding subsequent full publications. Because we confirmed with the manufacturer (Novo Nordisk) that all of its trials were listed on the ClinicalTrials.gov website, we did not search the European Union’s EMEA database of trials. Finally, we reviewed the manufacturer’s website and files supplied by the manufacturer, searched the bibliographies of identified meta-analyses and systematic reviews, and contacted experts in the field to identify relevant publications.

Study Selection

We applied criteria for inclusion and exclusion based on the indication for rFVIIa use, outcome measures, and types of evidence specified in the Key Questions. We retrieved full-text articles of potentially relevant abstracts to which we re-applied inclusion and exclusion criteria.

Exclusion Criteria

Abstracts only. Results published only in abstract form were not included in our analyses.

Inappropriate intervention or outcome. We excluded studies of human (rather than recombinant) FVIIa and modified forms of rFVIIa that are still under development (e.g., pegylated forms). We also excluded studies that were performed on humans but in which the outcome measures were not deemed to be clinically relevant to efficacy or effectiveness. Examples include pharmacologic studies solely directed at metabolism or half-life or studies in healthy volunteers directed at monitoring parameters such as INR or thromboelastin time. We also excluded studies that were in vitro only (i.e., performed in a laboratory setting without translation to a patient).

Clinical indication for rFVIIa use. We excluded studies of on-label applications of rFVIIa in the U.S., which include use in hemophilia A or B with inhibitors and congenital factor VII deficiency. We also excluded studies of rFVIIa applied to populations of patients that are substantially similar to those for whom on-label indications have been approved. We sought expert input from a hematologist to define these patient populations, which were determined to include: Glanzmann’s thrombasthenia (for which rFVIIa is approved in Europe), hemophilia C, von Willebrand disease, Bernard-Soulier syndrome, Hermansky-Pudlak syndrome, and other congenital bleeding disorders.

Comparison Group of “Usual Care”

Key Questions 2–4 compare the effectiveness of rFVIIa with “usual care.” Based on our initial review of the literature and discussion with experts, we noted both evolution over time and regional or hospital differences in the parameters of “usual care” for almost all of the selected indications.68 Given these differences, we were concerned that the marginal benefit of rFVIIa when added onto “usual care” might vary according to the standard of care employed. An example of such a situation might be when the baseline level of anticipated blood loss from a surgical intervention diminishes substantially over time, thus minimizing the marginal benefit of rFVIIa. For this reason, we built into our data abstraction tool a section for the prospective collection of data on the standard of care employed in each study.

Types of Evidence

Overview of comparative off-label studies. For Key Question 1, we limited our article selection to those with comparative study designs that would be expected to provide evidence on effectiveness, which included RCTs and comparative observational studies. Based on an initial review of the literature, we identified clinical categories of significant off-label rFVIIa use that were separate from the five indications that are the focus of the comparative effectiveness section of this report. We prospectively coded the studies identified in these categories, which include bleeding related to other liver disease, obstetrics/gynecology, hematology/oncology, other gastrointestinal bleeding, other surgery, and all other.

Comparative effectiveness reviews on Key Questions 2 through 4. For the comparative effectiveness review of the selected indications of intracranial hemorrhage, trauma, liver transplantation, cardiac surgery, and prostatectomy, we were especially concerned about capturing possible evidence of rare harms. For this reason, we expanded our article selection beyond comparative studies to non-comparative studies. While non-comparative data are open to many sources of bias, lack generalizability, and thus are clearly a weaker source of evidence than comparative studies, they also may report rare events not identified in RCTs. Thus, they can still be an important source of information regarding harm. The non-comparative observational studies we chose to include were registries and cohorts with at least 15 patients, because we believe that the risk of bias (e.g., selective reporting) is likely increased in small reports; the selection of 15 patients as the cut-off point was arbitrary. Table 2 provides a schematic of which study types were used to conduct which assessments in this report.

Table 2. Use of different study types for each component of this comparative effectiveness review.

Although five discrete indications were defined by the AHRQ Key Questions, our review made it evident that two of these indications were too heterogeneous for valid aggregation in the systematic review. Patients with traumatic bleeding can be divided into two distinct, albeit overlapping, groups: (1) those primarily with TBI and (2) those primarily with body trauma. Cardiac surgery, particularly as it pertains to rFVIIa use, also encompasses two populations of patients: (1) those with congenital heart defects requiring surgical correction beginning in infancy, and (2) those with cardiac problems as adults who require cardiac surgery to repair pathology generally resulting from degenerative or atherosclerotic processes. Based on these distinctions, we present a total of seven systematic reviews, one for each of the following rFVIIa indications: intracranial hemorrhage, body trauma, brain trauma, liver transplantation, adult cardiac surgery, pediatric cardiac surgery, and prostatectomy.

Independent determination of agreement on selection. To determine whether a given study met inclusion criteria, two authors independently reviewed the title, abstract, and full text (as necessary). Conflicts between reviewers were resolved through re-review and discussion. The overwhelming majority of conflicts regarding inclusion or exclusion related to the assignment of a specific reason for exclusion (i.e., there were often multiple reasons for exclusion of a given article but one needed to be assigned primacy), rather than disagreement over whether a particular article should be included or excluded.

Data Extraction

We extracted the following data from all of the included studies: study design; setting; patient characteristics; inclusion and exclusion criteria; detailed information about the dosing and administration of rFVIIa; numbers of patients eligible, enrolled, and lost to follow-up; details about outcome ascertainment; and information about “usual care” (because of differences over time and between regions/hospitals for the latter68).

Table 3 provides additional details on the important baseline characteristics and outcomes that were assessed for all studies and according to clinical indication. These were determined a priori through discussion with experts and review of the literature. At least two investigative team members, including one clinical and one non-clinical member, independently abstracted data onto pre-tested abstraction forms (Appendix E). Conflicts regarding data abstraction were resolved by re-review, discussion, and input from others, as necessary.

Table 3. Important baseline data and outcomes according to clinical indication.

Quality Assessment of Individual Studies

Criteria

We used predefined criteria to assess the quality of included studies. We generated these criteria by performing a review of the literature and the AHRQ Effective Health Care Program’s Methods Reference Guide for Effectiveness and Comparative Effectiveness Reviews (“Methods Guide,” available at: http://effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf) to identify articles on the study quality of RCTs and comparative observational studies. We found general agreement in the literature on key components of RCT study quality, but less agreement regarding comparative observational studies. Therefore, we felt that existing quality assessment tools were not complete in assessing key criteria for both RCTs and comparative observational studies and chose to consolidate the areas of criteria overlap, but leave distinct the criteria pertinent only to RCTs or observational studies, respectively. All criteria were culled from the existing literature of quality assessment tools or expert consensus statements on important quality criteria. For RCTs, the quality criteria were based on the Jadad score,69 studies of the methodologic quality of RCTs and its impact on treatment effect estimates,70–72 the CONSORT statement,73,74 and the Methods Guide.75,76 For observational studies, the quality criteria were selected as those most consistently cited by experts in a published systematic review of quality tools,77 a Health and Technology Assessment Report on the evaluation of non-randomized studies,78 the STROBE statement,79,80 and the Methods Guide.75,76 Table 4 indicates the quality domains and criteria we used to evaluate RCTs and comparative observational studies, respectively. Most quality criteria apply to both study types (six total: subject selection, comparability of groups, protections against bias in outcomes, follow-up, protections against bias in analyses, and conflict of interest), but three criteria were unique to either RCTs (methods of allocation) or observational studies (sample size and methods to characterize exposure). We gave certain criteria, indicated in bold in the table, the most weight in our qualitative evaluations, because our review of the literature indicated that the data and experts most agree on their importance to a determination of methodological quality. A study’s quality was not downgraded because of an identified conflict of interest (all of which were identified as manufacturer sponsorship or affiliation); rather, this information is discussed further in the methods below and was included in the results table on general characteristics of all included studies (Table 11).

Table 4. Quality domains and criteria for assessing RCTs and comparative observational studies.

Financial Support from the Manufacturer of rFVIIa

We evaluated the degree of financial support provided by the manufacturer, Novo Nordisk, according to the following schema: sponsorship of the study, or an author or statistician being a Novo Nordisk employee, was deemed to be substantial support and was labeled as “funding,” while other financial ties (e.g., being a member of a speakers bureau or receiving some measure of research funding from Novo Nordisk) were noted as “affiliation.”

Assessment

Using the above criteria, two assessors independently assigned a quality grade to each study after coming to a qualitative determination of its overall methodological quality. The assigned categorical grade could be one of three, as suggested by the Methods Guide:76 good, fair, or poor (Table 5). Disagreements were resolved by discussion, with accommodation made for involvement of a third reviewer, if necessary, but this was never required. This grading system attempted to assess the comparative quality of studies that share the same study design (e.g., two RCTs)—but not the comparative quality of studies of different types (e.g., an RCT versus an observational study). For example, an RCT assigned a grade of “fair” was not judged to have equal methodological quality to a comparative observational study assigned a grade of “fair.” Rather, both study design and study quality were considered when evaluating the overall validity of a study.

Table 5. Criteria for assigning quality grade to individual included studies.

Use of poor quality studies in the report. Using the above logic, all RCTs, including those of poor quality, were included in the evaluations of effectiveness for each clinical indication, whereas poor quality comparative observational studies were not reviewed in detail in the comparative effectiveness review but were used for qualitative sensitivity testing (by placing their findings in the context of those of the higher quality studies) and the harms analysis at the end of the report (Table 2).

Assessing the Strength of Evidence and Applicability for Each Key Question

Strength of Evidence

We applied the strength of evidence rating system recommended by the EPC working group on evidence grading.76 Specifically, two reviewers independently assessed the strength of evidence for the major outcomes in each of the Key Questions 2–4. To accomplish this, they first assigned individual scores to the four evidence domains defined further in Table 6: risk of bias, consistency, directness, and precision. Additional information on how the reviewers assessed the specific domain of “risk of bias” is included in Table 7: it was determined by both the type and aggregate quality of the studies on a given clinical outcome.

Table 6. The four domains of strength of evidence and their definition and scoring.

Table 7. Scoring the risk of bias for a given clinical outcome: determined by both the type and aggregate quality of studies.

Based on the individual scores the reviewers assigned to the evidence domains, they then assigned an overall “strength of evidence rating” (defined in Table 8) to each clinical outcome. The reviewers’ domain scores and overall strength of evidence ratings were compared, and disagreements were resolved by discussion, with accommodation made for involvement of a third reviewer (an expert on strength of evidence grading), if necessary; this was required in only one case. (See Appendix F for the strength of evidence evaluation form used by the reviewers.)

Table 8. Strength of evidence grading schema.

Applicability

Two independent assessors also evaluated the applicability to clinical practice of the total body of evidence within a given clinical indication in Key Questions 2–4. Disagreements were resolved by discussion, with accommodation made for involvement of a third reviewer, if necessary, but this was not required. Following the recommendations of the draft guidance document provided to EPCs,81 we used the PICOTS (population, intervention, comparator, outcome, timing, and setting) format to assess applicability. Table 9 describes the process and criteria we used for these assessments. On the basis of these criteria, we rated the applicability of an area of evidence as poor, fair, or good. Evidence for a given indication could only earn a “good” applicability rating when the applicability for each criterion within the PICOTS format was deemed to be good. The “fair” rating was the broadest category and was achieved as long as the evidence for an indication did not earn a “poor” in more than one PICOTS criterion. A “poor” summary applicability rating was assigned if the evidence for an indication was deemed to be “poor” in more than one PICOTS criterion.
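The summary rating rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic, not the form the assessors actually used: the function name `summary_applicability` and the dictionary layout are assumptions, and the sketch checks the “good” condition first so that an all-good profile is not also counted as “fair.”

```python
from typing import Dict

# The six PICOTS criteria named in the text.
PICOTS = ("population", "intervention", "comparator", "outcome", "timing", "setting")
VALID = ("good", "fair", "poor")


def summary_applicability(ratings: Dict[str, str]) -> str:
    """Combine per-criterion ratings into one summary rating (hypothetical helper)."""
    values = []
    for criterion in PICOTS:
        if criterion not in ratings:
            raise ValueError(f"missing rating for {criterion!r}")
        if ratings[criterion] not in VALID:
            raise ValueError(f"invalid rating {ratings[criterion]!r} for {criterion!r}")
        values.append(ratings[criterion])

    # "Good" requires every criterion to be rated good.
    if all(v == "good" for v in values):
        return "good"
    # "Poor" applies when more than one criterion is rated poor.
    if sum(v == "poor" for v in values) > 1:
        return "poor"
    # Everything else falls into the broad "fair" category.
    return "fair"
```

For example, a body of evidence rated good on five criteria and poor on one would come out as “fair” under this reading, while two poor criteria would yield “poor.”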

Table 9. PICOTS criteria for assessing the applicability of evidence in Key Questions 2–4.

Data Synthesis

Analysis of Premier Database on In-Hospital Use of rFVIIa

Data measures. We used SAS Version 9.1 (SAS Institute, Cary, NC) to analyze data from the Premier database on in-hospital use of rFVIIa. We classified hospitalizations into discrete, mutually exclusive indication categories based on the clinical information associated with each hospitalization, which included multiple diagnoses and procedures. For our sample of 12,644 hospitalizations, a total of 286,113 diagnosis and procedure codes were reported. We therefore constructed a descending hierarchy of ICD-9 codes to categorize each hospitalization (Table 10; also see Appendix C, Appendix Table 1, for a full listing of ICD-9 codes). This hierarchy started with the most relevant, most reliable, and most specific clinical diagnoses, followed successively by less relevant, less reliable, or less specific diagnoses. We also created diagnostic categories that corresponded to reported rFVIIa indications in the literature, including the five key indications identified for in-depth review in this report. The hierarchy was based on both primary and secondary ICD-9 diagnostic codes, as well as ICD-9 procedure codes. A hospitalization was assigned to a diagnostic category based on the ICD-9 code that placed it in the highest category within the descending hierarchy.

Table 10. Diagnostic hierarchy for analysis of Premier database.

Because of our focus on off-label use, our top priority diagnoses in this hierarchy were the FDA-approved indications of hemophilia A and B, followed by those unapproved indications that are similar to hemophilia or approved in other nations. If these diagnoses were noted, the hospitalization was classified into that category regardless of whether other prominent potential indications were noted during the same hospitalization. In turn, hospitalizations not classified as hemophilia and related conditions were categorized as brain trauma if any diagnosis indicated a non-iatrogenic cause of brain injury. Those cases not classified to this indication were then evaluated as to the presence of any diagnosis indicating a non-iatrogenic cause of injury, thus creating a category of trauma in the absence of head injury. This same process was used successively for the categories of intracranial hemorrhage, brain surgery, cardiovascular surgery, obstetrics, neonatal conditions, aortic aneurysm, prostate surgery, other vascular surgical procedures, liver transplantation, liver biopsy, variceal bleeding, other liver disease, other sources of gastrointestinal bleeding, other hematologic conditions, pulmonary conditions and procedures, cancer-associated use, all other surgical procedures, and, finally, other diagnoses not involving surgery. We further divided the cardiovascular surgery category into adult and pediatric populations. We also divided the “other hematologic conditions” category into two very different groups. We gave high priority to conditions that represent distinct and usually isolated defects in the clotting process, including other congenital factor deficiencies and Glanzmann’s thrombasthenia. We gave relatively low priority to less specific conditions that are less often isolated defects in clotting, but more likely the end product of other pathological conditions (particularly a variety of secondary thrombocytopenias).
Where feasible, we captured the proximal causes of these coagulation problems earlier in the hierarchy, as with traumatic bleeding causing consumptive coagulopathy or the disruption of clotting produced by liver disease. We performed several sensitivity analyses to determine the impact of hierarchy order on these categorizations by moving indications up or down in the hierarchy to determine whether this changed their reported frequency.
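The descending hierarchy amounts to a first-match rule: each hospitalization takes the highest-ranked category for which any of its codes qualifies. The Python sketch below illustrates that idea and the reordering used in the sensitivity analyses. It is not the SAS program used for the report. The code strings (for example `"X-TBI"`) are made-up placeholders rather than real ICD-9 codes (the actual listing is in Appendix C), only the first few categories are shown, and the names `HIERARCHY` and `classify` are assumptions.

```python
from typing import Iterable, List, Set, Tuple

# Abbreviated, hypothetical hierarchy: (category, placeholder codes), highest priority first.
HIERARCHY: List[Tuple[str, Set[str]]] = [
    ("hemophilia and related conditions", {"X-HEMO"}),
    ("brain trauma", {"X-TBI"}),
    ("body trauma", {"X-TRAUMA"}),
    ("intracranial hemorrhage", {"X-ICH"}),
    ("brain surgery", {"X-BRAINSURG"}),
]
FALLBACK = "other diagnoses not involving surgery"


def classify(codes: Iterable[str], hierarchy: List[Tuple[str, Set[str]]] = HIERARCHY) -> str:
    """Return the first (highest-priority) category matched by any reported code."""
    reported = set(codes)
    for category, qualifying_codes in hierarchy:
        if reported & qualifying_codes:
            return category
    return FALLBACK


# A hospitalization with both a brain-trauma code and a hemorrhage code
# lands in the higher-ranked category.
case = ["X-ICH", "X-TBI"]
print(classify(case))  # brain trauma

# Sensitivity check: move intracranial hemorrhage above brain trauma and reclassify.
reordered = [HIERARCHY[0], HIERARCHY[3], HIERARCHY[1], HIERARCHY[2], HIERARCHY[4]]
print(classify(case, reordered))  # intracranial hemorrhage
```

Comparing category counts under the original and reordered hierarchies is one simple way to see how much the reported frequencies depend on the ordering.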

Unit of analysis. The unit of analysis was any hospital “case” of rFVIIa use, defined as any application during a patient hospitalization. We favored the use of this case-based unit of analysis because of its advantages, particularly because it captures the medical decision-making component of care about whether to use or not use rFVIIa for a given patient. Alternative methods of analyzing rFVIIa use by dosing were also examined, including the number of times rFVIIa was dispensed by the inpatient pharmacy and the total dose of rFVIIa dispensed. However, we determined that these strategies of examining dosing had significant drawbacks, including: (1) possible discrepancies between dispensed rFVIIa and the amount actually administered to the patient, (2) lack of consistent hospital coding of rFVIIa dispensing (e.g., missing or variable reporting of units, such as milligrams dispensed versus vials dispensed), and (3) outlier cases. Examination of the dosing information on outlier cases indicates substantial variation in the dose of rFVIIa dispensed during individual hospitalizations, with some cases being dispensed a fraction of a 1.2 mg vial while others received more than a hundred vials. Individual cases with large aggregate dosages included both hemophilia and non-hemophilia cases. Analyses by dosing, rather than cases of use, could have different findings.

Lack of denominator. The Premier database does not provide information on patients with similar clinical indications for rFVIIa use but for whom the drug was not given, so we were unable to determine the overall denominator of potential rFVIIa usage (i.e., total number of patients eligible for use) for a given clinical indication.

Statistical analysis. Our statistical analysis focused on documenting annual trends in national estimates of in-hospital rFVIIa cases of use. We also analyzed and plotted aggregate rFVIIa use by quarter to characterize the most recent trends. To characterize patterns of use by indication, we produced a simple cross-tabulation of indication category by year. The data presented below combine several of the categories developed within the hierarchy that were not frequent, although we retained all five indication categories subjected to detailed systematic review in this report. This allowed us to gauge whether the volume of real-world use of rFVIIa for each of these indications warranted their selection by AHRQ for such in-depth examination. We reported characteristics of the population receiving rFVIIa, specifically age, gender, and in-hospital mortality rates, to allow for qualitative comparisons to the populations represented in the comparative studies. We also examined the hospital characteristics of teaching hospital status and regional location.

We employed statistical weights associated with each hospital by quarter. These weights allow for nationally representative projections of hospital activities. The weights are derived by Premier Inc. based on the relationship of the Premier hospital sample to the universe of non-federal, acute care hospitals. The statistical weights varied from around 10 in 2000 to around 5.5 in 2008 as a function of the increasing number of hospitals included in the Premier database. To weight the few non-hospitalized patient encounters, we used the corresponding weights for the same quarter and hospital.
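
To make the weighting step concrete, the following sketch projects national annual estimates by multiplying each hospital-quarter case count by its weight and summing by year. The record layout, hospital identifiers, and counts are hypothetical; only the approximate weight magnitudes (about 10 in 2000 and about 5.5 in 2008) come from the text above.

```python
from collections import defaultdict


def project_national_cases(records):
    """Return weighted annual case estimates.

    Each record is (year, quarter, hospital_id, n_cases, weight), a hypothetical
    layout in which the weight applies to that hospital in that quarter.
    """
    totals = defaultdict(float)
    for year, quarter, hospital_id, n_cases, weight in records:
        totals[year] += n_cases * weight
    return dict(totals)


# Hypothetical observed cases for two hospitals.
sample = [
    (2000, 1, "A", 2, 10.0),
    (2000, 2, "A", 1, 10.0),
    (2008, 1, "B", 4, 5.5),
]
print(project_national_cases(sample))
```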

Analysis of Comparative Studies

Issues of heterogeneity. We first addressed issues of heterogeneity at the level of the Key Questions. We determined that studies of trauma needed to be separated into those on body trauma and those on brain trauma because the challenges faced in managing these patients, while overlapping, are distinct enough to warrant separate evaluation. Similarly, we determined that studies of cardiac surgery needed to be separated into those in pediatric patients (generally infants requiring correction of congenital cardiac abnormalities) and those in adult patients (generally patients in the sixth to eighth decades of life with cardiac problems related to age-related degeneration or dysfunction and with very different underlying thromboembolic risks than infants). For the remaining indications, we discuss issues of heterogeneity between studies within the effectiveness review of the given indication.

Statistical analysis. We considered studies eligible for meta-analysis regardless of study type (RCT versus comparative observational) as long as they met the quality criteria of being good or fair, and as long as they had similar interventions and patient populations in terms of baseline clinical characteristics. We performed meta-analyses when there were sufficient studies to warrant meta-analytic evaluations. We defined sufficient studies (for a given indication) as a total of at least two studies of fair or better quality, including at least one study of good quality.
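
The sufficiency rule can be written as a short check. This is a sketch only: the quality labels are taken from the text, while the function name and the representation of each study as a single rating string are assumptions.

```python
def sufficient_for_meta_analysis(quality_ratings):
    """Apply the sufficiency rule for one indication.

    quality_ratings: list of "good", "fair", or "poor" ratings, one per study
    that already has comparable interventions and populations.
    Returns True when at least two studies are fair or better and at least
    one of them is good.
    """
    eligible = [rating for rating in quality_ratings if rating in ("good", "fair")]
    return len(eligible) >= 2 and "good" in eligible


print(sufficient_for_meta_analysis(["good", "fair", "poor"]))
print(sufficient_for_meta_analysis(["fair", "fair"]))
```

Under this rule, two fair-quality studies alone would not qualify, nor would a single good-quality study accompanied only by poor-quality ones.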

Intervention and control arms were compared for continuous variables (e.g., hematoma volume for ICH patients) using a random effects model for the standardized mean difference effect size. Dichotomous outcomes (e.g., mortality and thromboembolic events) were compared using a random effects model with two different effect size metrics, the risk difference and the arcsine standardized mean difference,82 which provided a sensitivity analysis for the use of different metrics. The former, the risk difference, was chosen as the measure of effect size for the report because it is easy to interpret and because the risks for different outcomes were similar across studies, such that the disadvantages of using the risk difference to estimate effect size (e.g., as compared to other common metrics such as the odds ratio) were minimized. The arcsine metric is a less well known approach but has the advantage of generating less biased estimates of the difference between treatment and control arms for proportions and dichotomous responses when there are sparse data or multiple outcomes with zero observations (e.g., zero deaths).82 It is calculated as:

$$\text{arcsine difference}(p_T, p_C) = \arcsin\left(\sqrt{p_T}\right) - \arcsin\left(\sqrt{p_C}\right)$$
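
As a rough illustration of the two effect size metrics for dichotomous outcomes, the sketch below computes the risk difference and the arcsine difference from event counts. It assumes the arcsine metric is taken on the square roots of the proportions, which is the conventional form of this transformation; the function names and the counts are hypothetical and are not drawn from the included studies.

```python
import math


def risk_difference(events_t, n_t, events_c, n_c):
    """Difference in event proportions, treatment minus control."""
    return events_t / n_t - events_c / n_c


def arcsine_difference(events_t, n_t, events_c, n_c):
    """Difference of arcsine-square-root transformed proportions (assumed form).

    The transform is defined at a proportion of zero, so arms with no events
    need no continuity correction.
    """
    p_t = events_t / n_t
    p_c = events_c / n_c
    return math.asin(math.sqrt(p_t)) - math.asin(math.sqrt(p_c))


# Hypothetical trial: 5 of 50 events with treatment, 0 of 50 with control.
print(risk_difference(5, 50, 0, 50))
print(arcsine_difference(5, 50, 0, 50))
```

Because the transform remains finite when an arm has zero events, the metric can be computed for sparse outcomes where ratio measures such as the odds ratio would require an adjustment.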

We performed formal assessments of heterogeneity using the Q statistic (and the I² statistic as appropriate) and performed all meta-analytic calculations using the R statistical software (Version 2.10.0, “meta” and “rmeta” packages).
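
For readers unfamiliar with these statistics, the following sketch shows how Q, I², and a random effects summary could be computed from study effect sizes and their variances. It assumes the DerSimonian-Laird estimator of between-study variance, a common choice that the report does not explicitly name, and it is not the code of the R packages; the function names and input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q and the I-squared statistic (as a percentage)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def random_effects_summary(effects, variances):
    """Return (pooled effect, standard error, tau^2) under DerSimonian-Laird (assumed)."""
    q, _ = heterogeneity(effects, variances)
    weights = [1.0 / v for v in variances]
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study variance.
    star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(star, effects)) / sum(star)
    standard_error = (1.0 / sum(star)) ** 0.5
    return pooled, standard_error, tau2


# Hypothetical effect sizes and variances for two studies.
q, i2 = heterogeneity([0.0, 2.0], [0.5, 0.5])
pooled, se, tau2 = random_effects_summary([0.0, 2.0], [0.5, 0.5])
print(q, i2, pooled, se, tau2)
```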

For the intracranial hemorrhage indication, there were special statistical considerations, and we made several a priori decisions regarding the statistical analyses to be performed. First, because there were indications in the literature of a possible dose-response relationship between rFVIIa and certain outcomes (e.g., thromboembolic events) and multiple doses of rFVIIa were analyzed in each RCT, we chose to analyze the data according to low, medium, and high dose rFVIIa groups, defined as less than or equal to 40 μg/kg, greater than 40 but less than 120 μg/kg, and at least 120 μg/kg, respectively. However, in all of the RCTs, the different treatment dosage levels were compared to a common control. In addition, some studies did not include all of the dosage levels. Because of these complexities, we applied methods developed by Olkin et al82 for analyzing this kind of data when generating the summary effect sizes. Second, because there were suggestions in the literature of a possible association between rFVIIa and arterial thromboembolic events but not venous events, and both types of data were available to us from the ICH RCTs, we chose to analyze arterial and venous thromboembolic events separately for this indication. In contrast, we evaluated all thromboembolic events together for the remainder of the indications. Finally, while the summary effect sizes for the intracranial hemorrhage analyses are accurate, their graphical representation using forest plots is complicated by the use of a common control for the different treatment dosages, so the plots should be considered an aid to interpretation rather than a strict representation of the underlying metrics employed.
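
The dose grouping used for the intracranial hemorrhage analyses can be expressed as a simple classification. The cut points come from the definitions above; the function name and the example doses are illustrative.

```python
def dose_group(dose_ug_per_kg):
    """Classify an rFVIIa dose (in micrograms per kilogram) as low, medium, or high.

    low:    less than or equal to 40
    medium: greater than 40 but less than 120
    high:   at least 120
    """
    if dose_ug_per_kg <= 40:
        return "low"
    if dose_ug_per_kg < 120:
        return "medium"
    return "high"


# Illustrative doses showing how the boundary values are assigned.
for dose in (40, 80, 120):
    print(dose, dose_group(dose))
```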

Analysis of Non-Comparative Studies for Data on Harm

To evaluate the evidence of harm of rFVIIa in the non-comparative study literature included in our review (registries and non-comparative cohorts with 15 or more patients), we report the unadjusted summary event rates for mortality and thromboembolic events from the non-comparative studies, the intervention arms of the comparative studies, and the Premier database.
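
One simple way to form an unadjusted summary event rate is to divide the total number of events by the total number of patients across the contributing sources. The sketch below assumes this crude pooling, which the report does not spell out; the function name and the counts are hypothetical.

```python
def crude_event_rate(studies):
    """Return total events divided by total patients (assumed crude pooling).

    studies: list of (events, patients) pairs, one per study or study arm.
    """
    total_events = sum(events for events, _ in studies)
    total_patients = sum(patients for _, patients in studies)
    return total_events / total_patients


# Hypothetical counts: 2 events among 20 patients and 3 among 30.
print(crude_event_rate([(2, 20), (3, 30)]))
```

Such a rate carries no adjustment for differences in patient mix between sources, so it supports only qualitative comparison.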

Peer Review and Public Commentary

A draft of this Evidence Report was reviewed by experts in hematology, trauma surgery, liver transplantation, cardiac surgery, and prostatectomy (Appendix D). These experts were either directly invited by the EPC or offered comments through a public review process. The draft report was also reviewed by staff from the Scientific Resource Center at Oregon Health & Science University and by AHRQ staff. Revisions to the draft were made, where appropriate, based on the reviewer comments. However, the findings and conclusions are those of the authors, who are responsible for the contents of the report.
