Gierisch JM, Nieuwsma JA, Bradford DW, et al. Interventions To Improve Cardiovascular Risk Factors in People With Serious Mental Illness [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Apr. (Comparative Effectiveness Reviews, No. 105.)

This publication is provided for historical reference only and the information may be out of date.

Methods

The methods for this comparative effectiveness review follow those suggested in the AHRQ “Methods Guide for Effectiveness and Comparative Effectiveness Reviews” (available at www.effectivehealthcare.ahrq.gov/methodsguide.cfm; hereafter referred to as the Methods Guide).40 The main sections in this chapter reflect the elements of the protocol established for the systematic review; certain methods map to the PRISMA checklist.41

Topic Refinement and Review Protocol

During the topic refinement stage, we solicited input from Key Informants representing clinicians (psychiatry, psychology, mental health education and treatment), patient advocates, scientific experts, and payers to help define the Key Questions (KQs). The KQs were then posted for a 4-week public comment period, and the comments received were considered in the development of the research protocol. We next convened a Technical Expert Panel (TEP) comprising clinical, content, and methodological experts to provide input in defining populations, interventions, comparisons, and outcomes, as well as in identifying particular studies or databases to search. The Key Informants and members of the TEP were required to disclose any financial conflicts of interest greater than $10,000 and any other relevant business or professional conflicts of interest; any potential conflicts were balanced or mitigated. Key Informants and members of the TEP did not perform analysis of any kind or contribute to the writing of the report. Members of the TEP were invited to provide feedback on an initial draft of the review protocol, which was then refined based on their input, reviewed by AHRQ, and posted for public access at the AHRQ Effective Health Care Web site.42

Literature Search Strategy

Sources Searched

To identify the relevant published literature, we searched MEDLINE®, Embase®, PsycINFO®, and the Cochrane Database of Systematic Reviews. Where possible, we used existing validated search filters (such as the Clinical Queries Filters in PubMed®). An experienced search librarian guided all searches. Exact search strings and dates are included in Appendix A. We supplemented the electronic searches with a manual search of citations from a set of key primary and review articles.43–82 The reference lists for these articles were manually reviewed and cross-referenced against our library of search results, and additional potentially relevant citations were retrieved for screening. All citations were imported into an electronic database (EndNote® X4; Thomson Reuters, Philadelphia, PA).

We used two approaches to identify relevant gray literature: (1) a request for scientific information packets submitted to drug manufacturers and (2) a search of trial records listed in ClinicalTrials.gov (see Appendix A for search date and exact search terms). The search of ClinicalTrials.gov was also used as a mechanism to ascertain publication bias by identifying completed but unpublished studies. During peer and public review of the draft report, we updated the database searches and included any eligible studies identified either through that search or through suggestions from peer and public reviewers.

Inclusion and Exclusion Criteria

The PICOTS criteria used to screen articles for inclusion/exclusion at both the title-and-abstract and full-text screening stages are detailed in Table 2. Given the large number of interventions considered, the higher risk of bias in observational studies, and the complexity of identifying relevant observational studies, we restricted our review to randomized controlled trials (RCTs).

Table 2. Inclusion and exclusion criteria.

Study Selection

Using the prespecified inclusion and exclusion criteria described in Table 2, two investigators independently reviewed titles and abstracts for potential relevance to the KQs. Articles included by either reviewer underwent full-text screening. At the full-text screening stage, two investigators independently reviewed each article to determine if it met eligibility criteria, and indicated a decision to “include” or “exclude” the article for data abstraction. When the paired reviewers arrived at different decisions about whether to include or exclude an article, or about the reason for exclusion, they reconciled the difference through review and discussion, or through a third-party arbitrator if needed. Articles meeting our eligibility criteria were included for data abstraction. Relevant review articles and meta-analyses were flagged for manual searching of references and cross-referencing against the library of citations identified through electronic database searching.

For citations retrieved by searching the gray literature, the above-described procedures were modified such that a single screener initially reviewed all search results; final eligibility of citations for data abstraction was determined by duplicate screening review. All screening decisions were made and tracked in a DistillerSR database (Evidence Partners Inc, Manotick, ON, Canada).

Data Extraction

The investigative team created data abstraction forms and evidence table templates for abstracting data for the KQs. Based on clinical and methodological expertise, a pair of investigators was assigned to abstract data from each eligible article. One investigator abstracted the data, and the second reviewed the article and the accompanying completed abstraction form to check for accuracy and completeness. Quality ratings and efficacy–effectiveness ratings (see below) were completed independently by two investigators. Disagreements were resolved by consensus, or by obtaining a third reviewer’s opinion if consensus could not be reached. To aid in both reproducibility and standardization of data collection, researchers received data abstraction instructions directly on each form created specifically for this project within the DistillerSR database.

We designed the data abstraction forms for this project to collect the data required to evaluate the specified eligibility criteria for inclusion in this review, as well as demographic and other data needed for determining outcomes. We gave particular attention to describing the details of the interventions (e.g., pharmacotherapy used, intensity of behavioral interventions), patient characteristics (e.g., SMI diagnosis), and comparators that may be related to outcomes. Data necessary for assessing quality and applicability, as described in the Methods Guide,40 were also abstracted. When critical data were missing, we contacted study authors. Of the seven authors contacted, five replied with the requested information.

We adapted a previously published efficacy–effectiveness instrument (Appendix B) to assess eight dimensions:83 (1) setting/practitioner expertise, (2) restrictiveness of eligibility criteria, (3) health outcomes, (4) flexibility of the intervention and study duration, (5) assessment of adverse events, (6) adequate sample size for important health outcomes, (7) intention-to-treat approach to analyses, and (8) identity of the comparison intervention. We developed definitions for each dimension that were specific to the literature reviewed. We rated each of the eight dimensions as effectiveness (score=1) or efficacy (score=0); scores on each dimension were summed and could range from 0 to 8. Studies were categorized as efficacy (0–2), mixed efficacy–effectiveness (3–5), or effectiveness (6–8) based on summed scores. Simple agreement between investigator pairs was 78 percent, and the unweighted kappa was 0.57, indicating moderate agreement beyond chance for efficacy–effectiveness categories.
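
To make the scoring concrete, the following minimal Python sketch (our illustration, not part of the published instrument; the dimension keys and example ratings are hypothetical) sums the eight binary ratings and maps the total to a category:

```python
# Minimal sketch of the efficacy-effectiveness scoring described above.
# Each dimension is rated 1 (effectiveness) or 0 (efficacy); the summed
# score (0-8) maps to one of three categories.

DIMENSIONS = [
    "setting_practitioner_expertise",     # (1)
    "eligibility_restrictiveness",        # (2)
    "health_outcomes",                    # (3)
    "intervention_flexibility_duration",  # (4)
    "adverse_event_assessment",           # (5)
    "adequate_sample_size",               # (6)
    "intention_to_treat",                 # (7)
    "comparison_intervention",            # (8)
]

def categorize(ratings):
    """Sum the eight 0/1 ratings and map the total to a category."""
    score = sum(ratings[d] for d in DIMENSIONS)
    if score <= 2:
        category = "efficacy"
    elif score <= 5:
        category = "mixed efficacy-effectiveness"
    else:
        category = "effectiveness"
    return score, category

# Hypothetical study rated 1 (effectiveness) on five of eight dimensions:
ratings = {d: (1 if i < 5 else 0) for i, d in enumerate(DIMENSIONS)}
print(categorize(ratings))  # (5, 'mixed efficacy-effectiveness')
```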

Before they were used, abstraction form templates were pilot-tested with a sample of included articles to ensure that all relevant data elements were captured and that there was consistency/reproducibility between abstractors. Forms were revised as necessary before full abstraction of all included articles. Some outcomes were reported only in figures; in these instances, we used the Engauge Digitizer software (digitizer.sourceforge.net) to convert graphical displays to numerical data. Appendix C lists the elements included in the data abstraction forms.

Quality Assessment of Individual Studies

We evaluated the quality of individual studies using the key criteria for RCTs described in the Methods Guide.40 Criteria of interest included methods of randomization and allocation concealment, similarity of groups at baseline, extent to which outcomes were described, blinding of subjects and providers, blinded assessment of the outcome(s), intention-to-treat analysis, differential loss to followup between the compared groups or overall high loss to followup, and conflicts of interest.

To indicate the summary judgment of the quality of the individual studies, we used the summary ratings of good, fair, or poor based on their adherence to well-accepted standard methodologies and adequate reporting (Table 3). For each study, two investigators independently assigned a summary quality rating; disagreements were resolved by consensus or by discussion with a third investigator if agreement could not be reached. Quality ratings were assigned separately for “hard” outcomes (e.g., mortality, laboratory measurements) and all other outcomes (e.g., health-related quality of life); thus, a given study may have been categorized differently for two individual outcomes reported within that study.

Table 3. Definitions of overall quality ratings.

Data Synthesis

We began by summarizing key features of the included studies for each KQ. To the degree that data were available, we abstracted information on study design; patient characteristics; clinical settings; interventions; and intermediate, final, and adverse effects outcomes. We then determined the feasibility of completing a quantitative synthesis (i.e., meta-analysis). Feasibility depended on the volume of relevant literature (≥3 studies), conceptual homogeneity of the studies, and completeness of the reporting of results. When a meta-analysis was appropriate, we used random-effects models to synthesize the available evidence quantitatively; other outcomes were analyzed qualitatively. The outcomes amenable to meta-analysis were continuous; we therefore summarized them by a weighted mean difference when the same scale (e.g., weight) was used across studies and by a standardized mean difference when the scales (e.g., health-related quality of life) differed. We standardized the presentation of results such that a negative value indicates a greater intervention effect. When needed, we converted reported outcomes to a common unit (e.g., cholesterol from mmol/L to mg/dL). We present summary estimates, standard errors, and confidence intervals in our data synthesis.
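
For readers who want the mechanics behind the pooling, the sketch below is a minimal Python implementation of standard DerSimonian-Laird random-effects pooling, assuming a normal approximation for the 95 percent confidence interval; the effect sizes and variances are hypothetical, and the report itself used Comprehensive Meta-Analysis software (noted at the end of this section).

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling of study effect sizes.

    effects: per-study mean differences (negative favors the intervention,
    matching the sign convention used in this review).
    variances: per-study sampling variances.
    """
    w = [1 / v for v in variances]                     # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum_w
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                      # between-study variance
    w_re = [1 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical weight-change data (kg) from three trials; negative = loss.
print(random_effects_pool([-2.5, -1.8, -3.1], [0.4, 0.6, 0.5]))
```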

We organized our analyses by KQ. When a single study reported outcomes relevant to multiple KQs, it was included in the analyses for each question. For example, a study evaluating a weight-loss intervention that specified weight as the primary outcome, but that also reported effects on glucose and lipid parameters, was described under each relevant KQ. When a study was designed to intervene on more than one CVD risk factor (e.g., metabolic syndrome), it was summarized under KQ 4. We specified, a priori, weight control as measured by change in kilograms (or pounds); hemoglobin A1c (HbA1c) as the preferred measure of glucose control, since it reflects average glucose values over a 3-month interval; and total and LDL cholesterol as the measures of lipid control. For adverse effects, we report significant worsening of psychiatric status and discontinuations due to adverse effects. Interventions were categorized as behavioral, pharmacological, peer or family support, or multicondition (i.e., targeting more than one condition, such as smoking cessation and weight loss). Drug interventions were classified as psychotropics, neurologics, metformin, antihistamines, nutritionals (i.e., carnitine), or switching between antipsychotic medications.
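
One concrete detail from the paragraph above is the unit conversion: total and LDL cholesterol convert from mmol/L to mg/dL via the molar mass of cholesterol (roughly 386.65 g/mol), i.e., a factor of 38.67. A short Python helper (our illustration, not from the report) makes the arithmetic explicit:

```python
# 1 mmol/L of cholesterol = 38.67 mg/dL (molar mass ~386.65 g/mol).
CHOLESTEROL_MMOL_TO_MGDL = 38.67

def cholesterol_to_mgdl(mmol_per_l):
    """Convert a cholesterol concentration from mmol/L to mg/dL."""
    return mmol_per_l * CHOLESTEROL_MMOL_TO_MGDL

print(cholesterol_to_mgdl(5.2))  # about 201 mg/dL
```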

We tested for heterogeneity using graphical displays and test statistics (Cochran's Q), while recognizing that the ability of statistical methods to detect heterogeneity may be limited.84 The I² statistic describes the percentage of total variation across studies that is due to heterogeneity rather than to chance. Heterogeneity was categorized as low, moderate, or high based on I² values of 25 percent, 50 percent, and 75 percent, respectively.84 When there were sufficient studies, we explored heterogeneity in study effects by using subgroup analyses. When there were at least 10 studies, we assessed for publication bias using funnel plots and test statistics.85 All analyses were conducted using Comprehensive Meta-Analysis software (Version 2; Biostat, Englewood, NJ).
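
For illustration, the following Python sketch (ours, with hypothetical data) computes Cochran's Q and the I² statistic and assigns a heterogeneity category; the bin edges are one reading of the 25/50/75 percent benchmarks cited above, not a rule stated in the report.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    return sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))

def i_squared(q, df):
    """I^2: percent of total variation due to heterogeneity rather than
    chance; truncated at zero when Q <= df."""
    return 0.0 if q <= df else 100.0 * (q - df) / q

def heterogeneity_category(i2):
    """Map I^2 to low/moderate/high (one reading of the benchmarks)."""
    if i2 >= 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    return "low"

# Hypothetical mean differences and variances from three trials:
effects, variances = [-2.5, -0.5, -3.1], [0.4, 0.6, 0.5]
q = cochran_q(effects, variances)
i2 = i_squared(q, len(effects) - 1)
print(round(i2, 1), heterogeneity_category(i2))  # about 69.9 -> 'moderate'
```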

Strength of the Body of Evidence

The strength of evidence for each KQ and outcome was assessed using the approach described in the Methods Guide.40,86 In brief, the approach requires assessment of four domains: risk of bias, consistency, directness, and precision (Table 4).

Table 4. Strength of evidence required domains.

Additional domains were used when appropriate: coherence and publication bias. These domains were considered qualitatively, and a summary rating of high, moderate, or low strength of evidence was assigned after discussion by two reviewers. In some cases, a high, moderate, or low rating was impossible or imprudent to assign; for example, when no evidence was available or when the evidence on an outcome was too weak, sparse, or inconsistent to permit any conclusion. In these situations, a grade of insufficient was assigned. The four-level rating scale is defined as follows:

  • High—High confidence that the evidence reflects the true effect. Further research is very unlikely to change our confidence in the estimate of effect.
  • Moderate—Moderate confidence that the evidence reflects the true effect. Further research may change our confidence in the estimate of effect and may change the estimate.
  • Low—Low confidence that the evidence reflects the true effect. Further research is likely to change our confidence in the estimate of effect and is likely to change the estimate.
  • Insufficient—Evidence either is unavailable or does not permit estimation of an effect.

Applicability

We assessed applicability across our KQs using the method described in the Methods Guide.40,87 In brief, this method uses the PICOTS format to organize information relevant to applicability. The most important issue with respect to applicability is whether outcomes differ across studies that recruit different populations (e.g., age groups, exclusions for comorbidities) or that use different methods to implement the interventions of interest; that is, the important characteristics are those that affect baseline (control-group) rates of events, intervention-group rates of events, or both. A checklist guided the assessment of applicability (Appendix C). We used these data to evaluate applicability to clinical practice, paying special attention to study eligibility criteria, demographic features of the enrolled population in comparison with the target population, characteristics of the intervention used in comparison with care models currently in use, and the clinical relevance and timing of the outcome measures. We summarized issues of applicability qualitatively.

Peer Review and Public Commentary

The peer review process is our principal external quality-monitoring device. Nominations for peer reviewers were solicited from several sources, including the TEP and interested Federal agencies. Experts in psychiatry, mental illness, chronic medical conditions, systematic review methodology, pharmacoepidemiology of SMI, public health, and integration of mental health and primary care, along with individuals representing stakeholder and user communities, were invited to provide external peer review of this draft report; AHRQ and an associate editor also provided comments. The draft report was posted on AHRQ’s Web site for public comment for 4 weeks, from July 19, 2012, to August 17, 2012. We have addressed reviewer comments, revising the report as appropriate, and have documented our responses in a disposition of comments report available on the AHRQ Web site. A list of peer reviewers is given in the preface of this report.
