
Jadad AR, Boyle M, Cunningham C, et al. Treatment of Attention-Deficit/Hyperactivity Disorder. Rockville (MD): Agency for Healthcare Research and Quality (US); 1999 Nov. (Evidence Reports/Technology Assessments, No. 11.)

This publication is provided for historical reference only and the information may be out of date.


2. Methodology

Research Questions and Scope of Work

A multidisciplinary research team was assembled, with participation of members of the nominating organizations (the subcommittee of the AAP on ADHD and the Deputy Medical Director of the APA), consumer groups, local experts, and research staff (Appendix B). The local experts and research staff constituted a "local research team." All others were regarded as partners. To identify the research questions for this Task Order, the group engaged in multiple consultations and considered the findings of a systematic review of published systematic reviews and meta-analyses (Appendix C). This process led to the identification of the following questions to be addressed by the evidence report:

  • What is the evidence from comparative studies on the effectiveness and safety, both short and long term, of pharmacological and nonpharmacological interventions for ADHD in children and adults?
  • Are combined interventions more effective than individual interventions?

An extensive period of time at the outset of the report process was spent consulting with members of the nominating partner groups, AHRQ, and the research team in order to refine the general questions. To answer these questions and avoid duplication of work, make efficient use of the resources available, and ensure maximum added clinical value, the scope of the evidence report focused on the following seven categories of research studies:

  • Studies with drug-to-drug comparisons of pharmacological interventions.
  • Placebo-controlled studies evaluating the effect of tricyclic antidepressants.
  • Studies comparing pharmacological with nonpharmacological interventions (drug vs. nondrug studies).
  • Studies evaluating the effect of long-term therapies (>12 weeks).
  • Studies evaluating therapies for ADHD in adults (>18 years of age).
  • Studies evaluating therapies given in combination.
  • Studies evaluating adverse effects of pharmacological interventions.

The above categories are described in detail below. A separate category of studies comparing the short-term effect (<3 months) of stimulants with placebo was not included in the scope of work for this Task Order. Individual studies comparing stimulants with placebo were included in this report only if they met the inclusion criteria for any of the other categories. The main reason behind this decision was that another group, at the University of British Columbia, had already been commissioned to cover this issue. Most of the members of the research team considered that even if an additional review were conducted, the conclusions would not differ from those obtained by the researchers at the University of British Columbia, given that most studies to date have shown consistently that stimulant medication improves core symptoms, at least in the short term (Kavale, 1982; Ottenbacher and Cooper, 1983; Thurber and Walker, 1983). Therefore, it was concluded that a new systematic review on this topic would have added little to the current state of knowledge and would have consumed all the resources available to answer other questions. Instead, the team decided to focus attention on studies comparing the long-term effects (>12 weeks) of stimulants with placebo and to complement the findings with those of previous systematic reviews and the concurrent work conducted at the University of British Columbia. In summary, focusing the scope of work on the seven categories of studies described above was the strategy used by the research team to ensure maximum added value, avoid duplication of work, and present the unique contributions of this Task Order to current knowledge on the treatment of ADHD in children and adults.

Inclusion and Exclusion Criteria

Citations of individual studies were regarded as potentially eligible and selected for further evaluation if they met the following generic criteria:

  • They focused on the treatment of ADHD in humans.
  • They were published, in any language, in peer-reviewed journals as full reports.

Hard copies were obtained for all potentially eligible studies. Each was assessed independently by two members of the research team, who then decided, by consensus, whether to include the study in one or more of the categories covered in the evidence report. Studies that included conditions other than ADHD were included only if separate analyses for patients with ADHD were provided.

With the exception of the systematic review of adverse effects, the focus was placed on evidence provided by randomized controlled trials (RCTs), the simplest and most powerful research design for evaluating the efficacy and effectiveness of interventions. The specific inclusion criteria for each of the categories listed below were agreed to by all the members of the research team.

Drug-to-Drug Comparisons

A study was included in this category if it was an RCT that met all the generic eligibility criteria of this Task Order and if it included at least one of the following head-to-head comparisons:

  • Stimulants (methylphenidate [MPH], dextroamphetamine [DEX], or pemoline) vs. stimulants. Trials comparing the same drug (e.g., MPH vs. MPH) were included if different formulations (e.g., sustained-release vs. regular) or different enantiomers (e.g., l-MPH vs. d-MPH) were compared.
  • Stimulants (as above) vs. tricyclic antidepressants (desipramine, imipramine, or amitriptyline).
  • Stimulants (as above) vs. clonidine, bupropion, or selective serotonin-reuptake inhibitors (fluoxetine or paroxetine).

Tricyclic Antidepressants vs. Placebo

In this category, a study was included if it was an RCT that met all the generic eligibility criteria of this Task Order and if it included a comparison of placebo with amitriptyline, imipramine, or desipramine.

Drug vs. Nondrug Studies

In this category, a study was included if it was an RCT that met all the generic eligibility criteria of this Task Order and if:

  • One of the study arms included only a stimulant drug (MPH, DEX, or pemoline).
  • One or more of the control arms included other modes of intervention such as behavior modification, dietary interventions, or other psychosocial intervention.

This category did not include comparisons of nondrug interventions against placebo or against other nondrug interventions.

Combination Therapies

In this category, a study was included if it was an RCT that met all the generic eligibility criteria of this Task Order and if:

  • One of the study arms included two or more interventions given in combination.
  • One of the study arms included a stimulant (MPH, DEX, or pemoline).

This section aimed to answer the second primary question of this evidence report: Are combined interventions more effective than therapy with stimulant medication alone?

Long-Term Therapy

ADHD is a persistent disorder, and many of its adverse consequences such as antisocial disorder and substance abuse do not arise until many years after the condition is first detected and treated. Consequently, treatments have to be of sufficient intensity and duration to have an impact on these adverse outcomes and followup has to be of sufficient duration to be able to detect several of the most important treatment outcomes. Few long-term followup studies of the effectiveness of treatment for ADHD have been conducted even in the area of medication trials. To be able to compare the differential impact of short- and long-term medication trials, an arbitrary distinction was made between studies that provided treatment for fewer than or more than 12 weeks. This criterion permitted identification of a sufficient number of "long-term" studies from which to draw preliminary conclusions. This distinction does not, of course, imply that 12 weeks of treatment defines optimal extended treatment.

Against this background, a study was included in the review if it met all the generic eligibility criteria of this Task Order and if treatments were evaluated under randomized conditions for 12 weeks or more in all the study arms. Studies were excluded if the treatments were given for fewer than 12 weeks, even if outcome measurements were obtained more than 12 weeks after randomization.

Treatment of ADHD in Adults

In this category, a study was included if it met all the generic eligibility criteria of this Task Order and if treatments were evaluated in patients older than 18 years of age.

Adverse Effects

In this category, a study was included if it was an RCT that met all the generic eligibility criteria of this Task Order and if it provided data on at least one of the adverse effects of interest in all the study arms. The adverse effects of interest were the following:

  • For stimulants: changes in appetite, effects on growth (in terms of both height and weight), somatic effects (headaches, abdominal pain, sleep dysfunction), mood changes (crying, irritability, sadness/depression, withdrawal), motor tics, and drug addiction.
  • For pemoline only: liver toxicity.
  • For tricyclic antidepressants: cardiac arrhythmias.

Non-RCT studies were included if they met all of the other inclusion criteria and if they evaluated adverse effects associated with treatment for more than 12 weeks and included more than 10 patients. RCTs are usually of insufficient duration to detect rare adverse events or those that take long periods of time to become apparent (Levine, Walter, Lee et al., 1994; Sackett, Richardson, Rosenberg et al., 1997).
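To show how two of these category rules translate into concrete screening decisions, the following minimal sketch restates them in code. The function and field names (is_rct, reports_adverse_effect_all_arms, treatment_weeks, n_patients) are hypothetical illustrations, not items from the data extraction forms, and the generic eligibility criteria are assumed to have been satisfied already.

```python
def eligible_adverse_effects_category(is_rct, reports_adverse_effect_all_arms,
                                      treatment_weeks, n_patients):
    """Sketch of the adverse-effects rule: RCTs qualify if they report at least
    one adverse effect of interest in all study arms; non-randomized studies
    qualify only if treatment exceeded 12 weeks and more than 10 patients
    were included."""
    if not reports_adverse_effect_all_arms:
        return False
    if is_rct:
        return True
    return treatment_weeks > 12 and n_patients > 10


def eligible_long_term_category(weeks_of_randomized_treatment_per_arm):
    """Sketch of the long-term therapy rule: treatments evaluated under
    randomized conditions for 12 weeks or more in every study arm."""
    return all(weeks >= 12 for weeks in weeks_of_randomized_treatment_per_arm)
```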

Literature Search

Citations of potentially eligible studies were identified through a systematic search of:

  • MEDLINE (from 1966), CINAHL (from 1982), HEALTHStar (from 1975), PsycINFO (from 1984), and EMBASE (from 1984), using the search strategy described in Appendix D. All databases were searched from the date of their release to November 1997. The strategy was designed for searching MEDLINE and was modified to meet the specific features of CINAHL, EMBASE, and PsycINFO.
  • The Cochrane Library (issue 4, 1997).
  • The reference lists of all eligible articles identified in any of the above sources.
  • Web sites of organizations funding research on the treatment of ADHD.
  • Files of members of the research team and partner organizations.

Data Extraction

Data extraction forms were specially developed and tested for this project (Appendix E). After consultation with all the members of the local research team, the partners, and the TOO, the forms were approved for content. Two reviewers extracted data independently from each of the full reports. Any differences were resolved by consensus and by referring to the information in the original report. Any differences that could not be resolved by the two reviewers who extracted the data were resolved by the Task Order Leader or by a member of the local research team designated by him.

The original reports were not masked because an empirical methodological study showed that masking was time-consuming and did not have an important impact on the results of systematic reviews (Berlin, 1997).

The information extracted addressed 41 different aspects (Appendix E, pages 1 to 5) of the studies, which were selected by the team members a priori. Of these, 25 elements were regarded as essential to judge the validity of each of the studies. The selection of five of these elements (questions 37 to 41 in Appendix E) was supported by empirical methodological evidence from studies that showed a direct relationship between these elements and the likelihood of bias in RCTs. The remaining 20 elements were selected by consensus among all the research team members as clinically important but are not based on empirical methodological research.

Elements Extracted From the Studies That Are Supported by Research Evidence on Bias

These elements included questions 37 to 41 in Appendix E. Questions 37 to 39 cover all the items of the only validated scale to assess the methodological quality of RCTs (Jadad, Moore, Carroll et al., 1996) (Appendix F). This scale assesses whether the studies describe randomization, double blinding, and withdrawals. It produces a minimum score of 0 points and a maximum score of 5 points; the higher the score, the better the methodological quality of the RCT (Jadad, Moore, Carroll et al., 1996). Studies that are given scores of <2 points, including RCTs in mental health, have been shown to exaggerate the estimates of the effects of interventions, on average, by more than 30 percent (Moher, Pham, Jones et al., 1998). Under this scale, even trials that are not double blind, as was the case for some trials in this review, can still be awarded 3 points if the reports include a description of appropriate methods to generate the randomization sequence (2 points) and a detailed account of withdrawals and dropouts (1 point).
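As a procedural summary, the scoring logic of the scale can be sketched as follows. This is a minimal illustration that assumes the standard item structure of the Jadad scale; it is not the instrument reproduced in Appendix F.

```python
def jadad_score(randomization_described, randomization_method,
                double_blinding_described, blinding_method,
                withdrawals_described):
    """Sketch of the 0-5 point scale: one point each for describing
    randomization, double blinding, and withdrawals/dropouts, plus or minus
    one point depending on whether the reported methods are appropriate."""
    score = 0
    if randomization_described:
        score += 1
        if randomization_method == "appropriate":
            score += 1
        elif randomization_method == "inappropriate":
            score -= 1
    if double_blinding_described:
        score += 1
        if blinding_method == "appropriate":
            score += 1
        elif blinding_method == "inappropriate":
            score -= 1
    if withdrawals_described:
        score += 1
    return score

# Example from the text: an open (non-blinded) trial that describes an appropriate
# randomization method and accounts for withdrawals still receives 3 points.
print(jadad_score(True, "appropriate", False, None, True))  # -> 3
```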

The likelihood of bias in the RCTs was also assessed by determining whether allocation of individuals to the different study groups had been concealed until after consent was obtained from prospective participants to be part of the study (question 40 in Appendix E). Studies in which allocation was unclear or inadequately concealed have been shown to exaggerate the estimates of the effects of interventions, on average, by more than 35 percent (Moher, Pham, Jones et al., 1998; Schulz, Chalmers, Hayes et al., 1995). It is important to recognize the difference between biases that result from a lack of allocation concealment and biases that arise from a lack of blinding. Allocation concealment helps to prevent selection bias, protects the randomization sequence before and until the interventions are given to study participants, and can always be implemented. Blinding helps prevent ascertainment bias, protects the randomization sequence after allocation, and cannot always be implemented (Schulz, Chalmers, Hayes et al., 1995).

In addition, information on association between the authors and investigators and the pharmaceutical or related industry was also sought (question 41 in Appendix E). It has been shown that reports of trials sponsored by pharmaceutical companies are more likely to favor the experimental intervention over controls than trials not sponsored by pharmaceutical companies (Bero and Rennie, 1996; Cho and Bero, 1996).

Clinically Relevant Elements Extracted From the Studies That Are Not Supported by Empirical Methodological Studies

The 20 remaining elements referred to patient characteristics (questions 18, 24, 25, 26, and 32 in Appendix E), sampling issues (questions 14, 15, 17, 19, 21, 22, and 31), diagnosis (questions 28, 29, and 30) and treatment issues (questions 23, 34, 35, and 36). The following is a brief description of these elements and the theoretical effects that they may have on the validity and applicability of research on the treatment of ADHD:

Patient Characteristics

The outcome of the interventions may vary as a function of age (question 24 in Appendix E), gender (question 25), intellectual abilities (question 26), and family characteristics (question 32).

In addition, findings in a specific ethnic group may not apply to others (question 18). Professionals must determine whether the patients in the sample employed in a study are comparable to those patients to whom the intervention will be applied.

Sampling Issues

  • Number of Eligible Patients (question 14 in Appendix E): This constitutes the sampling frame of subjects available for study. If more subjects are eligible than needed, probability sampling should be used to identify study participants so that the study findings are generalizable to a defined group. In addition, the investigator should report on the response to enlistment, including the number of subjects who decline and the number who accept the invitation to participate. The ratio of these two groups is a good indicator of the level of acceptability associated with the treatment options. In studies where a high percentage of subjects decline, the possibility exists that treatment is applicable only to a very small group of patients with distinctive features.

  • Number of Patients Randomized and Analyzed (questions 15, 17): The proper denominator for evaluating treatment effectiveness is the number of patients randomized. Subject withdrawals and losses to followup often lead to fewer subjects for analysis. The magnitude and distribution of these losses across treatment groups will have important implications for understanding treatment acceptability and effectiveness. In the absence of this information, study results are virtually uninterpretable.
  • Treatment Setting (question 19): The study location (treatment setting) provides the context for carrying out an investigation. Embedded in context are many features (e.g., professional affiliation of the investigators; reputation, acceptability, and accessibility of the facility) that may have a bearing on both the characteristics of the subjects within a study catchment area and the acceptability and effectiveness of treatment.
  • Sample Origin (question 31): Subjects for treatment studies may come from clinic (inpatient and outpatient) and/or nonclinic populations. The origin of the sample is likely to have a strong bearing on the severity and complexity of the cases being treated and their prognosis. This influence may be independent of the diagnostic criteria used to determine subject eligibility. This information is important for assessing the applicability and generalizability of study findings.
  • Inclusion and Exclusion Criteria (questions 21, 22): Inclusion and exclusion criteria are used to determine patient eligibility for study enlistment. These criteria define the "target population" or the subjects for whom the study results are intended to apply. In the absence of these criteria, it is very difficult, if not impossible, to assess the limits of generalizability for a particular study. It also leaves the user of clinical studies in a quandary about the extent to which study findings apply to his or her patients.

Diagnosis-Related Issues

  • Diagnostic Model Used (question 28 in Appendix E): Diagnostic criteria for ADHD have evolved over time. Various diagnostic models may well define samples that differ in natural history, severity, comorbidity, ADHD subtype, and response to treatment.

  • Comorbid Conditions (question 29, divided into two components, a and b): Information was extracted on whether comorbid disorders had been considered by the authors (question 29a) and on whether the patients included in the studies had comorbid disorders (question 29b). Emphasis was placed on ODD, CD, Tourette's syndrome, anxiety disorder, depressive disorder, learning disorder, and mental retardation. Information on the presence of comorbid disorders is important in order to judge the generalizability of the results of a study to a particular clinical setting. In addition, comorbid disorders may be associated with different responses to treatment or different levels of adherence to treatment.
  • Individual(s) Who Made the Diagnosis (question 30): In general, low agreement occurs among informants on the presence of the core symptoms of ADHD. Current diagnostic models accord considerable importance to evidence for pervasive symptoms. Reliance on a single informant may generate biased study samples differing in severity or comorbidity from those generated by other informants. For example, teacher-rated ADHD is more strongly associated with academic achievement than is parent-rated ADHD (Szatmari, Offord, and Boyle, 1989).

Treatment-Related Issues

  • Identification of the Primary Outcome (question 23 in Appendix E): If the primary outcome is not specified a priori (or not specified at all) and all outcomes in a study are treated alike, authors are more likely to highlight those with the most striking results. In addition, the more outcomes that are analyzed, the greater the risk of finding false-positive, statistically significant results merely by chance (a brief numerical illustration follows this list). Identification of the primary outcome is also an essential step in estimating the power of the study to detect true-negative results.

  • Fidelity and Monitoring of Treatment (question 34): Fidelity reflects the extent to which treatment is delivered correctly. Fidelity also ensures that interventions that are not part of the treatment protocol are not inadvertently administered. Fidelity can be enhanced by training the professionals who administer treatment, conducting treatment according to treatment manuals, self-monitoring treatment administration, and conducting independent adherence checks. Monitoring and reporting fidelity allows the reader to determine whether potentially effective treatments were fairly tested. The treatment manuals developed to support clinical trials can assist the dissemination of effective treatments to community practitioners.
  • Measurement of Compliance with Treatment (question 35): Compliance reflects the extent to which patients correctly carry out treatment plans. Compliance might require the timely administration of correct doses of medication, the completion of parent training homework projects, or the consistent application of classroom behavior management strategies by teachers.
  • Availability of Baseline Test Scores in the Report (question 36): Baseline test scores permit identification of atypical samples (severity, comorbidity) and evaluation of the comparability of the subjects in various study groups at the outset.
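The numerical illustration referred to above is a simple back-of-the-envelope calculation, not taken from the report: if k independent outcomes are each tested at the conventional 0.05 level, the chance of at least one spurious statistically significant finding is 1 - (1 - 0.05)^k.

```python
# Chance of at least one false-positive finding among k independent outcomes,
# each tested at alpha = 0.05.
alpha = 0.05
for k in (1, 5, 10, 20):
    print(f"{k:2d} outcomes -> {1 - (1 - alpha) ** k:.2f}")
# Ten independent outcomes already carry a roughly 40 percent chance of at least
# one "significant" result arising by chance alone.
```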

In addition to the methodological information, data were also extracted on each of the arms, outcomes, and tests used in each of the studies (Appendix E, Study Arm Form and Outcome/Adverse Effects Form). The outcomes of interest selected a priori by the panel of experts (led by the AAP Task Force) included the following:

  • Core/Global "Symptoms": included global assessment of all symptoms, global assessment of core symptoms, and function performance.
  • Individual Core "Symptoms": included separate data for inattention (including inattention itself or listening), hyperactivity, and impulsivity (including impulsivity and self-control-related outcomes). Although these are often described as "symptoms," they are really "signs" of ADHD, as they are observed phenomena rather than subjective experiences of the patients. To conform to tradition, however, the term "symptom" will be used throughout the report.
  • School/Academic Performance: included achievement tests, grades, verbal skills, reading, mathematics, spelling, and measures of social competence.
  • Depression/Anxiety-Related Outcomes: included measures of depression and anxiety, as well as emotional well-being, crying, sadness, global mood, and self-esteem.
  • Conduct/Oppositional-Disorder-Related Outcomes: included specific measures for ODD, CD, aggressiveness, and other behavior disturbances.
  • Adverse Effects: as described above in the Inclusion and Exclusion Criteria section.

Information on these outcomes was extracted regardless of the instrument or test used by the researchers within each of the studies. Only information gathered while participants remained in their randomized groups was extracted; no data gathered after discontinuation of therapy were sought.

It is important to stress at this point that standards of quality in research are constantly being raised and that the rigor of the studies included in this report was assessed using current parameters. Early research that may be judged susceptible to bias today could have been regarded as state of the art at the time it was conducted or published.

Data Synthesis

Descriptive statistics were calculated for all the fields of the database. Evidence tables were constructed to describe the most salient characteristics of the eligible studies. These tables, which can be found after the References in this report, summarize, globally and category by category, all the information that was extracted.

Within each of the categories, four sets of tables were produced. The first three sets were regarded as the main evidence tables to support the information included in the Findings chapter, whereas the fourth was regarded as a separate source of information for readers interested in the data extracted in a very detailed fashion. The sets of tables have the following general structure:

  • The first set contains a summary of the key characteristics of the studies reviewed, including the name of the first author, the year of publication of the study, the type of study design, the number of patients randomized (or the number of patients analyzed, if the number of patients randomized was not available), the diagnosis model, the interventions studied, the duration of exposure to each of the interventions, the quality scores, the number of elements extracted from the articles among those regarded as essential to judge the validity of each of the studies but that were not supported by empirical evidence (20 in total), the outcomes of interest measured (following the list provided above), and the key results for the outcomes of interest (expressed in terms of statistical significance set at p<0.05). Data on outcomes other than those identified a priori as being of interest were not reported in these tables. They were mentioned in the text of the report when they had clinical significance.
  • The second set of tables includes a summary of the presence or absence in each of the studies of each of the 20 elements extracted from the articles that were regarded as essential to judge the validity of each of the studies but that were not supported by empirical evidence. The last column of these tables provides a count of these elements by article. This count should not be regarded as a quality score because no evidence exists on the relative weights of each of these elements; the counts were included to provide the reader with additional information to guide his or her decisions. The cells of all the other columns contain only 1 or 0, indicating whether the element was or was not present in the article, respectively (a brief sketch of this layout follows the list).
  • The third set of tables has the same structure as the second set, but instead of 1 and 0, the cells include the actual information provided in the study reports and extracted as part of the review.
  • The fourth set of tables focuses on the actual results of the studies. It contains the name of the first author, the year of publication, the interventions, the tests used to measure the outcomes, data on each of the outcomes of interest measured with each of the tests, and the statistical significance (p<0.05) of the results on the outcomes of interest across the study groups. This table, which was requested by the AAP subcommittee, was subdivided by group of outcomes to help establish whether the effect of the interventions varies by outcome (e.g., intervention A may improve core symptoms but not academic achievement). One limitation for the production and interpretation of this set of tables derives from the fact that many tests were used to measure the outcomes and that even when the same test was used, the authors often made modifications that were not clearly described in the reports. Because of the magnitude and complementary nature of the tables, they are included in the report as a separate section. To make the data extracted easier to understand, the information on the studies was organized in the tables following an alphabetical order. The studies are also presented in alphabetical order within the text. In addition, every effort was made to describe the studies according to their methodological quality, taking into account both the evidence-based criteria and the count of the elements that were not supported by empirical evidence.
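The sketch below illustrates the layout of the second set of tables described above: one row per study, a 1/0 cell for each of the 20 consensus-based elements, and a final column with the simple count. The study names and element labels are placeholders, and this is not the software used to produce the evidence tables.

```python
# Hypothetical rows: 1/0 flags for the 20 consensus-based elements.
elements = [f"element_{i:02d}" for i in range(1, 21)]
table = {
    "Example study, 1995": dict.fromkeys(elements, 1),  # all 20 elements reported
    "Example study, 1997": {e: (1 if i < 12 else 0) for i, e in enumerate(elements)},  # 12 of 20
}
for study, flags in table.items():
    count = sum(flags.values())  # last column: a count, not a quality score
    print(study, [flags[e] for e in elements], count)
```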

The local research team, in consultation with members of the partner organizations and the TOO, evaluated the overall quantity and quality of the data available. This evaluation led to the conclusion that meta-analysis would be inappropriate to summarize the evidence on each of the research questions or for each of the main categories of interest. The main reasons for this decision were substantial clinical heterogeneity across the studies (e.g., therapies evaluated, patient populations, duration), inconsistency in outcome measurements, low methodological quality, and incomplete data reporting (see detailed descriptions within each category). The use of meta-analysis to synthesize this type of data has been associated with a greater chance of obtaining imprecise and potentially misleading results (Ioannidis, Cappelleri, and Lau, 1998). Therefore, this report represents a systematic qualitative review of the existing evidence, emphasizing the implications for clinical practice and the directions that future researchers could take to fill existing knowledge gaps.

Every effort was made to present the information obtained in each of the categories following a uniform format. Overall, each section begins with a general description of the most salient characteristics of the studies and ends with a summary of the main findings of the studies. The way in which the results from the individual studies are presented varies substantially across sections. This reflects the different perspectives that the questions provide on the treatment of ADHD. For instance, two of the sections focus on pharmacological interventions (e.g., drug-to-drug comparisons, and antidepressants vs. placebo), whereas others focus on combined interventions, length of therapy, specific populations (e.g., adults), and outcomes (e.g., adverse effects). The tables complementing the text in each of the sections, as mentioned above, follow the same format.

Consultation With Partners

The production of the evidence report involved continuous communication among members of the local research team, the TOO, and representatives of the partner organizations. Three face-to-face meetings were held, and communication by telephone, fax, and electronic mail was frequent. The partner organizations provided input during each of the steps of the process, from the formulation of the questions through the literature search and data extraction, to data synthesis.

Peer-Review Process

Potential peer-reviewers were identified in April 1998. Thirty-eight individuals were approached by the San Francisco Cochrane Center and asked if they would be willing to review this evidence report. Twenty of these people responded positively to this request, and the report was sent out to them in early July. The potential reviewers included 14 content experts (practicing physicians, psychologists, researchers, representatives of organizations interested in the treatment of ADHD), 3 consumer representatives, and 3 methodologists. All reviewers were sent a "Structured Format for Referee's Comments" (Appendix G) and encouraged to provide comments on the text. The comments were returned to our Criticism Editor, Dr. Patricia Huston, who synthesized them for the McMaster EPC. All comments were shared with the Task Order Officer. All comments were reviewed and, where feasible, incorporated into the final report. Comments were received from 15 of the 20 reviewers who were invited to comment (Appendix G).
