Clinical effectiveness: overview of included studies

Amy O’Donnell; Catherine McParlin; Stephen C Robson; Fiona Beyer; Eoin Moloney; Andrew Bryant; Jennifer Bradley; Colin Muirhead; Catherine Nelson-Piercy; Dorothy Newbury-Birch; Justine Norman; Emma Simpson; Brian Swallow; Laura Yates; Luke Vale

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

O’Donnell A, McParlin C, Robson SC, et al. Treatments for hyperemesis gravidarum and nausea and vomiting in pregnancy: a systematic review and economic assessment. Southampton (UK): NIHR Journals Library; 2016 Oct. (Health Technology Assessment, No. 20.74.)


Chapter 3 Clinical effectiveness: overview of included studies

Studies identified

A flow chart of the studies is shown in Figure 2. In total, 11,830 papers were identified from the combination of standard electronic databases (n = 11,659), specialist Chinese databases (n = 102) and various sources of grey literature (n = 69). Of these, 5152 duplicate papers were identified and deleted (5150 from the standard electronic databases, and two from the grey literature).

FIGURE 2

Flow chart of clinical effectiveness literature.

The deletion of duplicate papers left 6678 individual papers for assessment. After screening titles and abstracts, 322 papers were identified as of potential relevance and full-text copies of 309 papers were obtained (with the remainder unobtainable). Of these, 96 were judged ineligible for the effectiveness review and immediately excluded (narrative overviews, systematic literature reviews or economic evaluations). After the second exclusion process, comprising more detailed reading of each full-text paper, a further 138 papers were judged not to meet the inclusion criteria of the review and were also excluded.

Key reasons for exclusion were that the paper duplicated one already included; that the participant inclusion criteria of the study were judged not relevant to our review; that the study did not report any of the pre-specified outcomes; or that the study design was ineligible (no comparator group).

As a result, 75 papers were identified for data extraction, from a total of 73 separate studies. A full list of included studies is provided as Appendix 5. A table of excluded studies detailing reasons for exclusion is provided as Appendix 6.
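The counts reported in this section can be cross-checked with simple arithmetic. The following Python sketch uses only figures stated above; the variable names are illustrative and are not part of the review, and the number of unobtainable papers is derived here as the difference between the two reported totals.

```python
# Study-flow counts as reported in this section (variable names are illustrative).
identified = {
    "standard_databases": 11_659,
    "chinese_databases": 102,
    "grey_literature": 69,
}
duplicates = {"standard_databases": 5_150, "grey_literature": 2}

total_identified = sum(identified.values())
total_duplicates = sum(duplicates.values())
unique_papers = total_identified - total_duplicates

potentially_relevant = 322   # after title and abstract screening
full_text_obtained = 309
unobtainable = potentially_relevant - full_text_obtained  # derived, not reported

excluded_first_pass = 96     # reviews, overviews, economic evaluations
excluded_second_pass = 138   # did not meet inclusion criteria on full reading
included_papers = full_text_obtained - excluded_first_pass - excluded_second_pass

print(total_identified, total_duplicates, unique_papers, unobtainable, included_papers)
# prints: 11830 5152 6678 13 75
```

The derived totals agree with those given in the text (11,830 identified, 5152 duplicates, 6678 unique papers and 75 papers for data extraction).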

Quality of included studies

Randomised controlled trials

Overall risk of bias

The results of the quality assessment procedure for the 64 included RCTs (reported in 66 papers) are displayed in Figure 3 and Table 4. There was variation both in terms of the quality of the studies and the quality of the reporting. In a large number of papers, there was insufficient detail provided to permit clear judgement of risk of bias in a range of key areas. Overall, 33 RCTs were classed as having low within-study risk of bias, 11 RCTs were classed as having high within-study risk of bias, and the remainder (n = 20) were classed as unclear in this respect. The high proportion of studies at unclear risk of bias was due to poor reporting and a lack of detail, particularly in the methods section. There were also a number of publications in abstract form only. As an unclear judgement was often due to poor reporting rather than specific methodological concerns, it was not judged appropriate to categorise these studies with those deemed at high risk of bias as a result of more serious methodological flaws. Our approach to the assessment of the overall risk of bias within individual studies is described in more detail in Chapter 2, Risk of bias in included studies and quality assessment. More detail is provided below to illustrate the range in quality in terms of each individual component of the Cochrane risk of bias tool.⁴⁶

FIGURE 3

Risk of bias graph: review authors’ judgements about each risk of bias item presented as percentages across all included RCT studies.

TABLE 4

Risk of bias summary: review authors’ judgements about each risk of bias item for included RCTs

Random sequence generation

The risk of bias arising from the method of generation of the allocation sequence was low in 39 of the included RCTs.¹³,⁴¹,⁵⁷,⁶⁰,⁶²–⁶⁴,⁶⁶,⁶⁷,⁶⁹,⁷⁰,⁷⁶,⁸⁰–⁸³,⁸⁵–⁸⁹,⁹¹,⁹³,⁹⁴,⁹⁸–¹⁰⁴,¹⁰⁶–¹⁰⁸,¹¹⁰,¹¹²–¹¹⁵ Methods employed included random number tables, computer-generated sequences⁶¹,⁶³,⁶⁴,⁸¹–⁸³,⁸⁸,⁹³,⁹⁴,⁹⁸,⁹⁹,¹⁰⁶,¹⁰⁸,¹¹²,¹¹⁴ and randomised block design.⁶²,⁶⁷,⁸⁰,⁸⁷,⁸⁹,⁹¹,¹⁰¹,¹⁰²,¹⁰⁷,¹¹³,¹¹⁵,¹¹⁷ One trial was classed as high risk because women were asked to draw an envelope from a box of envelopes that had the same appearance but different contents.¹¹¹ Risk of bias was categorised as unclear in the remaining 24 RCTs due to insufficient information provided by the authors to permit judgement either way.⁴²,⁴³,⁵⁸,⁵⁹,⁶⁵,⁶⁸,⁷¹–⁷⁵,⁷⁷–⁷⁹,⁸⁴,⁹⁰,⁹²,⁹⁵–⁹⁷,¹⁰⁴,¹⁰⁵,¹⁰⁹,¹¹⁶

Allocation concealment

Thirty studies employed allocation concealment methods judged to carry low risk of bias, such as the use of sequentially numbered sealed opaque envelopes containing allocation assignment.⁵⁷,⁶⁰,⁶¹,⁶³,⁶⁴,⁶⁶,⁶⁷,⁷³,⁷⁶,⁷⁸,⁸⁰–⁸⁵,⁸⁸,⁸⁹,⁹³,⁹⁸,⁹⁹,¹⁰²,¹⁰³,¹⁰⁵–¹⁰⁸,¹¹²,¹¹⁸ Thirty studies did not provide sufficient information to allow a judgement of low or high risk and were therefore classed as unclear.¹³,⁴¹–⁴³,⁵⁸,⁵⁹,⁶²,⁶⁵,⁶⁸–⁷¹,⁷⁴,⁷⁵,⁷⁷,⁷⁹,⁹⁰,⁹¹,⁹⁴–⁹⁷,¹⁰⁰,¹⁰⁴,¹⁰⁹,¹¹¹,¹¹³,¹¹⁴,¹¹⁶,¹¹⁷ The remaining four RCTs were judged as having high risk of allocation concealment bias.⁷²,⁸⁷,⁹²,¹¹⁵ For example, one study stated that patients were randomly divided into two groups by those involved in the study,⁷² while in the others the nature of the intervention being tested meant that it was not possible to conceal allocation.⁸⁷,⁹²,¹¹⁵

Blinding of participants, personnel and outcome assessors

Of the included RCT studies, 32 were judged to have low risk of bias in relation to the blinding of participants and other personnel involved in the trial,⁴¹,⁵⁷,⁵⁹–⁶⁴,⁶⁷,⁷¹,⁷²,⁷⁴,⁷⁹,⁸¹–⁸⁵,⁹⁰,⁹³,⁹⁹,¹⁰⁰,¹⁰²,¹⁰³,¹⁰⁵,¹⁰⁶,¹⁰⁸,¹¹⁰,¹¹²,¹¹³,¹¹⁶,¹¹⁷ generally through the provision of medication in identical formats for both active and placebo. Sixteen studies were judged to have high risk of bias in this respect, for example due to clear differences in the appearance, dosage rates or mode of delivery between intervention and placebo comparator, or as a result of evidence that the research staff involved were aware of allocation status.¹³,⁵⁸,⁶⁶,⁶⁸,⁷⁵,⁷⁷,⁷⁸,⁸⁰,⁸⁷–⁸⁹,⁹¹,⁹²,⁹⁶,¹⁰¹,¹¹⁵ In some instances, however, despite lack of blinding, the nature of the intervention meant that this was not relevant; for example, in the study by McParlin and colleagues,⁸⁸ blinding of participants and staff was not possible as the packages of care delivered to the intervention and control groups varied in content. However, it is important to highlight that although it might not have been possible to blind patients or clinicians, outcome assessors and analysts handling the resultant data may nevertheless have been blinded. The remaining 16 studies did not provide sufficient information to permit a judgement of low or high risk of bias, often due to imprecise, poor reporting, and were thus classed as unclear.⁴²,⁴³,⁶⁵,⁶⁹,⁷⁰,⁷³,⁷⁶,⁹⁴,⁹⁵,⁹⁷,⁹⁸,¹⁰⁴,¹⁰⁷,¹⁰⁹,¹¹¹,¹¹⁴

Incomplete outcome data

Most studies (n = 50) were judged as carrying low risk of bias in relation to this component.¹³,⁴¹,⁴²,⁵⁷,⁵⁹–⁶¹,⁶³–⁶⁵,⁶⁷,⁷⁰–⁷²,⁷⁴–⁷⁶,⁷⁸,⁸⁰–⁸⁵,⁸⁷–⁸⁹,⁹¹–⁹⁴,⁹⁶–⁹⁹,¹⁰²–¹⁰⁸,¹¹⁰,¹¹²–¹¹⁸ Although published protocols were rarely available, in these studies either all data for the primary outcomes pre-specified in the paper were reported for all randomised participants, or rates of drop-out were sufficiently low (< 20%) or proportionately comparable between groups, so that missing data were not considered likely to result in a clinically relevant bias. Three studies displayed a high risk of bias in this regard, all as a result of high numbers of participant drop-outs.⁶²,⁶⁸,¹¹¹ The remainder (11 studies in total) were judged as unclear due to lack of sufficient information.⁴³,⁵⁸,⁶⁶,⁶⁹,⁷³,⁷⁷,⁷⁹,⁹⁰,⁹⁵,¹⁰⁰,¹⁰⁹

Selective outcome reporting

Six studies were judged as having high risk of bias in terms of selective outcome reporting, due to either not reporting data for pre-specified outcomes or reporting data in the results that were not pre-specified in either the original study protocol or methods section.⁸⁷,⁹⁰,⁹⁴,¹¹³–¹¹⁵ Forty-five studies were classed as having low risk of bias, with all outcomes specified and subsequently reported.¹³,⁴¹,⁴³,⁵⁷,⁵⁹–⁶⁷,⁷⁰–⁷²,⁷⁴–⁷⁶,⁷⁸,⁸⁰,⁸¹,⁸³–⁸⁵,⁸⁸,⁸⁹,⁹¹–⁹³,⁹⁶–⁹⁹,¹⁰¹–¹⁰⁴,¹⁰⁶–¹⁰⁸,¹¹⁰,¹¹²,¹¹⁶,¹¹⁷ Risk of bias was judged as unclear for the final 13 studies.⁴²,⁵⁸,⁶⁸,⁶⁹,⁷³,⁷⁷,⁷⁹,⁸²,⁹⁵,¹⁰⁰,¹⁰⁵,¹⁰⁹,¹¹¹

Other sources of bias

Twenty of the included RCT studies were judged as having low risk of bias in this area.¹³,⁴¹,⁴²,⁶¹,⁶²,⁶⁴,⁶⁷,⁷⁰,⁷¹,⁷⁴,⁸¹,⁸³,⁸⁸,⁹⁶,¹⁰¹–¹⁰³,¹⁰⁶,¹¹⁰,¹¹² However, a substantial number (n = 44) were classed as unclear, due to lack of sufficient information in the paper to permit detailed assessment of whether or not an important risk of bias existed, or due to insufficient rationale or evidence that an identified problem had introduced serious levels of bias to the study.⁴³,⁵⁷–⁶⁰,⁶⁵,⁶⁶,⁶⁸,⁶⁹,⁷²,⁷³,⁷⁵–⁸⁰,⁸²,⁸⁴,⁸⁵,⁸⁷,⁸⁹–⁹⁵,⁹⁷–¹⁰⁰,¹⁰⁴,¹⁰⁵,¹⁰⁷–¹⁰⁹,¹¹¹,¹¹³–¹¹⁷,¹¹⁹ For example, in one paper,⁷⁶ lack of reporting of full results for the control group resulted in an unclear judgement in this area.
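As an illustration of how the per-domain judgements reported in the subsections above translate into the percentages of the kind shown in a risk of bias graph, the following Python sketch tabulates the stated counts and checks that each domain accounts for all 64 RCTs. The dictionary layout and the function name `percentages` are illustrative assumptions; the zero count for high risk under other sources of bias is implied by the text (20 low plus 44 unclear) rather than stated.

```python
N_RCTS = 64  # number of included RCTs, as reported in this chapter

# Counts of low / unclear / high judgements per domain, taken from the text above.
judgements = {
    "Random sequence generation": {"low": 39, "unclear": 24, "high": 1},
    "Allocation concealment": {"low": 30, "unclear": 30, "high": 4},
    "Blinding": {"low": 32, "unclear": 16, "high": 16},
    "Incomplete outcome data": {"low": 50, "unclear": 11, "high": 3},
    "Selective outcome reporting": {"low": 45, "unclear": 13, "high": 6},
    # High-risk count of 0 is implied (20 + 44 = 64), not stated explicitly.
    "Other sources of bias": {"low": 20, "unclear": 44, "high": 0},
}


def percentages(counts, total=N_RCTS):
    """Convert judgement counts to whole-number percentages of all RCTs."""
    return {level: round(100 * n / total) for level, n in counts.items()}


for domain, counts in judgements.items():
    # Every RCT should receive exactly one judgement per domain.
    assert sum(counts.values()) == N_RCTS, domain
    pct = percentages(counts)
    print(f"{domain:<28} low {pct['low']:>3}%  "
          f"unclear {pct['unclear']:>3}%  high {pct['high']:>3}%")
```

Because the percentages are rounded independently, the three values for a domain may not always sum to exactly 100.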

Case series studies

The nine case series or non-randomised studies were quality assessed using the component-based EPHPP tool,⁴⁷ which appraises studies on the basis of six core components, rated 1–4 (where 1 is deemed to be the highest quality of study). These components are selection bias; strength of overall study design; extent to which confounders were identified and controlled for in the study; blinding of participants and/or research personnel; approach to data collection; and rate of withdrawals/drop-outs from the study. As shown in Table 5, all studies were judged as weak in terms of quality (which corresponds to a high risk of bias judgement using the standard Cochrane approach for RCTs).

TABLE 5

Study quality summary: review authors’ judgements about each risk of bias item for each included case series or non-randomised study

Interventions and comparators

The included studies were grouped into the three broad groups of interventions outlined in Chapter 1: patient-initiated first-line interventions; clinician-prescribed second-line interventions; and clinician-prescribed third-line interventions. It should be noted that, for patient-initiated first-line interventions, the only studies identified that could be classified as lifestyle interventions were those which trialled ginger preparations and/or vitamin B6. No studies of dietary- or hypnotherapy-based interventions were identified. However, studies of a number of novel therapies not covered by our original review protocol were identified, namely the use of aromatherapy, transdermal clonidine and gabapentin. The studies comprising the evidence base for each group of interventions are detailed in Table 6. Note that all studies are two-arm RCTs unless otherwise stated.

TABLE 6

Number of studies by intervention and comparator

In addition, the network plot (Figure 4) shows the range of interventions from all comparative studies included in the review. Individual interventions have been grouped where appropriate.

FIGURE 4

Network plot of range of interventions and comparisons for NVP/HG. Size of node is proportional to frequency of intervention and width of line to frequency of comparisons between two interventions.

The size of the nodes in the network plot is proportional to the frequency of the intervention in the review, and the width of the lines indicates the frequency of the comparisons made between two interventions. These nodes and lines, however, do not represent the weight of evidence in the review as this would also be influenced by sample size and the precision of estimates, as well as other factors. The plot did not include a trial on pre-emptive treatment with doxylamine/pyridoxine combination, outpatient versus inpatient care¹¹⁷,¹²⁷ or two four-arm trials,⁶⁸,¹⁰¹ which would have over-reported the number of comparisons in the network plot. The interventions in the four-arm trials were dietary instructions only, or together with either placebo, antihistamines or antihistamine/vitamin B6 combination in one trial⁶⁸ and traditional acupuncture, P6 acupuncture, placebo or no acupuncture in the other trial.¹⁰¹ Ginger, vitamin B6, antihistamines, acupressure, metoclopramide, corticosteroids, doxylamine/pyridoxine combination and the serotonin antagonist ondansetron are more widely reported than other interventions, but there is also information on interventions such as acupuncture, nerve stimulation therapy and aromatherapy oils which have been considered as treatments for NVP/HG. Evidence on the effects of interventions such as Chinese herbal medicine, dextrose saline, transdermal clonidine and diazepam is very limited and in most cases is reported in single trials. As expected, placebo interventions are most widely reported as comparators, and so placebo has the biggest node on the network plot (emphasised by the square node).

The most commonly reported treatment comparisons are ginger capsules versus placebo; acupressure versus placebo; ginger capsules versus vitamin B6 capsules; corticosteroids versus ‘treatment as usual’; metoclopramide versus ondansetron; and acupuncture versus nocebo (nocebo is an inert intervention that creates comparable side effects/harmful effects in a patient, as opposed to a placebo, which is an inert substance that creates either a beneficial response or no response in a patient).
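To make the construction of the network plot concrete, the following Python sketch shows one way in which node sizes and line widths of this kind could be derived from a list of two-arm comparisons. The list of trials is hypothetical and is not the review's data, and the function name `network_weights` is an illustrative assumption. As noted above, these frequencies describe how often interventions and comparisons appear; they do not reflect sample size or the precision of estimates.

```python
from collections import Counter

# Hypothetical two-arm comparisons, one tuple per trial (NOT the review's data).
trials = [
    ("ginger", "placebo"),
    ("ginger", "placebo"),
    ("ginger", "vitamin B6"),
    ("acupressure", "placebo"),
    ("metoclopramide", "ondansetron"),
]


def network_weights(comparisons):
    """Return (node_counts, edge_counts) for a network plot.

    node_counts: number of trials in which each intervention appears
                 (drives node size).
    edge_counts: number of trials making each pairwise comparison
                 (drives line width); pairs are treated as unordered.
    """
    nodes = Counter()
    edges = Counter()
    for arm_a, arm_b in comparisons:
        nodes.update([arm_a, arm_b])
        edges[frozenset((arm_a, arm_b))] += 1
    return nodes, edges


nodes, edges = network_weights(trials)
print(nodes.most_common(2))
print(edges[frozenset(("ginger", "placebo"))])
```

In this hypothetical input, placebo and ginger each appear in three trials and so would have the largest nodes, while the ginger versus placebo line would be the widest.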

Participants and symptom severity

In addition to substantial variation in terms of the range of interventions and comparators evident within the literature, it is also important to highlight the heterogeneity of symptom severity found among patient populations.

It was initially intended that as part of this review only studies that recruited women with severe NVP or HG would be included. However, assessment of symptom severity varied within and across studies, and it was not possible to easily place every participant population into categories. We therefore attempted to categorise the symptom severity of participants for each study, using the description of severity in the inclusion criteria and, if available, any severity score given at baseline. These two items of information were assessed by two independent assessors (CMP and SCR) to assign severity as mild, moderate, severe or unclear. Agreement was reached for all but one study, which was classified as unclear.

This classification was then used in each results chapter to describe symptoms and outcomes in terms of severity.

Outcome measures

Finally, and linked to the issues discussed above, the identified literature in this field was also characterised by the range of symptom severity scales employed from study to study to assess intervention outcomes. Out of the 73 included studies (reported in 75 papers), only 23 used validated NVP/HG assessment scales such as PUQE (10 studies), RINVR (11 studies) or the McGill Nausea Questionnaire (one study). Thirty-one studies assessed nausea and/or vomiting severity using a 10-point VAS. Twenty-one studies employed either a study-specific, non-validated author-defined assessment scale (including, for example, numbers of episodes of vomiting combined with the use of a Likert scale to assess subjective feelings of symptom severity among participants), or used the various proxy measures of symptom severity outlined in our protocol [e.g. percentage weight loss, length of hospital stay, or hospital (re-)admission episodes]. Table 7 illustrates the primary symptom severity outcome measures employed by each included study.

TABLE 7

Validated and non-validated symptom severity measures employed by each included study

Additional sources of outcome data on medications

The UKTIS is currently commissioned by Public Health England to provide advice to UK health professionals on the fetal effects of therapeutic, poisoning and chemical exposures in pregnancy, and to conduct surveillance of known and emerging teratogens. The UKTIS database currently contains a record of just under 60,000 enquiries dating back to 1978, of which 320 relate to use of specific drugs in the treatment of HG (period of enquiry 18 June 1978 to 18 March 2014). Surveillance data collected by the UKTIS are reviewed periodically and published in UKTIS monographs through the National Poisons Information Service database (www.TOXBASE.org). Data collected by the UKTIS in relation to medications for NVP/HG, including specific monograph data on ginger, vitamin B6, vitamin B12, promethazine and olanzapine, are provided in Table 43, Appendix 7 for information.

Meta-analysis of included randomised controlled trials

As highlighted in the previous sections, there was wide variation across studies. Specifically, there was considerable heterogeneity between interventions within each of the categories of comparisons, and in terms of how interventions were administered/delivered. The measurement of outcomes also differed substantially between trials reporting the same comparisons, so in most cases the trials were not directly comparable. In a meta-analysis it is important not to combine outcomes that are too diverse; even if it had been possible to extract data for a meta-analysis, such an analysis would have been likely to produce misleading results due to the considerable heterogeneity between studies.⁴⁶ Furthermore, many of these trials were extremely poorly reported and their conduct was often uncertain. In summary, clinical and methodological variations between studies were considerable, and the intervention effect was likely to be affected by the factors that varied across studies. Consequently, we have not conducted a meta-analysis of findings from the RCTs.

Structure of individual results chapters

The following chapters present more detailed findings from the evidence review for each individual intervention. As already indicated, given that it was not possible to meta-analyse the data from individual studies for any group of interventions and comparators, the results are summarised in narrative form. The narrative content of each chapter focuses on the findings from the included studies in terms of their reported effectiveness for addressing our primary outcomes of interest, that is, the key symptoms associated with HG/NVP. Thus, where available, effectiveness is reported in terms of the validated overall HG/NVP assessment scales (PUQE, RINVR or McGill Nausea Questionnaire). Otherwise, the effectiveness of interventions is reported in relation to their impact on the three key symptoms: nausea, vomiting and retching. Data illustrating significant results in relation to these key symptoms are detailed in the narrative text; otherwise, results are described as not significant or not clear. Data for case series studies are not included in the narrative but are available in the accompanying results tables for information. Additional secondary outcome data reported by included studies (see Table 2 for a full list) are presented in Appendix 8.

Copyright © Queen’s Printer and Controller of HMSO 2016. This work was produced by O’Donnell et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK390545
