Thangaratinam S, Rogozińska E, Jolly K, et al. Interventions to Reduce or Prevent Obesity in Pregnant Women: A Systematic Review. Southampton (UK): NIHR Journals Library; 2012 Jul. (Health Technology Assessment, No. 16.31.)

2 Systematic review methods

Protocol development

Systematic reviews of the effectiveness of and harm caused by interventions were carried out using methodology25–27 in line with the recommendations of the NHS Centre for Reviews and Dissemination and the Cochrane Collaboration, including the Cochrane Adverse Effects Subgroup.25–33 The systematic reviews of effectiveness and of adverse effects were carried out simultaneously.

The protocol for this review included the following: a detailed literature search to identify all relevant citations, prioritisation of outcomes relevant to clinical practice by Delphi survey, assessment of the risk of bias for the individual studies and evaluation of the strength of evidence for individual outcomes using GRADE (Grading of Recommendations Assessment, Development and Evaluation) methodology.

Research question

The structured question addressed by the project is given in Table 1.

TABLE 1. The research question addressed by the project.

Methods for effectiveness review

Search strategy

A detailed search of the relevant published and unpublished literature was conducted by constructing a comprehensive search strategy for the effectiveness of dietary and lifestyle interventions in pregnancy. The following databases were searched: MEDLINE, EMBASE, BIOSIS, Latin American and Caribbean Health Sciences Literature (LILACS), Science Citation Index, Cochrane Database of Systematic Reviews (CDSR), Cochrane Central Register of Controlled Trials (CENTRAL), Database of Abstracts of Reviews of Effects (DARE), HTA database and PsycINFO. In addition, information on studies in progress and unpublished research or research reported in the grey literature were sought by searching a range of relevant databases including Inside Conferences, Systems for Information in Grey Literature (SIGLE), Dissertation Abstracts and ClinicalTrials.gov. Internet searches were also carried out using specialist search gateways (such as OMNI: www.omni.ac.uk/), general search engines (such as Google: www.google.co.uk/) and meta-search engines (such as Copernic: www.copernic.com/). The aim was to identify all studies evaluating the effectiveness of interventions for weight management in pregnancy.

The search strategy was designed in a multistep process by combining search terms related to pregnancy and weight. The search was limited by including search filters for ‘human studies’ and ‘study type’ (randomised clinical trials and observational trials without case series and case studies). Existing search strategies or filters, such as the InterTASC Information Specialists' Sub-Group Search Filter Resource, were used to develop the search strategy with some modifications as needed. No further limitations were applied. The detailed search strategy for effectiveness is provided in Appendix 2. MEDLINE and EMBASE were searched from inception to May 2010. Other databases were searched from inception to June 2010. The search was repeated and updated until March 2011. A comprehensive master database of articles was constructed using Reference Manager 12.0® software (Thomson Reuters, New York, NY, USA).

Inclusion criteria

The criteria for inclusion of studies in the effectiveness review are described in the following sections.

Population

Pregnant women expecting one or more than one baby (i.e. twins or triplets) were included. We included women who were of normal weight (BMI 18.5–24.9 kg/m²), overweight (BMI 25–29.9 kg/m²) or obese (BMI ≥ 30 kg/m²). We excluded pregnant women who were underweight (BMI < 18.5 kg/m²).
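The BMI bands above can be expressed as a small eligibility check. This is a minimal sketch rather than anything used in the review: the function names are hypothetical, and treating the bands as continuous (so that, for example, a BMI of 24.95 counts as normal weight) is an assumption, because the source states the bands to one decimal place.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi_value: float) -> str:
    """Map a BMI value to the bands named in the inclusion criteria.

    Assumption: bands are treated as half-open intervals
    [18.5, 25), [25, 30) and [30, inf).
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"


def eligible_population(bmi_value: float) -> bool:
    """Underweight women were excluded; all other bands were included."""
    return bmi_category(bmi_value) != "underweight"
```

For example, `eligible_population(bmi(70, 1.75))` evaluates the criteria for a woman weighing 70 kg with a height of 1.75 m.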

Setting

Any setting including primary care or secondary and tertiary units.

Interventions

We included any dietary, physical activity and behavioural change intervention that has the potential to influence weight change in pregnancy. Studies that evaluated interventions mainly based on dietary advice were classified in the dietary interventions group. Interventions primarily based on physical activities such as swimming, running and aerobic exercise were classified in the physical activity group. The mixed approach interventions group included studies that employed diet and physical activity components that may, or may not, be underpinned by behavioural theory. Table 2 lists the various interventions reviewed.

TABLE 2. Interventions and intervention providers for weight management in pregnancy.

Comparison

The control group consisted of women with no intervention or routine antenatal care. In women with obstetric or medical complications the care provided was appropriate to the condition (e.g. insulin in diabetic women).

Outcomes

The maternal and fetal outcomes included in the review are provided in Table 3.

TABLE 3. Maternal and fetal outcomes evaluated in the review.

Study design

We included randomised controlled trials (RCTs) evaluating the effectiveness of dietary and lifestyle weight management interventions in pregnancy for maternal and fetal outcomes. Non-randomised studies (NRSs) and observational studies (cohort and case–control) were included in the analysis only when the evidence from RCTs was insufficient. Studies that did not provide data to estimate effectiveness measures such as relative risk (RR) or mean difference (MD) were excluded.

Subgroups

The following subgroups were specified a priori and reported in the review:

  • intervention: dietary, physical activity and mixed approach interventions
  • BMI: obese only, obese and overweight and mixed-group populations
  • setting: studies in developed countries and developing countries
  • year of publication: studies published before 1990 and since 1990
  • diabetes in pregnancy
  • responders to the intervention with significant reduction in gestational weight gain.

Study selection

Study selection was conducted in two stages: an initial screening of titles and abstracts against the inclusion criteria to identify potentially relevant papers, followed by screening of the full papers of the identified citations without language restrictions. Two reviewers (ER and SG) independently assessed each citation for inclusion in the review. Any differences in opinion were resolved by discussion and by involving a third reviewer. Further information was sought from the study authors if required. The process of study identification and selection is presented in Figure 2, consistent with the PRISMA guidelines.

Study quality assessment

The studies were classified by study design according to the NICE guidelines algorithm for classifying quantitative study designs.34 Quality assessment was carried out separately for the different study designs (RCTs, NRSs and observational studies).

Randomised controlled trials

We assessed the risk of bias – selection bias, performance bias, measurement bias and attrition bias – in line with the recommendations made in the Cochrane handbook for systematic reviews of interventions.35 Study quality was assessed in six domains: sequence generation, allocation sequence concealment, blinding, incomplete outcome data, selective outcome reporting and other potential sources of bias.

Sequence generation

An adequate sequence generation should describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether or not it should produce comparable groups. The use of a random component was considered to be adequate sequence generation. Systematic methods, such as alternation or assignment based on date of birth, case record number or date of presentation, were considered to be inadequate.

Allocation concealment

A study was categorised as being at low risk of bias for allocation concealment if it described the method used to conceal the allocation sequence in sufficient detail to determine whether intervention allocations could have been foreseen in advance of, or during, enrolment.

The quality of allocation concealment was judged using the following criteria:

  • adequate concealment of allocation, such as telephone randomisation, consecutively numbered sealed opaque envelopes
  • unclear whether adequate concealment of allocation
  • inadequate concealment of allocation such as random number tables, sealed envelopes that are not numbered or opaque.

Where the method of allocation concealment was unclear, whenever possible attempts were made to contact authors to provide further details.

Blinding

A study with adequate blinding described all measures used, if any, to blind study participants and personnel from knowledge of which intervention a participant received. It also provided any information relating to whether or not the intended blinding was effective. In assessing the risk of bias from blinding, we specifically assessed who was and who was not blinded. We also assessed the risk of bias separately for subjective and objective outcomes.

Incomplete outcome data

We evaluated the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. We assessed whether attrition and exclusions were reported, the numbers in each intervention group (compared with the total number of randomised participants), reasons for attrition or exclusions where reported and any reinclusions in the analyses.

A study was considered to be at low risk of bias for missing outcome data when we were confident that the participants included in the analysis were exactly those who were randomised into the trial. The risk of bias was considered to be unclear if the numbers randomised into each intervention group were not clearly reported. A study was labelled as having a high risk of bias for missing outcome data when there was a difference in the proportion of incomplete outcome data across groups and the availability of outcome data was determined by the participants' true outcomes.

Selective outcome reporting

We compared the outcomes reported in each individual study with those reported in the other studies to assess the possibility of selective outcome reporting. The risk of this bias was assessed at the study level.

Other sources of bias

Any other important concerns about bias not addressed in the above domains were highlighted as other sources of bias. The proportions of studies with various risks of bias are shown in Appendix 4. The entries for each domain were marked as ‘Yes’, ‘No’ or ‘Unclear’ as appropriate.

Non-randomised studies

Quality assessment of NRSs was performed using a methodology checklist presented in Appendix 5. The Newcastle–Ottawa scale (NOS) was used to assess the quality of the observational comparative studies with cohort and case–control designs.25 The cohort studies were assessed for the following risks of bias:

  • selection of cohorts regarding the representativeness and selection of the exposed cohort, ascertainment of exposure and that the outcome of interest was not present at the start of study
  • comparability of the cohorts based on methods or analysis
  • assessment of outcome by evaluating the details of outcome assessment, adequacy of length of follow-up for the outcomes to appear and adequacy of follow-up of the cohorts.

The case–control studies were evaluated for the following risks of bias:

  • selection of cases and controls, assessing representativeness and adequate definition of the cases and adequate selection and definition of the controls
  • comparability of the cases and controls
  • ascertainment of exposure, method of ascertaining exposure of the cases and controls and rates of non-response in the groups.

The studies were allocated stars according to the rating. A study can be awarded a maximum of four stars for selection, two for comparability and three for assessment of outcome (cohort studies) or ascertainment of exposure (case–control studies).36
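The star limits described above can be captured in a small record type. This is an illustrative sketch only: the class and field names are hypothetical, and it checks the per-domain maxima (four, two and three stars) without reproducing the individual checklist items of the Newcastle–Ottawa scale.

```python
from dataclasses import dataclass

# Per-domain maxima as stated in the text: 4 + 2 + 3 = 9 stars in total.
MAX_STARS = {"selection": 4, "comparability": 2, "third_domain": 3}


@dataclass(frozen=True)
class NosRating:
    """Star rating for one observational study (hypothetical structure)."""
    selection: int
    comparability: int
    third_domain: int  # the final three-star domain of the scale

    def __post_init__(self) -> None:
        # Reject ratings outside the allowed range for each domain.
        for domain, limit in MAX_STARS.items():
            stars = getattr(self, domain)
            if not 0 <= stars <= limit:
                raise ValueError(f"{domain}: {stars} stars, allowed 0-{limit}")

    @property
    def total(self) -> int:
        return self.selection + self.comparability + self.third_domain
```

A study rated `NosRating(selection=3, comparability=1, third_domain=2)` would carry six of the nine available stars.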

Data extraction

Study clinical characteristics and findings were extracted in duplicate by independent reviewers using predesigned and piloted data extraction forms. Any disagreements were resolved by consensus and/or arbitration involving a third reviewer. Missing information was obtained from investigators if it was crucial to the subsequent analysis. To avoid introducing bias, unpublished information was treated in the same way as published information. In addition to using multiple reviewers to ensure the reproducibility of the overview, sensitivity analyses around important or questionable judgements regarding the inclusion or exclusion of studies, the validity assessments and data extraction were performed. A copy of the data extraction form for the effectiveness review is provided in Appendix 18.

Data synthesis

We calculated pooled RRs with 95% confidence intervals (CIs) for dichotomous data. Continuous data were summarised as MD with standard deviation or median change in relation to the baseline. In the case of missing standard deviations, imputation techniques were used based on Cochrane recommendations.35 Separate analyses were performed on randomised and non-randomised data. Non-randomised data were used for outcomes for which there were no RCTs or a very small number of poor-quality RCTs. The I² statistic was used to assess statistical heterogeneity between trials. In the absence of significant heterogeneity, results were pooled using a fixed-effect model. If substantial heterogeneity was detected (I² > 50%), possible causes were explored and subgroup analyses for the main outcomes performed. Subgroups defined a priori were BMI of the women, type of intervention, responders, publication year (before and after 1980), study quality and setting. Heterogeneity that was not explained by subgroup analyses was modelled using random-effects analysis where appropriate. For outcomes for which meta-analysis was not appropriate, the RCT and NRS results were presented, where possible, on a forest plot but without summary scores, allowing a visual presentation of the effects of each included trial. For observational studies, a narrative summary of the findings was given. Statistical analysis was performed when sufficient data were presented. RevMan version 5.0 (The Cochrane Collaboration, The Nordic Cochrane Centre, Copenhagen, Denmark) was used in the statistical analyses.
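The pooling and heterogeneity steps described above can be sketched as follows. This is not the RevMan implementation: inverse-variance weighting on the log RR scale is an assumption (the source does not state which weighting was configured), the function names are hypothetical, and trials with zero events in either arm are not handled.

```python
import math
from typing import Dict, List, Tuple

# One trial: (events_treated, n_treated, events_control, n_control)
Trial = Tuple[int, int, int, int]


def log_rr(events_t: int, n_t: int, events_c: int, n_c: int) -> Tuple[float, float]:
    """Log relative risk and its variance for a single trial."""
    rr = (events_t / n_t) / (events_c / n_c)
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), variance


def pool_fixed_effect(trials: List[Trial]) -> Dict[str, float]:
    """Inverse-variance fixed-effect pooled RR with 95% CI, Cochran's Q and I^2."""
    effects = [log_rr(*trial) for trial in trials]
    weights = [1 / var for _, var in effects]
    total_weight = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / total_weight
    se = math.sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(trials) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "i2": i2,
    }


def substantial_heterogeneity(i2: float) -> bool:
    """Threshold from the text: I^2 > 50% triggers exploration of causes."""
    return i2 > 50.0
```

When `substantial_heterogeneity` returns true, the text calls for exploring causes and running subgroup analyses first, with random-effects modelling reserved for heterogeneity that remains unexplained.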

Methods for adverse effects review

The review of harm of interventions was undertaken based on recommended methods for systematic reviews, particularly those of observational studies and adverse events, including those of the Cochrane Adverse Effects Subgroup.30,37–39

Search strategy

The scope of the review of adverse effects of any dietary intervention on pregnant women and their children was purposefully kept broad. This was to identify a variety of adverse effects that were previously not known or recognised. In addition to the search for relevant reviews and primary studies on the effectiveness of interventions, including those that were excluded from the analysis of benefit, we evaluated studies that specifically provided details of adverse effects resulting from the dietary and lifestyle interventions and weight loss in pregnancy. We designed a separate search strategy to identify studies on harm by including adverse effects text words and indexing terms in the databases previously described in the section on the effectiveness review. Existing search strategies or filters, such as the InterTASC Information Specialists' Sub-Group Search Filter Resource, were used to develop the search strategy for this review, with some modifications if needed. The search was limited by including search filters for ‘adverse events’, ‘human studies’ and ‘study type’ (exclusion of editorials and letters). The detailed search strategy for adverse effects can be found in Appendix 2. MEDLINE and EMBASE were searched from inception to June 2010. Other databases were searched from inception to July 2010. The search was updated until March 2011.

Inclusion criteria

The criteria for inclusion of studies in the adverse effects review are described in the following sections.

Population

Pregnant women expecting one or more than one baby (i.e. twins or triplets) were included. We included women who were of normal weight (BMI 18.5–24.9 kg/m²), overweight (BMI 25–29.9 kg/m²) or obese (BMI ≥ 30 kg/m²). We excluded pregnant women who were underweight (BMI < 18.5 kg/m²).

Setting

We included studies carried out in any setting including primary care or secondary and tertiary units.

Interventions

Any dietary and physical activity intervention or exposure that has the potential to cause harm to the mother or baby.

Outcomes

We included any clinically significant adverse outcomes in the mother and the child resulting from (1) a dietary intervention or (2) weight change in pregnancy. We also evaluated the most common adverse effects that led to pregnant women discontinuing an intervention.

Study design

Both comparative (RCTs, NRSs and observational studies) and non-comparative studies including case series and case reports were included. This encompassed any publication as an abstract or full text without any language restrictions.

Study selection and quality assessment

Criteria used to assess the quality of studies for the evaluation of adverse effects followed the same concepts as for assessing study quality for effectiveness: assessing risk of bias, inconsistency of results, indirectness of the evidence, imprecision and publication bias. For assessing the risk of bias in estimating adverse event rates associated with weight management interventions in pregnancy24 we took into account existing checklists for the evaluation of randomised and non-randomised studies,39,40 including study design and other features associated with outcome [e.g. small for gestational age (SGA), preterm delivery]. Quality assessment and presentation of results were carried out separately for RCTs, NRSs and observational studies with a control group and for observational studies without a control group (case series, case reports). Additionally, information on weight change per se in mother and baby was also extracted, as this could be associated with adverse event rates or severity. The methodological quality of all eligible data sets (‘risk of bias’) was assessed to investigate internal validity (the extent to which the information is probably free of bias) using the following attributes:41

  • reporting of adverse maternal and fetal outcome definitions to reduce bias in ascertainment of denominator data in the series (any published definition reported vs no definition)
  • adequacy of data source to ascertain a capture of denominator data that is as complete as possible (use of multiple data sources, special surveys or clinical studies vs routine registration enrolment in weight loss programmes, in which adequate attribution of cause of harm has been shown to be questionable for maternal and fetal outcomes, leading to substantial under-reporting)
  • use of a robust approach to ascertain that the cause of harm is a representation of the underlying condition that is as true as possible (confidential enquiries, use of multiple sources of outcome vs no special efforts to confirm cause)
  • a sufficiently high proportion of cases with an attributable cause of harm established (< 5% unclassified).

Data extraction

Methods for study selection and data extraction for the adverse event review were similar to those for the effectiveness review. Study clinical characteristics and findings were extracted in duplicate by independent reviewers using a predesigned and piloted data extraction form (see Appendix 19). Any disagreements were resolved by consensus and/or arbitration involving a third reviewer. Missing information was obtained from investigators if it was crucial to subsequent analysis. To avoid introducing bias, unpublished information was treated in the same way as published information. In addition to using multiple reviewers to ensure the reproducibility of the overview, sensitivity analyses around important or questionable judgements regarding the inclusion or exclusion of studies, the validity assessments and data extraction were performed.

Data synthesis

The number of adverse events reported in pregnant women and children was obtained for each intervention to compute a percentage of the total number of women and children in whom the occurrence of a particular adverse event or confirmation of its absence was reported.41 It is inappropriate to calculate adverse event rates from case studies; thus, a qualitative summary was undertaken. Quantitative adverse event rate calculations were restricted to series of women undergoing weight management interventions and weight change as identified from RCTs and observational studies, with and without controls (case series). The adverse events were quantified as RRs and 95% CIs. The point estimates of proportions and their 95% CIs are represented in forest plots to explore heterogeneity, and the possibility of the differences being due to chance was assessed statistically using Cochran's Q test.
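The percentage calculation described above, with a 95% CI around each point estimate, can be sketched as follows. The source does not say which interval method was used, so the Wilson score interval here is an assumption, and the function name is hypothetical.

```python
import math
from typing import Tuple


def adverse_event_rate(events: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion of participants with an adverse event and a Wilson 95% CI.

    `total` is the number of women or children in whom the event, or its
    confirmed absence, was reported. The Wilson method is an assumption;
    the source does not name the interval method.
    """
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need 0 <= events <= total and total > 0")
    p = events / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)
```

Multiplying the returned proportion by 100 gives the percentage referred to in the text; the interval bounds are what would be drawn for each study on a forest plot.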

Grading of evidence

The quality of the evidence was assessed and reported separately for each outcome following the GRADE methodology. This is because even within one review the quality of the evidence can vary between the outcomes. We defined quality of evidence as ‘the extent of confidence that an estimate of effect is correct’.42 The GRADE system classifies quality of evidence into one of four levels: high, moderate, low and very low (Table 4).

TABLE 4. Quality of evidence and definitions.

To assess the quality, we considered, first of all, the risk of bias (internal validity), that is, the extent to which the design, methods, execution and analysis were not controlled for bias in the assessment of effectiveness.30 Furthermore, we explored the (in)consistency of results (heterogeneity), (in)directness of the evidence (with respect to the question under consideration, including surrogate parameters), (im)precision of the results and publication bias. We assigned all evidence a ‘high’ level of quality when it was based on RCTs. If any of the reasons below applied to the body of evidence, for each comparison–outcome pair the quality level was downgraded by one level (if the reason was classified as serious) or two levels (if the reason was classified as very serious):

  • Risk of bias may arise from limitations in the study design and implementation. We downgraded evidence quality if there was lack of allocation concealment (selection bias), lack of blinding (performance bias), incomplete accounting of patients and outcome events (attrition bias), and other limitations affecting outcome assessment (detection bias).
  • Inconsistency referred to heterogeneity in results, which could arise from differences in populations, interventions or outcomes. Widely differing estimates of the effects across studies suggest that there might be true differences in underlying effect. When heterogeneity existed, but investigators failed to identify a plausible explanation, the quality of evidence was downgraded by one or two levels, depending on the magnitude of the inconsistency in the results.
  • Indirectness referred to broader or more restricted assessment of the review question components including population, intervention, comparator and outcomes.
  • Imprecision of results referred to wide 95% CIs as a result of few participants or few events. We downgraded the quality of evidence because of imprecision if there was a non-significant result or wide CIs.

We tabulated these features and assigned an overall quality grade to the evidence for each comparison–outcome pair. The footnotes in each table (e.g. Table 9) provide an explanation as to how we downgraded evidence in light of various deficiencies (Table 5).
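The downgrading rule described above (start at 'high' for RCT evidence, then drop one level for each serious concern and two for each very serious concern) can be sketched as follows. The function and parameter names are hypothetical, and the `start` parameter is included only because the text states a starting level for RCT evidence alone.

```python
from typing import Mapping

# The four GRADE levels named in the text, from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(downgrades: Mapping[str, int], start: str = "high") -> str:
    """Overall quality for one comparison-outcome pair.

    `downgrades` maps a domain (e.g. "risk of bias") to 0 (no concern),
    1 (serious) or 2 (very serious). The result never falls below "very low".
    """
    for domain, steps in downgrades.items():
        if steps not in (0, 1, 2):
            raise ValueError(f"{domain}: downgrade must be 0, 1 or 2")
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]
```

For instance, RCT evidence with serious risk of bias and serious imprecision would be rated `grade_quality({"risk of bias": 1, "imprecision": 1})`, which is 'low'.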

TABLE 5. Criteria for assessing risk of bias.

The secondary maternal and fetal outcomes critical to clinical care of the patient were prioritised by a two-round Delphi survey of clinicians. The Delphi panel of clinicians was chosen for their interest in the field. A structured list of these outcomes (Box 1) was sent to 20 clinicians along with a covering letter explaining the purpose of this survey. The questionnaire was sent by e-mail and anonymity was maintained between panellists. In the first round, the experts were asked to rank the outcomes for their importance on a 1–9 scale (1–3 not important; 4–6 important, but not critical; 7–9 critical). They were given the opportunity to add outcomes that were considered to be relevant but not included in the list. Summary statistics such as medians and interquartile ranges (IQRs) were generated for each outcome. The median was used to identify the location on the appropriateness scale and an IQR (i.e. a measure of dispersion generated by taking the difference between the 75th and the 25th percentiles) of ≤ 2 was predefined to indicate consensus. In the second round the experts were asked to reconsider their previous ratings in view of the panel score. The new median scores and IQRs were recalculated. The top 10 outcomes were identified for inclusion in the GRADE evidence profile in addition to the primary weight-related outcomes.
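The summary statistics and the consensus rule for one outcome can be sketched as follows. The function name is hypothetical, and two details are assumptions: the quartile method (Python's 'inclusive' interpolation), which the source does not specify, and the handling of non-integer medians, which are placed in a band by the thresholds 4 and 7.

```python
import statistics
from typing import Dict, Sequence, Union


def summarise_outcome(ratings: Sequence[int], consensus_iqr: float = 2.0) -> Dict[str, Union[float, str, bool]]:
    """Median, IQR, importance band and consensus flag for one outcome.

    Ratings are panellists' scores on the 1-9 scale. Consensus is
    predefined in the text as IQR <= 2.
    """
    if any(not 1 <= r <= 9 for r in ratings):
        raise ValueError("ratings must lie on the 1-9 scale")
    q1, _, q3 = statistics.quantiles(ratings, n=4, method="inclusive")
    median = statistics.median(ratings)
    iqr = q3 - q1  # difference between the 75th and 25th percentiles
    if median >= 7:
        band = "critical"
    elif median >= 4:
        band = "important, but not critical"
    else:
        band = "not important"
    return {"median": median, "iqr": iqr, "band": band, "consensus": iqr <= consensus_iqr}
```

Running this per outcome after each round gives the medians and IQRs that panellists would see before reconsidering their ratings.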

BOX 1. List of maternal and fetal outcomes relevant to patient care in the evaluation of weight management interventions in pregnancy.

  • Gestational diabetes mellitus
  • Pre-eclampsia/pregnancy-induced hypertension

The strength of evidence for each outcome was assessed. The main maternal and fetal weight-related outcomes and those prioritised by the Delphi panel were assessed by GRADE methodology using GRADEpro software version 3.2.2 [GRADEpro (computer program), version 3.2 for Windows; Jan Brozek, Andrew Oxman and Holger Schünemann, 2008]. Two reviewers independently assessed the quality of each study; disagreements were resolved by consensus or arbitration involving a third reviewer. For each comparison–outcome pair we deployed a two-dimensional chart plotting five variables represented on equiangular spokes starting from the same point, each spoke representing one of the domains used in evidence grading.43 These included study design, risk of bias, inconsistency, indirectness and imprecision. The data length of a spoke was proportional to the magnitude of the quality, ranging from high to moderate to low to very low. A line connected the data values for each spoke, generating a pentagon. The same position and angle of the spokes were used in all comparison–outcome pairs for easy visual interpretation in a multiplot format.
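The geometry of the five-spoke chart can be sketched without a plotting library. The numeric spoke lengths (1 to 4) and the orientation (first spoke pointing up, proceeding clockwise) are assumptions made for illustration; the text specifies only that spokes are equiangular, keep a fixed order, and have a length proportional to the quality level.

```python
import math
from typing import List, Mapping, Tuple

# Fixed spoke order so that every chart in a multiplot is comparable.
DOMAINS = ["study design", "risk of bias", "inconsistency", "indirectness", "imprecision"]

# Assumed numeric lengths for the four quality levels.
QUALITY_LENGTH = {"very low": 1, "low": 2, "moderate": 3, "high": 4}


def pentagon_vertices(ratings: Mapping[str, str]) -> List[Tuple[float, float]]:
    """Vertices of the pentagon for one comparison-outcome pair.

    `ratings` maps each domain to "high", "moderate", "low" or "very low".
    Spokes are 72 degrees apart; joining the returned points in order
    (and closing back to the first) draws the pentagon.
    """
    vertices = []
    for k, domain in enumerate(DOMAINS):
        angle = math.pi / 2 - 2 * math.pi * k / len(DOMAINS)
        length = QUALITY_LENGTH[ratings[domain]]
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices
```

Evidence rated 'high' in every domain yields a regular pentagon; a downgrade in one domain pulls the corresponding vertex towards the centre.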

© 2012, Crown Copyright.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK109458
