3.1. WHO guideline development process
These WHO guidelines were developed following the recommendations for standard guidelines as described in the WHO Handbook for Guideline Development (33), and the GRADE framework (34–37) (Box 3.1). A Guidelines Development Group was formed with representation from different geographical regions as well as from a wide range of stakeholders, including researchers, clinicians and programme managers, advocacy groups and members of organizations that represent persons living with chronic hepatitis. There was an initial scoping and planning process to formulate questions most relevant to LMICs and patient-important outcomes (see Web annex 4 for all PICO questions).
GRADE categories of the quality of evidence.
Key domains considered in determining the strength of recommendations.
Standard approach to rating the quality of evidence and strength of recommendations using the GRADE system.
3.2. Systematic reviews and additional background work
Systematic reviews on diagnostic performance. Systematic reviews and meta-analyses of the primary literature were commissioned externally to address the research questions and patient-important outcomes. For evaluation of HBV and HCV diagnostics and testing strategies, there was very limited or no evidence for patient-important outcomes. The Guidelines Development Group and PICO questions considered diagnostic accuracy (sensitivity, specificity, positive and negative predictive values) and in some cases analytical sensitivity (limit of detection) as surrogates for patient-important outcomes, assuming reasonable linkage and access to care. Search strategies and summaries of evidence are reported in Web annex 5. The glossary provides full definitions for diagnostic and analytical test performance.
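The diagnostic accuracy measures named above can be illustrated with a minimal sketch that derives them from a 2×2 table comparing an index test with a reference standard. The class name, function name and counts are hypothetical and are not taken from the systematic reviews.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the reference standard."""
    tp: int  # test positive, infection present
    fp: int  # test positive, infection absent
    fn: int  # test negative, infection present
    tn: int  # test negative, infection absent


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    return {
        "sensitivity": t.tp / (t.tp + t.fn),  # positives among the infected
        "specificity": t.tn / (t.tn + t.fp),  # negatives among the uninfected
        "ppv": t.tp / (t.tp + t.fp),          # infected among test positives
        "npv": t.tn / (t.tn + t.fn),          # uninfected among test negatives
    }


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=95, fp=18, fn=5, tn=882)
for name, value in accuracy_measures(example).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the assay itself, whereas the predictive values also depend on how common infection is in the group tested.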
As part of the guidelines development process, WHO commissioned further work to provide additional data in support of the recommendations. This work is listed below.
Existing systematic reviews on global and regional seroprevalence of HBsAg and HCV antibody in the general population and specific high-risk populations (Table 4.1).
Review of the cost–effectiveness literature of different viral hepatitis testing approaches in different settings. The evidence base for different testing approaches remains very limited, especially for impact on patient-important outcomes and in LMICs, and largely relies on observational data and modelling. The limited number of cost–effectiveness studies and the heterogeneity of study populations, testing approaches and outcomes measured precluded a formal systematic review and meta-analysis. A narrative review was therefore undertaken that included studies of: (i) focused or targeted testing of the highest-risk groups; (ii) routine testing among specific birth cohorts that are readily identified and have a high prevalence of HCV infection; and (iii) routine testing throughout the entire population, in different settings.
Predictive modelling of testing strategies (i.e. one- or two-test serological testing strategies). Very few studies directly compared the diagnostic accuracy of different testing strategies. A predictive modelling analysis was therefore carried out to examine the accuracy of a testing strategy across a range of assay performance characteristics (sensitivity and specificity) drawn from the systematic reviews, and across a hypothetical range of disease prevalence in the population (10%, 2% and 0.4%), representing high-, medium- and low-prevalence settings or populations (see Web annex 6).
Values and preferences survey of health-care workers and implementers for different testing strategies and approaches. A four-part online survey was conducted in September 2015, covering current practices and preferences for future HBV and HCV testing, including a test of HCV cure. Respondents included clinicians, patient organizations, civil society representatives, programme managers, policy-makers and pharmaceutical industry employees.
Feasibility survey on programmatic experiences and reported barriers and challenges to HBV and/or HCV testing, based on interviews with 22 respondents across 13 LMICs conducted between June and September 2015. The 33-question semi-structured questionnaire covered programme information (who is tested and where, what assays/algorithms are used, counselling and training, funding and costs of testing); protocol for hepatitis care and treatment; perceived barriers/challenges and solutions; and provision of relevant epidemiological data.
Case examples of different models of hepatitis testing practice in different settings and populations were also solicited and identified through a hepatitis testing innovation contest, to illustrate effective and acceptable ways to deliver facility- and community-based testing services, especially among the most affected populations.
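The predictive modelling of one- and two-test strategies listed above can be sketched as follows. This is an illustration under stated assumptions, not the model reported in Web annex 6. The assay sensitivity and specificity values are hypothetical, the two-test strategy is assumed to run a second assay only on samples reactive on the first, and the two assays are assumed to be conditionally independent. Only the three prevalence levels come from the text.

```python
# Prevalence levels named in the text: high, medium and low.
PREVALENCES = (0.10, 0.02, 0.004)


def predictive_values(sens, spec, prev):
    """Return (PPV, NPV) for a strategy with the given accuracy and prevalence."""
    tp = sens * prev
    fn = (1 - sens) * prev
    fp = (1 - spec) * (1 - prev)
    tn = spec * (1 - prev)
    return tp / (tp + fp), tn / (tn + fn)


def serial_two_test(sens1, spec1, sens2, spec2):
    """Combined (sensitivity, specificity) when a second assay is run only on
    first-assay positives and both must be positive to report a positive.
    Assumes the two assays err independently given true infection status."""
    sens = sens1 * sens2
    spec = spec1 + (1 - spec1) * spec2
    return sens, spec


if __name__ == "__main__":
    # Hypothetical assay characteristics, for illustration only.
    s1, c1 = 0.98, 0.98
    s2, c2 = 0.98, 0.98
    s12, c12 = serial_two_test(s1, c1, s2, c2)
    for prev in PREVALENCES:
        ppv1, npv1 = predictive_values(s1, c1, prev)
        ppv2, npv2 = predictive_values(s12, c12, prev)
        print(f"prevalence {prev:.1%}: one test PPV={ppv1:.3f} NPV={npv1:.4f}; "
              f"two tests PPV={ppv2:.3f} NPV={npv2:.4f}")
```

Under these assumptions, adding a second assay raises the positive predictive value most where prevalence is low, at the cost of some sensitivity.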
3.3. Grading of quality of evidence and strength of recommendations
The quality of the evidence was assessed and either rated down or rated up based on criteria specified in GRADE methods, modified for diagnostic tests and test strategies (38, 39). Summaries of the quality of evidence to address each outcome were entered in the GRADE profiler software (GRADEpro 3.6). The quality of evidence was categorized as high, moderate, low or very low (Box 3.1).
Specific issues with rating quality of evidence for studies of diagnostic accuracy and strategies
Diagnostic test accuracy. For evaluation of HBV and HCV diagnostics and testing strategies, there was very limited or no evidence on effects on patient-important outcomes. The Guidelines Development Group and PICO questions considered diagnostic accuracy (sensitivity, specificity, positive and negative predictive values) and in some cases analytical sensitivity (limit of detection) as surrogates for patient-important outcomes, assuming reasonable linkage and access to care.
Although observational studies of interventions start as low quality in GRADE, cross-sectional and cohort studies of diagnostic accuracy can provide reliable evidence (38), and were therefore initially categorized as high quality. Evidence was then rated down based on the presence of (i) risk of bias (using a tool designed for assessment of diagnostic accuracy studies, the QUADAS-2 tool) (40); (ii) inconsistency or heterogeneity; (iii) indirectness (addressing a different population than the one under consideration); or (iv) imprecision. However, evaluating inconsistency in studies of diagnostic accuracy is a challenge because methods to measure statistical heterogeneity are lacking and inconsistency is common, and therefore we did not downgrade for inconsistency.
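The rating-down logic described above can be sketched as simple bookkeeping. This is a hypothetical illustration, not the GRADEpro implementation. The function name and domain keys are assumptions. The number of levels deducted per domain (one for a serious concern, two for a very serious concern) follows general GRADE practice. The caller supplies only the domains that were actually counted toward downgrading.

```python
# GRADE quality categories, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_quality(downgrades, start="high"):
    """Return the final quality category.

    downgrades maps a domain name to the number of levels deducted
    (0 = no serious concern, 1 = serious, 2 = very serious).
    start is the initial category: "high" for diagnostic accuracy
    studies as described in the text, "low" for observational
    studies of interventions.
    """
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]  # cannot fall below "very low"


# Hypothetical evidence profile, for illustration only.
print(rate_quality({"risk_of_bias": 1, "imprecision": 1}))
```

A domain judged to raise no serious concern, or one not used for downgrading, is left out of the mapping or given a value of 0.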
Testing strategies. Clinical studies comparing different testing strategies and approaches were generally not available. The Guidelines Development Group therefore considered predictive modelling instead, to generate estimates of the diagnostic performance of different testing strategies. This type of evidence was not formally graded but was considered low quality because it is very indirect.
3.4. Formulation of recommendations
At the September 2015 meeting of the Guidelines Development Group, for each of the PICO questions (see Web annex 4), the results of the systematic reviews and the evidence profiles (see Web annexes 5 and 6) were presented and reviewed. Commissioned surveys of diagnostic costs and of the values and preferences of health-care workers and implementing partners regarding different testing strategies, and a global survey of programmatic experience, were also considered. Recommendations were then formulated based on the overall quality of the evidence, in addition to other considerations, including the balance between benefits and harms, values and preferences, feasibility and resource implications. The strength of the recommendations was rated as either strong (the panel was confident that the benefits of the intervention outweighed the risks) or conditional (the panel considered that the benefits of the intervention outweighed the risks, but the balance of benefits to harms and burdens was small or uncertain). The recommendations and their wording were then finalized by the entire Group. Implementation needs were subsequently evaluated, and areas and topics requiring further research identified.
For recommendations based on diagnostic accuracy, the Guidelines Development Group considered potential trade-offs between diagnostic accuracy and other factors. Although diagnostic accuracy was considered a critical outcome and a reasonable surrogate for patient outcomes, tests and testing strategies associated with slightly lower diagnostic accuracy could be recommended when associated with lower costs, increased testing access and linkage to care or greater feasibility.
3.5. Declaration and management of conflicts of interest
In accordance with WHO policy, all members of the Guidelines Development Group and peer reviewers were required to complete and submit a WHO Declaration of Interest form (including participation in consulting and advisory panels, research support and financial investment) and, where appropriate, also provide a summary of research interests and activities. The WHO Secretariat then reviewed and assessed the declarations submitted by each member and, at the September 2015 meeting of the Guidelines Development Group, presented a summary to the Group (see Web annex 7). The WHO Secretariat stated that there had been a transparent declaration of financial and academic interests, and concluded that there were no conflicts that required exclusion of any member from actively taking part in formulating the recommendations during the meeting. For the peer review group, the WHO Secretariat was also satisfied that no case necessitated exclusion from the review process.
3.6. Updating, disseminating and monitoring implementation of the guidelines
The guidelines are accessible on the WHO website with links to other related websites, and translated into the official United Nations (UN) languages. WHO disseminates the guidelines to ministries of health in countries, as well as key international, regional and national collaborating partners (e.g. civil society, foundations, donors).
Successful implementation of these guidelines will be assessed by the number of countries that incorporate the contents into national hepatitis plans and guidelines. The impact of the testing guidelines will be measured by monitoring the number of persons tested and treated for chronic hepatitis B and hepatitis C infection, in accordance with targets proposed in the WHO Global health sector strategy on viral hepatitis 2016–2021 (16) (see Web annex 1). The Guidelines Development Group recognized that the field of hepatitis diagnostics and testing is evolving rapidly, and it is anticipated that there will be a need for periodic updates.