NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Hall PS, Mitchell ED, Smith AF, et al. The future for diagnostic tests of acute kidney injury in critical care: evidence synthesis, care pathway analysis and research prioritisation. Southampton (UK): NIHR Journals Library; 2018 May. (Health Technology Assessment, No. 22.32.)
Introduction
In this chapter a meta-analysis of diagnostic accuracy studies is provided to evaluate the body of evidence available for three diagnostic tests of AKI and to provide input into the decision analysis in subsequent chapters. The primary health-care setting considered in the use of these tests was the critical care unit and the secondary health-care setting considered was cardiac surgery (pre and/or post intervention). The three diagnostic tests considered were the Nephrocheck test (which uses a combination of two proteins: TIMP-2 and IGFBP-7) and the biomarkers NGAL and cystatin C. The NGAL and cystatin C tests have been used in this setting for measurement of their concentration in samples of blood serum, blood plasma and urine and these media were considered separately. The pooled estimates of sensitivity and specificity and their variance from the meta-analyses directly informed the decision analysis in Chapter 5. In our searches we identified no previous reviews or meta-analyses considering Nephrocheck or cystatin C in this setting, but we did identify two relevant reviews of NGAL.40,245
Methods
Primary objective
The primary objective was to estimate pooled means and variances of sensitivity and specificity for each of the diagnostic tests considered. When appropriate data were available, these estimates were obtained separately for each of the health-care settings considered and for each of the sample media considered. When only one study was available, no meta-analysis was undertaken.
Identification of studies
Details on the search strategies used to identify studies and the process for study screening and evaluation, data extraction and quality assessment are presented in Chapter 2.
Full papers were retrieved for studies of patients of all ages (< 18 years, ≥ 18 years) in which AKI diagnosis had been evaluated using one or more of the three diagnostic tests considered (Nephrocheck, NGAL, cystatin C), in any of the sample media considered (blood serum, blood plasma or urine), in either of the health-care settings considered (critical care unit or cardiac surgery). Studies were excluded if they were not primarily located in either of these health-care settings.
Study methods
The gold standard for determining AKI diagnosis was defined as diagnosis according to the RIFLE,241 AKIN242 or KDIGO (Kidney Disease: Improving Global Outcomes)238 diagnostic and classification system, based on an assessment of serum creatinine levels and urine output (see Chapter 2).
Outcome measurements
The primary outcomes were sensitivity (the probability of the test being positive given that the true diagnosis is positive) and specificity (the probability of the test being negative given that the true diagnosis is negative), which are determined by comparison of the results of the experimental diagnostic test with the results of the gold standard method used in the study. Studies were excluded if the gold standard method used to determine the outcome was not described in sufficient detail. Studies were not excluded if the cut-off point used to assess the positive and negative status of the outcome in the experimental diagnostic test was not reported.
Diagnostic and staging systems for acute kidney injury
A number of diagnostic and staging systems for AKI have been used in diagnostic accuracy studies. The most commonly used make use of repeated serum creatinine measurements and measurement of urine output to diagnose and stage AKI. Three commonly used systems are the RIFLE, AKIN and KDIGO systems.
RIFLE classification of acute kidney injury
The Acute Dialysis Outcome Initiative group proposed the RIFLE classification, which defines five categories of AKI,241 as shown in Table 8. AKI is staged for severity according to the criteria listed in Table 8, with any classification of risk or above being a diagnosis of AKI.
Acute Kidney Injury Network classification of acute kidney injury
The AKIN has defined diagnostic criteria for AKI and provided a staging system for the severity of AKI,242 as shown in Table 8. AKI is staged for severity according to the criteria listed in Table 8, with any classification of stage 1 or above being a diagnosis of AKI. In contrast to the RIFLE criteria, the AKIN criteria include an absolute change in serum creatinine, and AKI is defined as an abrupt (within 48 hours) reduction in kidney function that meets the criteria for stage 1 or above.
KDIGO classification of acute kidney injury
The 2011 KDIGO Clinical Practice Guideline for AKI (Summary of Recommendation Statements, 2012)246 defined diagnostic criteria for AKI and provided a staging system for the severity of AKI, as shown in Table 8. AKI is staged for severity according to the criteria listed in Table 8, with any classification of stage 1 or above being a diagnosis of AKI. This classification system uses the same time frame for absolute changes as the AKIN criteria and clarifies that for the relative changes the baseline values should be known or presumed to have occurred within the previous 7 days.
Summary of staging methods
There is a similarity between the staging and diagnostic criteria proposed for AKI, which is demonstrated in Table 10. It has been shown that the AKIN criteria can correctly diagnose more patients with AKI than the RIFLE criteria (not unexpected given the additional criterion of an absolute change in serum creatinine level), but they have not been shown to have a better predictive ability for in-hospital mortality.247 It has also been shown that the AKIN criteria do not improve the sensitivity of AKI diagnosis compared with the RIFLE criteria in the first 24 hours after admission to the critical care unit.248 Similarly, it has been shown that a higher incidence of AKI can be diagnosed using the KDIGO criteria than using the RIFLE criteria and that the KDIGO criteria are more predictive for in-hospital mortality, but there was no significant difference between the AKIN criteria and the KDIGO criteria.249 Other studies have suggested that the RIFLE, AKIN and KDIGO criteria are good tools for predicting mortality in critically ill patients and observed no evidence of a difference between them.250
Based on the definitions used in the different diagnostic and staging/classification systems and the evidence above we believe that there are broad similarities between the RIFLE, AKIN and KDIGO criteria and, for the purposes of this study, we defined a diagnosis of AKI, following the KDIGO criteria, as any of the following:
- increase in serum creatinine of ≥ 0.3 mg/dl (≥ 26.5 µmol/l) within 48 hours
- increase in serum creatinine to ≥ 1.5 × baseline, which is known or presumed to have occurred within the previous 7 days
- urine volume < 0.5 ml/kg/hour for at least 6 hours.
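The three criteria listed above can be read as a simple "any of" rule. The following is a minimal sketch of that rule, not code used in the study; the function name and parameter names are hypothetical, and it assumes that the caller has already derived the creatinine change within 48 hours, the ratio to a baseline from the previous 7 days, and a urine output rate sustained over at least 6 hours.

```python
from typing import Optional


def meets_aki_definition(
    creatinine_rise_48h_mg_dl: Optional[float] = None,
    creatinine_ratio_to_baseline_7d: Optional[float] = None,
    urine_ml_per_kg_per_hour_over_6h: Optional[float] = None,
) -> bool:
    """Return True if any one of the three criteria listed above is met.

    All three arguments are hypothetical, pre-computed inputs; a value of
    None means the measurement is unavailable and that criterion is skipped.
    """
    # Criterion 1: absolute rise in serum creatinine of >= 0.3 mg/dl in 48 hours.
    absolute_rise = (
        creatinine_rise_48h_mg_dl is not None
        and creatinine_rise_48h_mg_dl >= 0.3
    )
    # Criterion 2: serum creatinine >= 1.5 x baseline within the previous 7 days.
    relative_rise = (
        creatinine_ratio_to_baseline_7d is not None
        and creatinine_ratio_to_baseline_7d >= 1.5
    )
    # Criterion 3: urine volume < 0.5 ml/kg/hour for at least 6 hours.
    low_urine_output = (
        urine_ml_per_kg_per_hour_over_6h is not None
        and urine_ml_per_kg_per_hour_over_6h < 0.5
    )
    return absolute_rise or relative_rise or low_urine_output
```

Treating a missing measurement as "criterion not met" is a design choice of this sketch only; how missing creatinine or urine data were handled varied by study.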
Furthermore, in the studies identified for inclusion in the meta-analysis, studies that used either of the outcomes indicated by the shaded areas in Table 10 (RIFLE R, AKIN 1 or KDIGO 1 – a diagnostic-type outcome; RIFLE F, AKIN 3, KDIGO 3 or RRT – a failure-type outcome) were considered homogeneous for the purposes of the meta-analysis.
Key data extracted
The primary data extracted for inclusion in the meta-analysis are shown in Table 11. It is recommended in the STARD statement that a cross-tabulation of the index test results by the results of the reference standard is included in any study report,251,252 but it was anticipated that this information would not be present in all study reports. In this situation the elements of the confusion matrix were calculated using information describing the diagnostic outcomes and estimates of sensitivity and specificity. For example, if the sensitivity (s) and number of true diagnoses [given by the sum of the number of true positives (TPs) and the number of false negatives (FNs), i.e. (TP + FN)] were reported in the study then the number of TPs could be calculated as s × (TP + FN). A similar calculation for specificity (p) allowed the estimation of the number of true negatives (TNs): p × (FP + TN), where FP represents the number of false positives. Finally, given these estimates for TP and TN and the numbers of true outcomes [(TP + FN) and (FP + TN)], simple subtraction provided estimates for FN and FP.
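The back-calculation described above can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the input numbers in the usage note are invented, and rounding to the nearest whole patient is an assumption, as the text does not say how fractional counts were handled.

```python
def reconstruct_confusion_matrix(sensitivity, specificity, n_with_aki, n_without_aki):
    """Recover TP, FN, FP and TN from reported summary figures.

    n_with_aki    = TP + FN (patients with a diagnosis of AKI)
    n_without_aki = FP + TN (patients without a diagnosis of AKI)
    """
    # TP = s x (TP + FN), rounded to a whole patient (assumption of this sketch).
    tp = round(sensitivity * n_with_aki)
    # TN = p x (FP + TN), rounded in the same way.
    tn = round(specificity * n_without_aki)
    # The remaining two cells follow by subtraction from the group totals.
    fn = n_with_aki - tp
    fp = n_without_aki - tn
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}
```

For example, with hypothetical values of s = 0.80 and p = 0.75 in a study of 50 patients with AKI and 200 without, the sketch gives TP = 40, FN = 10, TN = 150 and FP = 50.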
Study exclusion
Studies were excluded from the meta-analysis if it was not possible to estimate values for the elements of the confusion matrix or if other key data could not be extracted. Further reasons for the exclusion of studies were if diagnosis was carried out in the emergency department rather than in the critical care unit and if the biomarker was measured on a relative scale rather than an absolute scale, for example unit of biomarker per unit of serum creatinine.
Data analysis
Simple diagnostic accuracy summaries [sensitivity, specificity and the diagnostic odds ratio (DOR) and its components – positive likelihood ratio (LR+) and negative likelihood ratio (LR–)] were produced for each study included in the meta-analysis. The sensitivity of a diagnostic test (T) is defined formally as the probability that the test will give a positive result (T+) if the patient has the disease (D+), in this case AKI. This is often referred to as the TP rate for a diagnostic test and can be expressed as a conditional probability:
$$\text{sensitivity} = P(T^{+} \mid D^{+}) = \frac{TP}{TP + FN}$$
The specificity of a diagnostic test is the probability that the test will give a negative result (T–) if the patient does not have the disease (D–), which is equivalent to 1 minus the FP rate for the test and can be expressed as the conditional probability:
$$\text{specificity} = P(T^{-} \mid D^{-}) = \frac{TN}{TN + FP}$$
Confidence intervals (CIs) were estimated for sensitivity and specificity based on the Wilson score interval method.253
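As an illustration of the Wilson score interval for a single proportion such as sensitivity or specificity, a minimal sketch is given below. The function name is hypothetical and the code is not taken from the study, which reports only that this method was used.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    successes / n is the observed proportion (e.g. TP / (TP + FN) for
    sensitivity); z defaults to the normal quantile for a 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denominator = 1 + z ** 2 / n
    # The centre is pulled towards 0.5 relative to the raw proportion.
    centre = (p_hat + z ** 2 / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denominator
    )
    return centre - half_width, centre + half_width
```

Unlike the simple normal-approximation interval, the Wilson interval always stays within 0 and 1, which matters for the small studies and extreme proportions that occur in this kind of review.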
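The LR+ of a diagnostic test is the probability of a patient with disease (D+) having a positive test result divided by the probability of a patient without disease (D–) having a positive test result:
$$LR^{+} = \frac{P(T^{+} \mid D^{+})}{P(T^{+} \mid D^{-})} = \frac{\text{sensitivity}}{1 - \text{specificity}}$$
Similarly, the LR– of a diagnostic test is the probability of a patient with disease having a negative test result divided by the probability of a patient without disease having a negative test result:
$$LR^{-} = \frac{P(T^{-} \mid D^{+})}{P(T^{-} \mid D^{-})} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$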
Confidence intervals for the LR+ and LR– were estimated using the method of Koopman.254
The DOR for a test is the ratio of the odds of a positive test result for a patient with disease relative to the odds of a positive test result for a patient without disease:
$$DOR = \frac{TP / FN}{FP / TN} = \frac{LR^{+}}{LR^{-}}$$
Confidence intervals for log(DOR) were estimated based on the assumption that, as for any odds ratio, log(DOR) is approximately normally distributed. Interval estimates for DOR were then obtained by back-transformation.
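The likelihood ratios and the DOR defined above can be computed from the four cells of the confusion matrix as sketched below. The function name is hypothetical. The standard error used for log(DOR) is the usual large-sample formula for a log odds ratio; that the study used exactly this formula is an assumption. The Koopman intervals for LR+ and LR– are not reproduced here, and no continuity correction is applied, so all four cells must be non-zero.

```python
import math


def accuracy_summaries(tp, fn, fp, tn, z=1.959963984540054):
    """Point estimates of LR+, LR- and DOR, with a CI for DOR."""
    if min(tp, fn, fp, tn) <= 0:
        raise ValueError("all four cells must be positive in this sketch")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)
    # Large-sample standard error of a log odds ratio (assumed, see lead-in).
    se_log_dor = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    # The interval is built on the log scale and then back-transformed.
    dor_ci = (
        math.exp(math.log(dor) - z * se_log_dor),
        math.exp(math.log(dor) + z * se_log_dor),
    )
    return {"LR+": lr_pos, "LR-": lr_neg, "DOR": dor, "DOR_CI": dor_ci}
```

Because the interval is symmetric on the log scale, the back-transformed interval is asymmetric around the DOR point estimate.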
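The method of meta-analysis for diagnostic accuracy studies used here was the bivariate meta-analysis proposed by Reitsma et al.,255 based on the methodology of van Houwelingen et al.256 Briefly, if logit sensitivity (µsi) and logit specificity (µpi) are
$$\mu_{si} = \log\left(\frac{s_i}{1 - s_i}\right)$$
and
$$\mu_{pi} = \log\left(\frac{p_i}{1 - p_i}\right)$$
for each study i (with k studies included in the meta-analysis), the true logit sensitivity and logit specificity are then assumed to have a bivariate normal distribution across studies:
$$\begin{pmatrix} \mu_{si} \\ \mu_{pi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_{s} \\ \mu_{p} \end{pmatrix}, \Sigma \right), \qquad \Sigma = \begin{pmatrix} \sigma_{s}^{2} & \sigma_{sp} \\ \sigma_{sp} & \sigma_{p}^{2} \end{pmatrix}$$
where $\sigma_{sp}$ is the covariance between logit sensitivity and logit specificity. This model is extended by incorporating the variability due to sampling through the variance of logit sensitivity ($\sigma_{si}^{2}$) and logit specificity ($\sigma_{pi}^{2}$), as measured in each study:
$$\sigma_{si}^{2} = \frac{1}{n_{si}\, s_i (1 - s_i)}$$
and
$$\sigma_{pi}^{2} = \frac{1}{n_{pi}\, p_i (1 - p_i)}$$
where $n_{si} = TP_i + FN_i$ and $n_{pi} = FP_i + TN_i$, assuming that 0 < s < 1 and 0 < p < 1 and that the number of subjects used to estimate sensitivity and specificity is large.256 The final model is then a bivariate random-effects model of the form:
$$\begin{pmatrix} \hat{\mu}_{si} \\ \hat{\mu}_{pi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_{s} \\ \mu_{p} \end{pmatrix}, \Sigma + C_i \right), \qquad C_i = \begin{pmatrix} \sigma_{si}^{2} & 0 \\ 0 & \sigma_{pi}^{2} \end{pmatrix}$$
where the hat denotes the logit sensitivity and logit specificity observed in study i.
This model was estimated by likelihood-based methods using the mada package257 in the R Environment for Statistical Computing (The R Foundation for Statistical Computing, Vienna, Austria). It has been shown that this method is equivalent to the hierarchical regression meta-analysis proposed and further developed by Rutter and Gatsonis when there are no study-level covariates.258–260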
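To make the study-level inputs to the bivariate model concrete, the sketch below computes the logit-transformed sensitivity and specificity of one study together with their within-study sampling variances on the logit scale. It does not fit the random-effects model itself, which the study did in R; the function names are hypothetical, and the variance formula 1/(n × proportion × (1 − proportion)) is the large-sample approximation described by Reitsma et al.

```python
import math


def logit(x):
    """Log odds of a proportion strictly between 0 and 1."""
    return math.log(x / (1 - x))


def study_level_inputs(tp, fn, fp, tn):
    """Logit sensitivity/specificity and their sampling variances for one study."""
    n_with_disease = tp + fn
    n_without_disease = fp + tn
    s = tp / n_with_disease       # observed sensitivity
    p = tn / n_without_disease    # observed specificity
    if not (0 < s < 1 and 0 < p < 1):
        raise ValueError("sensitivity and specificity must lie strictly in (0, 1)")
    return {
        "logit_sens": logit(s),
        "logit_spec": logit(p),
        # Algebraically equal to 1/TP + 1/FN and 1/FP + 1/TN respectively.
        "var_logit_sens": 1 / (n_with_disease * s * (1 - s)),
        "var_logit_spec": 1 / (n_without_disease * p * (1 - p)),
    }
```

The two variances form the diagonal of the within-study covariance matrix for that study; the between-study covariance matrix and the mean logits are what the likelihood-based fit estimates.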
Separate meta-analyses were conducted for each diagnostic test, sample media and health service setting. Pooled estimates of sensitivity, specificity, LR+, LR– and DOR can be estimated from back-transformed parameter estimates. Estimates from each study and pooled estimates from the meta-analysis are presented in forest plots. A summary receiver operating characteristic (SROC) curve was estimated, with estimates of the confidence and prediction region.255,259 Approximate estimates of the variance of sensitivity and specificity for use in the economic model were determined using the delta method.261
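The delta method step can be illustrated as follows. This is a sketch under stated assumptions rather than the study's own calculation: it assumes that the pooled estimate and its standard error are available on the logit scale and uses the first-order approximation Var(s) ≈ [s(1 − s)]² × Var(logit s). The function names and the input values in the usage note are hypothetical.

```python
import math


def expit(x):
    """Inverse of the logit transform."""
    return 1 / (1 + math.exp(-x))


def delta_method_variance(logit_mean, logit_se):
    """Back-transform a pooled logit estimate and approximate its variance.

    Returns the estimate on the probability scale and its approximate
    variance on that scale.
    """
    probability = expit(logit_mean)
    # Derivative of expit with respect to its argument: s * (1 - s).
    derivative = probability * (1 - probability)
    variance = (derivative ** 2) * (logit_se ** 2)
    return probability, variance
```

For example, a hypothetical pooled logit sensitivity of 0 with a standard error of 0.2 corresponds to a sensitivity of 0.5 with an approximate variance of 0.0025. A mean and variance of this kind are the sort of quantities an economic model would need in order to parameterise a distribution for sensitivity or specificity.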
Tests of heterogeneity were not used, as such statistical methods (Cochran’s Q, I²) do not account for heterogeneity explained by phenomena such as positivity threshold effects and are not recommended by the Cochrane Diagnostic Test Accuracy Group.262 Estimating the prediction region in the SROC curve is one way of examining the extent of heterogeneity by depicting a region within which, assuming that the model is correct, we have 95% confidence that the true sensitivity and specificity of a future study would lie.260
Results
Papers selected for inclusion in the meta-analysis are described briefly in tabular summaries followed by a summary of diagnostic accuracy for each study. Pooled estimates of sensitivity and specificity and the SROC curve are also provided for each diagnostic test, health-care setting and sample type. A Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram that depicts the flow of information through the different phases of the systematic review to data extraction and final inclusion in the meta-analysis is shown in Chapter 2 (see Figure 2).
Nephrocheck
Critical care unit: plasma and serum
The searches identified no studies suitable for data extraction for this diagnostic test in either health-care setting for either plasma or serum.
Critical care unit: urine
Summaries of the baseline characteristics and test parameters for the included urinary Nephrocheck studies are shown in Table 12. Three studies were included,33,44,45 with a total of 1289 patients [199 patients (15.4%) with a diagnosis of AKI and 1090 patients (84.6%) without a diagnosis of AKI]. The sample for use in the test was taken on enrolment in all of the included studies. The outcome used to define the presence of AKI was consistent across each of the three studies (KDIGO stage 2 or 3). Similarly, the threshold used to define a positive test was consistent in all included studies [(TIMP-2) × (IGFBP-7) = 0.3]. Diagnostic accuracy summaries for the included studies are shown in Table 13.
Figure 3 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for Nephrocheck in the critical care unit using patient urine samples. The pooled sensitivity estimate was 0.90 (95% CI 0.85 to 0.93) and the pooled specificity estimate was 0.49 (95% CI 0.46 to 0.53). Figure 4 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The prediction and confidence regions are small, suggesting limited heterogeneity. This is to be expected given that these studies used the same outcome definition and the same test threshold.
Cardiac surgery: urine
One study including 50 patients [26 patients (52.0%) with a diagnosis of AKI and 24 patients (48.0%) without a diagnosis of AKI] was identified for the use of Nephrocheck in the cardiac surgery setting using urine samples.54 A summary of the baseline characteristics and test parameters for the included study is shown in Table 14. The outcome used to define a diagnosis of AKI was a RIFLE classification of ≥ R within 72 hours of surgery. The threshold used to define a positive test was whether the maximum (TIMP-2) × (IGFBP-7) value in the 24 hours post cardiopulmonary bypass (CPB) was > 0.3. A diagnostic accuracy summary for the included study is shown in Table 15. No meta-analysis was performed for this single study.
Neutrophil gelatinase-associated lipocalin
Critical care unit: plasma
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 16. Eight studies were included,35,43,48,49,51,59,66,100 with a total of 1670 patients [381 patients (22.8%) with a diagnosis of AKI and 1289 patients (77.2%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies, with six studies35,43,49,59,66,100 using a diagnostic outcome and two studies48,51 using a failure-type outcome. There was also heterogeneity in the time at which the outcome assessment occurred (unclear, 48 hours and 7 days). Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 242 pg/ml49 to 558 ng/ml.43 Diagnostic accuracy summaries for the included studies are provided in Table 17.
Figure 5 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for NGAL in the critical care unit using patient plasma samples. The pooled sensitivity estimate was 0.72 (95% CI 0.65 to 0.79) and the pooled specificity estimate was 0.81 (95% CI 0.75 to 0.86). Figure 6 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The prediction region shows that there is a degree of heterogeneity in sensitivity and specificity, probably reflecting the variability in outcome measures and cut-off points, as well as other unidentified sources of heterogeneity.
Critical care unit: serum
One study including 150 patients [43 patients (28.7%) with a diagnosis of AKI and 107 patients (71.3%) without a diagnosis of AKI] was identified for NGAL in the critical care unit setting using serum.34 A summary of the baseline characteristics and test parameters for the included study is provided in Table 18. A diagnostic-type outcome was used to define the presence of AKI (AKIN stage 1 or above within 48 hours of admission). The threshold used to define a positive test was 110 ng/ml. A diagnostic accuracy summary for the included study is provided in Table 19. No meta-analysis was performed for this single study.
Critical care unit: urine
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 20. Six studies were included,32,34,35,43,48,100 with a total of 1194 patients [283 patients (23.7%) with a diagnosis of AKI and 911 patients (76.3%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies. Of the studies included, five had a similar end point,32,34,35,43,48 being at least the least severe stage of the RIFLE, AKIN or KDIGO classification system. There was heterogeneity in the time up to which the outcome assessment occurred (from 28 hours up to the end of the hospital stay). Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 29.5 ng/ml32 to 1310 ng/ml.100 Diagnostic accuracy summaries for the included studies are shown in Table 21.
Figure 7 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for NGAL in the critical care unit using patient urine samples. The pooled sensitivity estimate was 0.70 (95% CI 0.59 to 0.80) and the pooled specificity estimate was 0.79 (95% CI 0.71 to 0.86). Figure 8 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The prediction region shows that there is a degree of heterogeneity in sensitivity and specificity, probably reflecting the variability in outcome measures and cut-off points, as well as other unidentified sources of heterogeneity.
Cardiac surgery: plasma
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 22. Eight studies were included,31,40,47,60–63,67 with a total of 2644 patients [286 patients (10.8%) with a diagnosis of AKI and 2358 patients (89.2%) without a diagnosis of AKI]. The outcome used to define a diagnosis of AKI was largely consistent across the studies, although there was some heterogeneity in the time period in which the outcome was assessed. However, one study used an end point that could not easily be mapped to the considered criteria.67 Each of these could be considered to be somewhere between the two least severe categories of the RIFLE, AKIN or KDIGO classification system. Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 150 ng/ml31 to 426 ng/ml.67 Diagnostic accuracy summaries for the included studies are shown in Table 23.
Figure 9 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for NGAL in the cardiac surgery setting using patient plasma samples. The pooled sensitivity estimate was 0.62 (95% CI 0.49 to 0.74) and the pooled specificity estimate was 0.78 (95% CI 0.75 to 0.81). Figure 10 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The prediction region suggests limited heterogeneity when considering specificity, with far greater heterogeneity when considering sensitivity, confirming the observations that can be made from the forest plot (see Figure 9).
Cardiac surgery: serum
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 24. Two studies were included,38,68 with a total of 239 patients [53 patients (22.2%) with a diagnosis of AKI and 186 patients (77.8%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies, with one study38 using a less stringent version of the AKIN stage 1 classification without justification. Similarly, the threshold used to define a positive test was not consistent, being 0.62 ng/ml in one study38 and 133.7 ng/ml in the other study.68 Diagnostic accuracy summaries for the included studies are shown in Table 25.
Figure 11 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for NGAL in the cardiac surgery setting using patient serum samples. The pooled sensitivity estimate was 0.84 (95% CI 0.43 to 0.97) and the pooled specificity estimate was 0.87 (95% CI 0.59 to 0.97). Figure 12 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. There appears to be considerable heterogeneity in both sensitivity and specificity.
Cardiac surgery: urine
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 26. Thirteen studies were included,41,50,52,53,56,58,60,64,65,67,69,70,72 with a total of 3226 patients [444 patients (13.8%) with a diagnosis of AKI and 2782 patients (86.2%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was largely consistent across the studies, with one study67 using a non-standard definition and one study41 using only the serum creatinine assessment tool of the AKIN criteria. However, there was heterogeneity in the time to outcome assessment across the studies. The threshold used to define a positive test was not consistent across all of the studies, with some studies using raw concentration values and others using concentrations normalised by units of urine creatinine. Diagnostic accuracy summaries for the included studies are shown in Table 27.
Figure 13 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for NGAL in the cardiac surgery setting using patient urine samples. The pooled sensitivity estimate was 0.66 (95% CI 0.54 to 0.76) and the pooled specificity estimate was 0.62 (95% CI 0.41 to 0.79). Figure 14 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The prediction region covers almost all of the SROC space, suggesting that there is considerable heterogeneity between the studies included in this meta-analysis.
Cystatin C
Critical care unit: plasma
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 28. Three studies were included,32,48,49 with a total of 362 patients [140 patients (38.7%) with a diagnosis of AKI and 222 patients (61.3%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies. All of the studies can be considered to have used a similar end point, but there was heterogeneity in the time up to which the outcome assessment occurred (from 7 days post study entry to any time during the hospital stay). Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 1040 ng/ml48 to 1500 ng/ml.32 Diagnostic accuracy summaries for the included studies are shown in Table 29.
Figure 15 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for cystatin C in the critical care unit using patient plasma samples. The pooled sensitivity estimate was 0.72 (95% CI 0.59 to 0.82) and the pooled specificity estimate was 0.74 (95% CI 0.65 to 0.81). Figure 16 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. Examining the forest plot and prediction region suggests that there is greater heterogeneity in sensitivity than in specificity.
Critical care unit: serum
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 30. Four studies were included,34,42,46,71 with a total of 372 patients [110 patients (29.6%) with a diagnosis of AKI and 262 patients (70.4%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies, with one study using a definition that was less serious than the least serious stage of the RIFLE, AKIN or KDIGO classification system46 and one study basing the outcome on continuous urine collection.71 Similarly, the threshold used to define a positive test was not consistent across all of the studies, ranging from absolute values of 1200 ng/ml46 to 1800 ng/ml34 and with other patient-specific relative thresholds used. Diagnostic accuracy summaries for the included studies are shown in Table 31.
Figure 17 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for cystatin C in the critical care unit using patient serum samples. The pooled sensitivity estimate was 0.76 (95% CI 0.57 to 0.88) and the pooled specificity estimate was 0.91 (95% CI 0.85 to 0.95). Figure 18 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region, which again shows greater heterogeneity in sensitivity than in specificity.
Critical care unit: urine
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 32. Three studies were included,32,34,175 with a total of 745 patients [231 patients (31.0%) with a diagnosis of AKI and 514 patients (69.0%) without a diagnosis of AKI]. The definition of AKI used was fairly consistent across the studies, but the time period within which the outcome was assessed varied from 48 hours to the entire length of the hospital stay. Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 106 ng/ml32 to 200 ng/ml.34 Diagnostic accuracy summaries for the included studies are shown in Table 33.
Figure 19 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for cystatin C in the critical care unit using patient urine samples. The pooled sensitivity estimate was 0.68 (95% CI 0.43 to 0.86) and the pooled specificity estimate was 0.76 (95% CI 0.62 to 0.86). Figure 20 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. Heterogeneity appears to be considerable and greater in terms of sensitivity than specificity.
Cardiac surgery: plasma
The searches identified no studies suitable for data extraction for this diagnostic test, setting and sample type.
Cardiac surgery: serum
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 34. Five studies were included,31,38,39,64,68 with a total of 532 patients [147 patients (27.6%) with a diagnosis of AKI and 385 patients (72.4%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was not consistent across the studies, with one study38 using a definition that was less serious than the least serious stage of the RIFLE, AKIN or KDIGO classification system17 and one study basing the outcome on continuous urine collection (from 48 hours up to 4 days post study entry). Similarly, the threshold used to define a positive test was not consistent across the studies, ranging from 0.0265 ng/ml (26.5 pg/ml) to 1100 ng/ml. Diagnostic accuracy summaries for the included studies are shown in Table 35.
Figure 21 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for cystatin C in the cardiac surgery setting using patient serum samples. The pooled sensitivity estimate was 0.73 (95% CI 0.65 to 0.80) and the pooled specificity estimate was 0.72 (95% CI 0.63 to 0.79). Figure 22 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. The studies show limited evidence of heterogeneity, with a greater degree of heterogeneity with respect to specificity than sensitivity.
Cardiac surgery: urine
Summaries of the baseline characteristics and test parameters for the included studies are shown in Table 36. Two studies were included,50,69 with a total of 908 patients [131 patients (14.4%) with a diagnosis of AKI and 777 patients (85.6%) without a diagnosis of AKI]. The outcome used to define the presence of AKI was similar in both studies, although the time frame over which the outcome was assessed varied by 24 hours. The threshold used to define a positive test was not consistent across the studies, with one study69 not clearly reporting the threshold used. Diagnostic accuracy summaries for the included studies are shown in Table 37.
Figure 23 shows point estimates of the sensitivity and specificity from individual studies and the pooled estimates for cystatin C in the cardiac surgery setting using patient urine samples. The pooled sensitivity estimate was 0.52 (95% CI 0.27 to 0.76) and the pooled specificity estimate was 0.72 (95% CI 0.36 to 0.92). Figure 24 shows an estimate of the SROC curve with the 95% confidence region and 95% prediction region. Examining the forest plot and prediction region in the SROC curve suggests that there is considerable heterogeneity between the studies in terms of sensitivity and specificity.
Limitations
A limitation of this work is the use of a number of similar criteria for diagnosing AKI based on the measurement of serum creatinine and urine output rather than a more direct and accurate determination of kidney function and injury. Criteria based on serum creatinine lack real-time sensitivity for kidney injury, as creatinine concentration has a slow rate of change and is affected by other factors such as sex and muscle mass. Current standard AKI criteria based on serum creatinine and urine output measures therefore represent an imperfect reference test for the early detection of AKI. Each of the studies considered in these meta-analyses used criteria based on changes in serum creatinine concentrations and is therefore affected equally by this limitation.
The method of meta-analysis used in this analysis is recommended by the Cochrane Screening and Diagnostic Tests Methods Group.262 However, a similar method in the Bayesian paradigm, which has been shown to be equivalent in simple cases, may have been a reasonable alternative.259 Furthermore, this fully Bayesian approach has the advantage of potentially unifying the sensitivity analyses (assessment of uncertainty in the estimates in decision analysis) with the modelling step and of allowing predictions of test accuracies in future trials through a posterior predictive distribution. The hierarchical model approach may also have been flexible enough to include further modelling aspects related to analytical and biological variance of the diagnostic tests considered in this work, if enough studies were included to reasonably estimate models. This is not possible using the simple bivariate random-effects meta-analysis method and is a possible limitation of this work. Further work will consider in more detail the possibility of extending the hierarchical model to include these aspects and investigate whether the hierarchical model can be estimated in meta-analyses of this size. An extension of this model that allows for imperfect reference standards may also be worthy of further investigation.263
There is evidence of considerable heterogeneity between studies in some of the meta-analyses, which is clearly seen in the large prediction regions in SROC space. Two of the sources of heterogeneity are the outcome measures used and the time within which the outcome is assessed. As mentioned earlier, a limitation of this work is the use of criteria based on measurement of serum creatinine and urine output rather than a more direct determination of kidney function. If more studies had provided data, meta-regression might have been useful for isolating and quantifying some of these sources of heterogeneity. Modelling these sources of heterogeneity may be investigated as part of the future work described above on possible extensions of the hierarchical regression models.
Summary
A number of the diagnostic tests for AKI considered in these meta-analyses may have a role to play in certain health-care settings and using particular sample media. The Nephrocheck test using urine in the critical care unit setting appears overall to have the best sensitivity, albeit with low specificity. The estimates of sensitivity are high and there is low heterogeneity. The NGAL test using plasma shows moderate sensitivity and high specificity, but greater heterogeneity. Other health-care settings and sample types show evidence of considerable heterogeneity between studies. Two studies that were included in previously published NGAL meta-analyses were excluded here as they included patients who originated in the emergency room and who were subsequently released to other hospital departments.245,264