NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Selph SS, Skelly AC, Jungbauer RM, et al. Cervical Degenerative Disease Treatment: A Systematic Review [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2023 Nov. (Comparative Effectiveness Review, No. 266.)

3. Results

3.1. Description of Included Studies

A total of 4,705 references from electronic database searches and reference lists were reviewed. After dual review of titles and abstracts, 1,576 papers were selected for full-text review, of which 1,436 articles were excluded. Of the 114 studies in 140 publications included across all Key Questions, 57 (in 82 publications) were randomized controlled trials (RCTs), 56 (in 57 publications) were observational studies, and one was a systematic review (Figure 2). Results are arranged by Key Question, then by outcome, and are summarized below, followed by tables in the accompanying text.

A list of excluded studies with reasons for exclusion is in Appendix E. Data abstraction of study characteristics and results, quality assessment for all included studies, and details for grading strength of evidence (SOE) are available in Appendixes C, D, and G, respectively, while Appendix H includes all appendix references.

Most studies were rated moderate risk of bias. For these studies, we do not call out their risk of bias in the text. Instead, we call out studies that were rated high risk of bias, as additional caution should be exercised when interpreting their results.

Figure 2 is a literature flow diagram depicting the search and selection of articles for the review. The diagram shows 4,705 citations were identified through literature database searches and reference lists. 1,576 articles were reviewed at the full-text level after excluding 3,129 abstracts. From the full-text articles reviewed, 1,436 were excluded from this review for the following reasons: ineligible population (71), ineligible intervention (111), ineligible comparison (327), ineligible outcome (76), ineligible study design (109), ineligible publication type (157), sample size (76), systematic review used as source document (190), outdated systematic review (33), foreign language (27), cohort study with no confounding adjustment (233), and observational study with sufficient RCT evidence (26). After excluding these studies, 114 studies in 140 publications were included that provide evidence for the Key Questions.

Figure 2

Literature flow diagram. KQ = Key Question, RCT = randomized controlled trial. 57 (in 82 publications) were randomized controlled trials (RCTs), 56 (in 57 publications) were observational studies, and one was a systematic review.
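The counts in the flow diagram can be cross-checked with simple arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the numbers are transcribed from the Figure 2 description above rather than taken from any dataset released with the review.

```python
# Counts transcribed from the Figure 2 description; names are illustrative only.
citations_identified = 4705
abstracts_excluded = 3129
full_text_reviewed = 1576

# Reasons for exclusion at the full-text stage, as listed in the description.
full_text_exclusions = {
    "ineligible population": 71,
    "ineligible intervention": 111,
    "ineligible comparison": 327,
    "ineligible outcome": 76,
    "ineligible study design": 109,
    "ineligible publication type": 157,
    "sample size": 76,
    "systematic review used as source document": 190,
    "outdated systematic review": 33,
    "foreign language": 27,
    "cohort study with no confounding adjustment": 233,
    "observational study with sufficient RCT evidence": 26,
}

studies_by_design = {"RCT": 57, "observational": 56, "systematic review": 1}
publications_by_design = {"RCT": 82, "observational": 57, "systematic review": 1}

full_text_excluded = sum(full_text_exclusions.values())
publications_included = full_text_reviewed - full_text_excluded

checks = {
    "abstract screen": citations_identified - abstracts_excluded == full_text_reviewed,
    "full-text exclusions total 1,436": full_text_excluded == 1436,
    "140 publications included": publications_included
    == sum(publications_by_design.values())
    == 140,
    "114 studies included": sum(studies_by_design.values()) == 114,
}
for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
```

Each check compares one stage of the flow with the next, so a transcription error in any single count would show up as a mismatch on the corresponding line.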

3.2. Key Question 1: In patients with radiographic spinal cord compression and no cervical spondylotic myelopathy, what are the comparative effectiveness and harms of surgery compared to non-operative treatment or no treatment?

No studies met eligibility criteria for Key Question 1.

3.3. Key Question 2: In patients with radiographic spinal cord compression and mild to severe myelopathy, what are the effectiveness and harms of surgery versus non-operative treatment or no treatment? How do the effectiveness and harms vary by level of severity of myelopathy at the time of surgery?

3.3.1. Key Findings

  • Evidence from one small RCT and one small nonrandomized study of interventions (NRSI) was inadequate to determine the benefits and harms of surgery versus conservative treatment for cervical myelopathy (SOE: Insufficient).

3.3.2. Description of Included Studies

One RCT (N=68) described in three publications21–23 and one NRSI (N=80)24 compared surgery versus conservative treatment for cervical myelopathy (Appendix C). The duration of followup in the RCT was 3 years21,22 and 10 years.23 The duration of followup in the NRSI was 3 years.24 In the NRSI, patients were stratified by degree of myelopathy (mild and moderate versus severe) in both the surgery and conservative treatment groups. In the RCT, all patients had slowly or nonprogressing mild to moderate myelopathy. The RCT was conducted in the Czech Republic and received government funding; the NRSI was conducted in Italy and did not report funding.

The mean age of participants was 53 years with 29 percent female in the RCT and 66 years with 44 percent female in the NRSI. The duration of disease was 2 years (range 0.3 to 12 years) in the RCT and the mean duration of symptoms was 25 months (range 3 to 57 months) in the NRSI.

Surgery consisted of anterior decompression (N=22) with bone graft (N=20), corpectomy (N=6), and laminoplasty (N=5) in the RCT. An anterior approach was used in 1- or 2-level cord compression and a posterior approach was used in multilevel spinal stenosis. Surgery consisted of microsurgical anterior corpectomy, discectomy, use of titanium mesh and anterior plating in the NRSI. For 3- or multi-level corpectomy, posterior stabilization was also performed. Surgical patients wore a cervical collar for 4 weeks postoperatively. In the RCT, conservative treatment consisted of cervical collar, anti-inflammatory medication, and bed rest. However, surgical patients also received these treatments. Conservative treatment in the NRSI was similar to treatments in the RCT, but also included physiotherapy.

The RCT was rated moderate risk of bias due to lack of blinding and unclear randomization methods (Appendix D). The NRSI was rated high risk of bias due to unclear differences in patient baseline characteristics across groups and potential selection bias in treatment given (Appendix D). The strength of evidence for neurologic and general function was rated insufficient due to conflicting evidence from two small studies (Appendix G).

3.3.3. Detailed Analysis

3.3.3.1. Fusion

No studies reported fusion outcomes.

3.3.3.2. Pain

No studies reported pain outcomes.

3.3.3.3. Neurologic Function

Evidence from one small RCT and one small NRSI was inadequate to determine the benefits and harms of surgery versus conservative treatment on neurologic function in patients with cervical myelopathy (SOE: Insufficient).

In the RCT, patients were considered to be responders if Modified Japanese Orthopaedic Association Scale (mJOA) scores (maximum 18 points) were improved or unchanged following treatment.22 The likelihood of mJOA response was slightly less with surgery compared with conservative therapy at 6 months (N=66, 61% vs. 73%, relative risk [RR] 0.83, 95% confidence interval [CI] 0.59 to 1.18) and at 36 months (N=59, 59% vs. 73%, RR 0.80, 95% CI 0.55 to 1.16), although differences were not statistically significant. Mean mJOA scores were also not different between surgery and conservative treatment at 6, 12, 24, and 36 months after controlling for baseline values. Ten-year followup of the RCT (N=47) also found no differences between treatment groups on the mJOA (14 vs. 15, p=0.114).23

In the NRSI, patients were divided into four groups (N=20 patients per group) and followed for 3 years: patients with mild to moderate myelopathy treated with surgery; patients with mild to moderate myelopathy treated conservatively; patients with severe myelopathy treated with surgery; and patients with severe myelopathy treated conservatively.24 Mild to moderate myelopathy was defined as an mJOA score of 12 and above, severe myelopathy as a score below 12. Patients with severe myelopathy experienced a longer duration of symptoms (40 months) than patients with mild to moderate disease (10 months) and were more likely to receive multilevel surgery than surgical patients with mild to moderate disease. Mean mJOA scores improved over time for both surgery and conservative treatment but favored surgery at 12 and 36 months in patients with mild to moderate myelopathy (12 months mJOA: 15.4 vs. 14.2, p=0.03; 36 months: 16.1 vs. 15.2, p=0.013). In patients with severe myelopathy, improvement in mJOA scores was greater with surgery compared with conservative treatment beginning at 6 months (6 months mJOA: 9.5 vs. 7.9, p=0.045; 12 months: 11.5 vs. 8.6, p=0.001; 36 months: 12.45 vs. 8.65, p<0.001).

3.3.3.4. General Function

Evidence from one small RCT and one small NRSI was inadequate to determine the benefits and harms of surgery versus conservative treatment on general function in patients with cervical myelopathy (SOE: Insufficient).

The time required to complete the 10-meter Walk Test in the RCT (N=66) increased over time through 24 months in patients treated with surgery (baseline: 7.9 seconds; 6 months: 8.7 sec; 12 months: 9.9 sec; 24 months: 11.7 sec; 36 months: 9.4 sec), whereas there was little change in time needed to complete the 10-meter walk throughout the followup period with conservative treatment (baseline: 7.4 sec; 6 months: 7.2 sec; 12 months: 7.4 sec; 24 and 36 months: 7.5 sec).21 These differences in walk time between treatments were statistically significant (p-value range 0.003 to 0.034), although the differences between groups are not likely clinically meaningful. Ten-year followup of the RCT (N=47) found no differences on the 10-meter Walk Test (7.3 seconds vs. 7.1 seconds, p=0.207).23 In the NRSI, there was no difference between surgery and conservative therapy on the 10-meter Walk Test in patients with mild to moderate myelopathy, whereas there was greater improvement on the 10-meter Walk Test with surgery in patients with severe myelopathy at 12 and 36 months (12 months: 11.4 seconds vs. 14.4 seconds, p=0.005; 36 months: 10.30 seconds vs. 14.10 seconds, p=0.002).24

In the RCT, patients were videoed performing activities of daily living (ADL) such as buttoning a shirt, brushing teeth and hair, walking, going up and down stairs, and running and were evaluated by blinded observers on a 7-point improvement scale that ranged from 3 (excellent) to −3 (poor); 0 represented no change in ability.21 Patients treated with surgery showed a greater likelihood of improvement in ADLs compared with conservative treatment at 6 months (20% vs. 5.9%) but there was also a greater likelihood of worsening in ADLs with surgery (20% vs. 8.8%) at 6 months. There were no differences between treatments in changes in ADL abilities at 12, 24, or 36 months. Video evaluation of decreased ability to perform ADLs was also not different between treatment groups at 10 years (mean of two evaluators: 56.8% vs. 50%, p>0.05).23 However, with the limited sample size available, this 10-year followup was likely underpowered to demonstrate a difference between surgery and conservative treatment.

Although more patients in the RCT reported that their disease course had improved after surgery compared with conservative therapy at 6 months posttreatment (61% vs. 20%, p=0.001), self-perception of improved disease course deteriorated over time in the surgery group (p=0.019 for negative trend) and was 20 percent at 36 months compared with a relatively stable course with conservative treatment.21 Ten-year followup of the RCT (N=47) found no difference between treatment groups on a subjective evaluation of worsened status (45.5% vs. 56%, p=0.47).23

The physical component summary score (PCS) and the mental component summary score (MCS) on the 12-Item Short Form Health Survey (SF-12) were not different posttreatment (unclear posttreatment time) in patients with mild to moderate myelopathy who received surgery compared with patients who received conservative therapy (PCS: 37.4 vs. 37.95, p=0.75; MCS: 47.5 vs. 46.7, p=0.78).24 However, improvement in scores was greater with surgery versus conservative treatment in patients with severe myelopathy (PCS: 53.3 vs. 26.85, p<0.001; MCS: 61.2 vs. 31.4, p<0.001).

3.3.3.5. Quality of Life

No studies reported quality of life outcomes.

3.3.3.6. Harms

The NRSI reported that two patients with severe myelopathy who received conservative treatment demonstrated progressive neurological worsening (defined as a worsening of 1 point on the mJOA).24 Surgical complications in this study included 5/40 patients (12.5%) who experienced airway obstruction, graft displacement, and/or wound hematoma. There were no deaths.

The findings of the NRSI, particularly the findings in patients with severe myelopathy, should be interpreted with caution as the individuals in the severe myelopathy group who received conservative treatment consisted of those who refused surgery against medical advice, which may have introduced selection bias.

3.4. Key Question 3: In patients with cervical degenerative disease, what are the comparative effectiveness and harms of surgical compared to non-operative treatment?

3.4.1. Key Findings

  • There was inadequate evidence from one small RCT on the comparative effectiveness of anterior cervical discectomy and fusion (ACDF), physiotherapy, and treatment with a cervical collar on pain and function in patients with cervico-brachial pain without spinal cord compression (SOE: Insufficient).

3.4.2. Description of Included Studies

One RCT (N=81) described in two publications25,26 compared treatment for cervico-brachial pain with cervical decompression and fusion, physiotherapy, or neck collar (Appendix C). All patients had nerve root compression on magnetic resonance imaging (MRI) without spinal cord compression, a history of pain for 3 or more months, and were followed for 16 months. The study was conducted in Sweden.

The mean age of participants was 47 years and 46 percent were female; race or ethnicity was not reported. The worst affected level was C5-C6 (49%) followed by C6-C7 (37%). Prior treatments included physiotherapy (85%; physiotherapy uses a hands-on approach to healing, e.g., massage, fascial releases, whereas physical therapy uses hands-on methods but also incorporates physical exercises) and use of a cervical collar (42%). Mean duration of pain was 34 months (range 5 to 120 months).

Surgery consisted of ACDF using the Cloward technique and fusion achieved with purified cow bone graft; one patient received a posterior laminectomy. Surgical patients sometimes wore a collar for 1 to 2 days postoperatively. Physiotherapy included traction (70%), strengthening exercises (56%), stretching exercises (56%), massage (33%), heat (33%), and transcutaneous electrical stimulation (22%), among other modalities. Patients treated with cervical collars used a rigid collar during the day and an optional soft collar at night for 3 months.

The trial was rated moderate risk of bias due to lack of blinding and overlap in treatments after 16 weeks (Appendix D). The strength of evidence for pain, neurologic function and general function was rated insufficient due to limited evidence from one small trial (Appendix G).

3.4.3. Detailed Analysis

3.4.3.1. Fusion

No studies reported fusion outcomes.

3.4.3.2. Pain

There was inadequate evidence from one small RCT on the comparative effectiveness of ACDF, physiotherapy, and treatment with a cervical collar on pain in patients with cervico-brachial pain without spinal cord compression (SOE: Insufficient).

There were no differences between treatments in current pain or worst pain using the visual analogue scale (VAS) (0-100) at baseline.25 At 14 to 16 weeks followup, patients treated with surgery experienced less “current” pain than patients treated with a collar (N=54, 0-100 VAS: 27 vs. 48, p<0.01), but there was no difference between surgery, physiotherapy, and use of a collar in “current” pain at 16 months (N=81, VAS: 30 vs. 39 vs. 35, p>0.05). Results were similar regarding “worst” pain, with surgical patients experiencing less “worst” pain than collar patients at 14-16 weeks (N=54, VAS: 43 vs. 64, p<0.001) but no differences in “worst” pain between treatments at 16 months (N=81, VAS: 42 vs. 53 vs. 52, p>0.05, respectively).

3.4.3.3. Function

3.4.3.3.1. Neurological Function

There was inadequate evidence from one small RCT on the comparative effectiveness of ACDF, physiotherapy, and treatment with a cervical collar on neurologic function in patients with cervico-brachial pain without spinal cord compression (SOE: Insufficient).

Specific muscle strength before and after treatment was also assessed.26 Patients in the surgery group experienced greater improvements in muscle strength (strength expressed as the ratio of the affected to the unaffected side) at 14 to 16 weeks in pinch grip, elbow extension and shoulder internal rotation compared with patients treated with physiotherapy and greater improvements in wrist flexion and elbow flexion compared to those treated with a cervical collar (data not provided). At 16 months, patients treated with surgery experienced greater improvements in wrist extension, elbow extension, shoulder abduction, and shoulder internal rotation compared with patients treated with physiotherapy. There were no differences in strength improvement between surgery and collar treatment or between physiotherapy and collar treatment at 16 months (data not provided).

At 14 to 16 weeks posttreatment, there was no difference in the likelihood of improvement in paresthesias with surgery compared with physiotherapy or collar treatment (N=81, 52% vs. 45% vs. 37%, p>0.05) but a large increase in the likelihood of improvement in sensory loss with surgery compared with either treatment (41% vs. 15%, RR 2.75, 95% CI 1.0 to 7.5, both comparisons with surgery).26 At 16 months, there remained no difference in the likelihood of improvement in paresthesias between surgery, physiotherapy, and treatment with a collar (N=81, 58% vs. 67% vs. 66%, p>0.05). There was also no difference between treatments in the likelihood of improvement in sensory loss at 16 months (N=81, 27% vs. 14% vs. 15%, p>0.05).
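To show how a relative risk and its confidence interval of this kind are typically derived, the sketch below applies the standard log-scale (Wald) interval to the sensory-loss comparison at 14 to 16 weeks. The event counts are an assumption: 11/27 and 4/27 are reconstructed from the reported 41% and 15% and from an assumed arm size of 27 (N=81 across three arms), and the review does not state which interval method it used. The function name is hypothetical.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.959964):
    """Relative risk of group A versus group B with a Wald CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Assumed counts: 11/27 improved with surgery (41%), 4/27 with the comparator (15%).
rr, lower, upper = relative_risk(11, 27, 4, 27)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumed counts the point estimate matches the reported RR of 2.75 and the lower bound sits at about 1.0. The upper bound comes out slightly above the reported 7.5, which may reflect rounding or a different interval method in the review.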

3.4.3.3.2. General Function

There was inadequate evidence from one small RCT on the comparative effectiveness of ACDF, physiotherapy, and treatment with a cervical collar on general function in patients with cervico-brachial pain without spinal cord compression (SOE: Insufficient).

The ability to complete activities ranging from basic activities of daily life (e.g., dressing, prolonged sitting) to more rigorous physical activity (e.g., running, heavy work) was assessed using the disability rating index (DRI).25 Overall mean score on the DRI ranges from 0 to 100, with ability on each of 12 activities rated using a 0-100 VAS indicating “without difficulty” to “not at all.” There was no difference between treatment with surgery versus physiotherapy at 14-16 weeks on improvement in disability; however, treatment with surgery resulted in improved dressing and heavy work compared with treatment with a collar, while treatment with physiotherapy was associated with greater ability to walk, sit for a long time, and complete heavy work compared with collar treatment (p<0.05, data not provided). At 16 months, the ability to do heavy work was greater with surgery compared to the other treatments (p<0.05, data not provided). No other differences on the DRI were noted.

Although findings from this small study tended to favor surgery, especially in the short term, these findings should be interpreted with caution due to patients receiving additional treatments beyond the randomized treatment and the heterogeneity of treatment (especially physiotherapy). After 16 weeks, 8/27 surgery patients (30%) underwent a second surgery. Additionally, one patient treated with physiotherapy (4%) and five treated with collar (19%) underwent surgery. Forty-one percent of surgery patients (11/27) received physiotherapy as did 44% (12/27) of patients treated with a collar. Additionally, the use of specific physiotherapy modalities (e.g., traction, exercises, cryotherapy) varied and was at the discretion of the local physiotherapist.
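The percentages in the paragraph above can be reproduced from the stated counts. The sketch below assumes 27 patients in each arm, which the text gives explicitly for the surgery and collar arms and which is assumed here for the physiotherapy arm (N=81 across three arms). The labels are illustrative.

```python
ARM_SIZE = 27  # assumed per-arm size; stated for surgery and collar, inferred for physiotherapy

# Patients who received treatment beyond their randomized assignment.
additional_treatment_counts = {
    "surgery arm, second surgery": 8,
    "physiotherapy arm, surgery": 1,
    "collar arm, surgery": 5,
    "surgery arm, physiotherapy": 11,
    "collar arm, physiotherapy": 12,
}

percentages = {
    label: round(100 * count / ARM_SIZE)
    for label, count in additional_treatment_counts.items()
}
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

With a denominator of 27 per arm, the rounded values agree with the 30%, 4%, 19%, 41%, and 44% reported in the text.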

3.4.3.4. Quality of Life

This study did not report quality of life outcomes.

3.4.3.5. Harms

This study did not report harms or adverse events.

3.5. Key Question 4: In patients with cervical degenerative disease, what are the comparative effectiveness and harms of therapies added on to surgery (pre- or post-operative) compared with the same surgery alone?

3.5.1. Key Findings

  • Laminoplasty
    • There was low strength evidence of no difference in pain and function between use of a post-operative collar plus laminoplasty versus laminoplasty alone (SOE: Low).
    • There was inadequate evidence to determine the effects on pain with laminoplasty plus exercise versus laminoplasty alone (SOE: Insufficient).
  • ACDF
    • There was low-strength evidence that use of post-operative pulsed electro-magnetic field (PEMF) stimulation after ACDF was associated with increased fusion versus treatment with ACDF alone (SOE: Low); pain and function were similar with or without PEMF after ACDF (SOE: Low).
    • There was inadequate evidence to determine the effects on fusion, pain, and function of ACDF plus post-operative collar compared with ACDF alone (SOE: Insufficient).

3.5.2. Description of Included Studies

Five RCTs (N=546)27–31 compared surgery plus post-operative therapy to surgery alone (Appendix C). The average mean followup duration was 12 months (range 1 week to 2 years). Two trials were conducted in Japan,30,31 and one trial each in the United States,29 Sweden,27 and China.28

The average study mean age of participants was 59 years (range 47 to 73 years); the average proportion of females in studies was 38 percent (range 29% to 47%). Two trials reported race, one enrolling a majority of White participants (93%)29 and the other enrolling Chinese participants.28 Studies enrolled patients with clinical and/or radiological evidence of cervical myelopathy28,30,31 or radiculopathy.27,29 Patients had 1-2 level disease in 1 trial (N=33),27 1-4 levels (60% had 2 levels) in 1 trial (N=323),29 and a mean of 4.5 levels in 1 trial (N=90).30 Two trials did not report number of disease levels.28,31

One trial was rated low risk of bias,28,29 and the remainder were rated moderate risk of bias (Appendix D). Methodological limitations included unclear blinding of providers or assessors and high loss to followup. Evidence for pain and function with laminoplasty plus exercise versus laminoplasty alone and evidence for fusion, pain and function for ACDF plus post-operative collar versus ACDF alone were rated insufficient due to limited evidence from one small trial each (Appendix G).

3.5.3. Detailed Analysis

3.5.3.1. Laminoplasty Plus Nonoperative Therapy Versus Laminoplasty

Three RCTs (N=190) assessed laminoplasty plus post-operative Philadelphia collars28,30 or exercise therapy incorporating 3 months of daily strengthening and range of motion exercises.31

3.5.3.1.1. Fusion

No study reported fusion outcomes.

3.5.3.1.2. Pain

There was no difference in pain between the use of a post-operative collar plus laminoplasty versus laminoplasty alone (SOE: Low). There was inadequate evidence to determine the effects on pain with laminoplasty plus exercise versus laminoplasty alone (SOE: Insufficient).

Single-door laminoplasty plus rigid Philadelphia collar worn for 3 weeks post-operatively was associated with less improvement in mean VAS scores (0-10 scale) than laminoplasty alone at weeks 1 (0.8 vs. 3.8, p=0.023) and 2 (−0.9 vs. 1.8, p=0.046) in one trial rated low risk of bias (N=35), with no difference at 3 weeks (−1.2 vs. 1.1, p=0.148) or at later followup times (6 weeks and 3, 6, and 12 months).28 One trial (N=90) compared modified double-door laminoplasty plus Philadelphia collar worn for 2 weeks post-operatively versus no collar and found no differences in change in VAS (0-10 scale) at 12 months (−0.19 vs. −0.04, p>0.05) or throughout the study period (p=0.487).30

One RCT (N=65) found no difference in mean VAS scores (0-100 scale) for neck pain and stiffness at 2 weeks and 3 months postoperative between muscle-preserving laminoplasty with exercises versus laminoplasty alone (3 months: −1.8 vs. −2.5, p=0.623).31

3.5.3.1.3. Function

3.5.3.1.3.1. Neurologic Function

There was no difference in neurologic function between the use of a post-operative collar plus laminoplasty versus laminoplasty alone (SOE: Low).

One trial of open-door laminoplasty (N=35) found no difference on mJOA scores between 3 weeks of post-operative collar versus no collar at 6 weeks (mJOA: 13.8 vs. 13.3, p=0.613)28 or longer followup. This was consistent with 12-month results from the second collar trial (N=90) which reported no difference in end-of-study mJOA scores between 2 weeks of post-operative collar use and no collar (11.1 vs. 11.8, p=0.22).30

3.5.3.1.3.2. General Function

There was no difference in general function between the use of a post-operative collar plus laminoplasty versus laminoplasty alone (SOE: Low).

Two trials (N=125) of laminoplasty with or without the addition of a postoperative Philadelphia collar for 2 or 3 weeks were consistent in finding no difference in 36-Item Short Form Health Survey (SF-36) PCS and MCS scores with collar use compared to no collar. One RCT (N=35) of single-door laminoplasty found no differences in SF-36 scores between the use of a post-operative collar for 3 weeks versus no collar at 6 weeks after surgery when controlling for baseline scores (PCS: 6.4 vs. 2.8; MCS: 4.1 vs. 0, p>0.05) or at longer followup times (3, 6, 12, 24 months).28 One RCT (N=90) of double-door laminoplasty plus 2 weeks of postoperative collar use versus no collar also found no difference at 12 months in change in SF-36 PCS or MCS scores (PCS: 1.5 vs. 1.4, p>0.05; MCS: 0.1 vs. 0.4, p>0.05).30 The trial of open-door laminoplasty also found no difference on Neck Disability Index (NDI) between 3 weeks of post-operative collar and no collar at 6 weeks (NDI: 24.8 vs. 34.0, p=0.147) or at longer followup.28

3.5.3.1.4. Quality of Life

No study reported quality of life outcomes.

3.5.3.2. ACDF Plus Nonoperative Therapy Versus ACDF

One trial (N=33) assessed ACDF versus ACDF plus rigid Philadelphia collar worn for 6 weeks postoperative27 and one trial (N=323) compared ACDF with ACDF plus PEMF, delivered using a Cervical-Stim device for 4 hours daily from 1 week to 3 months postoperatively in a trial of active smokers (all patients wore a cervical collar for 1 week postoperatively).29

3.5.3.2.1. Fusion

There was inadequate evidence to determine the effects on fusion between ACDF with or without collar use (SOE: Insufficient). Use of post-operative PEMF stimulation after ACDF was associated with increased fusion versus treatment with ACDF alone (SOE: Low).

All ACDF patients in one 24-month trial (N=33) achieved radiographic fusion regardless of collar use (100% vs. 100%).27 Surgical details were not provided.

PEMF was associated with a small increase in fusion rates at 6 months in one trial (N=323) based on a per protocol analysis versus ACDF with no PEMF (N=240; 83.6% vs. 68.6%, p=0.0065); fusion rates were also improved in intent-to-treat analyses assuming missing patients fused (N=323; 85.9% vs. 76.3%, p=0.0269) or imputing patient status at last visit (N=281; 78.2% vs. 64.8%, p=0.0127), but not when assuming missing patients did not fuse (65.6% vs. 56.3%, p=0.0835).29 However, there was no difference in fusion rates in the per protocol analysis at 12 months.29 This study used a Smith-Robinson technique with allograft and cervical plate system.

3.5.3.2.2. Pain

The ACDF trial of PEMF versus no PEMF found similar VAS scores for shoulder/arm pain at rest or with activity at 6 and 12 months postoperative (data provided in graph form)29 (SOE: Low).

3.5.3.2.3. Function

3.5.3.2.3.1. General Function

There was inadequate evidence to determine the effect on general function of ACDF plus post-operative collar compared with ACDF alone for all time points (SOE: Insufficient).

Collar use was associated with greater improvement in SF-36 PCS scores from baseline than ACDF without a collar at 6 weeks (mean difference [MD] 5.8; 95% CI 0.8 to 10.7), 3 months (MD 6.8; 95% CI 0.4 to 13.1), 6 months (MD 7.4; 95% CI 1.4 to 13.4), and 12 months (MD 7.5; 95% CI 0.3 to 14.6), but not at 24 months (MD 4.9; 95% CI −0.8 to 10.5; p=0.088).27 In the same trial, there was no difference in mean change in SF-36 MCS scores at 6 weeks (MD −1.9; 95% CI −11.1 to 7.4) or at longer postoperative followup times.27
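The relationship between a 95% CI and statistical significance can be illustrated by back-calculating an approximate p-value from a reported mean difference and its interval. This is a rough sketch that assumes a symmetric interval and a normal approximation. The trial (N=33) may well have used a t-distribution or another method, so the result is expected to be close to, not identical with, the reported value. The function name is hypothetical.

```python
import math


def approx_p_from_ci(mean_difference, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value from a mean difference and its 95% CI."""
    # Half the CI width divided by the critical value recovers the standard error.
    standard_error = (upper - lower) / (2 * z_crit)
    z = mean_difference / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return standard_error, z, p_value


# SF-36 PCS at 24 months: MD 4.9, 95% CI -0.8 to 10.5 (reported p=0.088).
_, _, p_24_months = approx_p_from_ci(4.9, -0.8, 10.5)
# SF-36 PCS at 6 weeks: MD 5.8, 95% CI 0.8 to 10.7 (interval excludes zero).
_, _, p_6_weeks = approx_p_from_ci(5.8, 0.8, 10.7)

print(f"24 months: p is approximately {p_24_months:.3f}")
print(f"6 weeks:   p is approximately {p_6_weeks:.3f}")
```

The 24-month interval crosses zero and the approximation lands near the reported p=0.088, while the 6-week interval excludes zero and yields a p-value below 0.05, consistent with the pattern described in the text.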

Six weeks of collar use was associated with greater improvement in NDI scores from baseline than no collar at 6 weeks (MD −4.4; 95% CI −8.6 to −0.2), but not at 3 months (MD −2.1, 95% CI −8.0 to 3.8) or at other timepoints.27 There was no difference in NDI scores between daily PEMF and no stimulation at 6 months (31.0 vs. 23.0, p>0.05) or 12 months postoperative (25.6 vs. 22.8, p>0.05).29

3.5.3.2.4. Quality of Life

No study reported quality of life outcomes.

3.6. Key Question 5: In patients with cervical radiculopathy due to cervical degenerative disease, what are the comparative effectiveness and harms of posterior versus anterior surgery?

3.6.1. Key Findings

  • There was low-strength evidence of no differences in neck and arm pain between anterior versus posterior approaches short term (3, 6 months) and intermediate term (12, 24 months) (SOE: Low).
  • There was inadequate evidence to determine benefits of anterior versus posterior approaches for neck pain (immediately postoperative), fusion, or neurologic function (SOE: Insufficient).
  • There was low-strength evidence of no difference between approaches on measures of general function or quality of life (SOE: Low).
  • There was low-strength evidence of no difference between approaches in the likelihood of reoperation (SOE: Low).
  • Neurologic deficits were reported inconsistently and various measures were used across studies; however, there was low-strength evidence of no differences between approaches (SOE: Low).
  • One nonrandomized study reported higher 30-day mortality with ACDF versus posterior cervical foraminotomy (PCF), but there were very few deaths (SOE: Insufficient).
  • No serious adverse events with either approach were reported in three RCTs; evidence on specific adverse events was limited; one RCT reported no difference in approaches for surgery-related adverse events (SOE: Insufficient).

3.6.2. Description of Included Studies

Four RCTs (N=277)32–35 compared anterior versus posterior approaches (Appendix C). The average mean followup duration was 27 months (range 12 to 60 months). One trial was conducted in the United States,34 one in Germany,33 one in Egypt,32 and one in the Netherlands.35 All four trials were conducted at single sites. The average study mean age of participants for the trials was 45 years (range 43 to 51 years); the average proportion of females in trials was 55 percent (range 50% to 66%). No trials reported race. All four trials limited enrollment to patients with radiculopathy; two trials excluded patients with myelopathy,32,34 and the other two did not report myelopathy.33,35 Patients in all four trials had single-level disease. Two trials were rated moderate risk of bias34,35 and two trials were rated high risk of bias (Appendix D).32,33 One trial stated that no funding was received,33 one trial reported government funding,35 and two trials did not address funding.32,34 Primary methodologic concerns were unclear randomization and treatment allocation concealment, dissimilarity between treatment groups at baseline and lack of assessor blinding.

Four retrospective NRSIs (N=47,684), including one database study, compared anterior versus posterior procedures (Appendix C).36–39 Three NRSIs were conducted in the United States36,37,39 and one in the United Kingdom.38 Three studies36–38 drew patients from a single site and one39 used an insurance administrative database (N=46,598). The average study mean age of participants was 50 years (range 48 to 53 years); the average proportion of females in studies was 44 percent (range 31% to 54%). One study reported race, enrolling a majority of White participants (88%).37 All four NRSIs limited enrollment to patients with radiculopathy. Patients had single-level disease in three NRSIs.36,38,39 A mean of 2.6 surgical levels was reported in one study.37 Funding was not reported in two NRSIs,36,38 one was government funded39 and one stated that no funding was received.37 Three NRSIs were rated moderate risk of bias37–39 and one was rated high risk of bias (Appendix D).36 Common methodologic limitations were unclear loss to followup and lack of clarity regarding assessor blinding. Additionally, lack of clarity regarding patient enrollment and comparability of treatment groups at baseline combined with inadequate adjustment for confounding for prognostic variables were concerns resulting in the NRSI being rated high risk of bias.

For many outcomes, authors did not provide adequate data to calculate effect sizes and confidence intervals. Although NRSIs may have adjusted for some outcomes, authors did not always provide adjusted estimates for our outcomes of interest. Given the potential for differences in patient characteristics between anterior and posterior procedures in NRSIs, results from these studies should be interpreted cautiously.

Evidence was insufficient for fusion, neurologic function, general function, quality of life, mortality and serious adverse events, based on a combination of two or more of the following: high risk of bias, inconsistent findings, and lack of precision (Appendix G).

3.6.3. Detailed Analysis

3.6.3.1. Anterior Versus Posterior

The anterior approach used was anterior cervical foraminotomy (ACF) in one RCT,32 anterior cervical decompression without fusion (ACD) in one RCT,34 and anterior cervical discectomy and fusion (ACDF) in three RCTs33–35 and all four NRSIs.36–39 All studies used posterior cervical foraminotomy as the comparator.

3.6.3.1.1. Fusion

There was inadequate evidence to determine benefits and harms of anterior versus posterior surgical approaches on cervical fusion (SOE: Insufficient).

One RCT (N=30) rated high risk of bias reported that no participants in either the ACF group or the posterior cervical foraminotomy group had radiologic evidence of instability on cervical x-rays at time of discharge or at a mean of 14 months.32 Authors did not define stability or criteria for determining fusion.

3.6.3.1.2. Pain

There were no differences in neck and arm pain between anterior versus posterior approaches in the short (3, 6 months) and intermediate term (12, 24 months) (SOE: Low); there was inadequate evidence to determine the benefits and harms of anterior versus posterior approaches on neck pain immediately post-operative (SOE: Insufficient).

At 12 months the proportion of ACDF vs. PCF patients experiencing a 26-point improvement (0-100 scale) in VAS neck pain (62% vs. 52%) or 41-point improvement in VAS neck pain (60% vs. 54%) was reported as comparable in one RCT (N=243);35 an RR could not be calculated.

One small trial (N=30) rated high risk of bias reported that ACF was associated with lower neck pain VAS scores (0-10 scale) within a week of discharge (p<0.001). However, the confidence interval reported for the difference between groups crosses zero (MD −3.13, 95% CI −4.52 to 1.75), which is inconsistent with the reported p-value; this is likely a typographical error, and the interval should read −4.52 to −1.75.32 One RCT (N=175), also rated high risk of bias, compared ACDF versus PCF at 3, 6, 12, and 24 months for arm pain VAS (0-100 scale), neck pain VAS (0-100 scale), and North American Spine Society (NASS) pain (0-6 scale).33 The mean differences across measures did not change with time and there were no differences between ACDF and PCF in arm pain VAS (range from −1 to 1), neck pain VAS scores (range from 1 to 4) or NASS pain scores (range from −0.1 to 0.1) at any timepoint. Statistical tests were not reported and reported data were inadequate to calculate confidence intervals for effect sizes, but the authors noted that the clinical results were the same in both groups. The largest RCT (N=243, moderate quality) found no difference in VAS neck pain scores at 12 months (MD −2.70, 95% CI −9.67 to 4.27) or VAS arm pain (MD −2.80, 95% CI −8.85 to 3.25).35 Pooled estimates across the RCTs also reveal no difference in VAS arm pain at 12 months between ACDF and PCF (2 RCTs, N=403, MD −1.36, 95% CI −5.23 to 1.86, I2=0%; Appendix F Figure F-1).33,35 Across the same two RCTs, there was again no difference between ACDF and PCF in VAS neck pain (MD 0.31, 95% CI −6.20 to 5.81, I2=10.6%; Appendix F Figure F-2).33,35
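The suspected sign error in the small ACF trial's confidence interval can be checked with simple arithmetic. The sketch below assumes the trial reported a symmetric (Wald-type) interval, which the source does not state; under that assumption the midpoint of the interval should sit close to the point estimate. The function name `ci_midpoint` is illustrative only.

```python
def ci_midpoint(lower: float, upper: float) -> float:
    """Midpoint of a confidence interval.

    For a symmetric (Wald-type) interval, which is an assumption here,
    the midpoint should be close to the reported point estimate.
    """
    return (lower + upper) / 2.0


md = -3.13                               # reported mean difference
printed = ci_midpoint(-4.52, 1.75)       # interval as printed
corrected = ci_midpoint(-4.52, -1.75)    # interval with the minus sign restored

print(f"midpoint of printed interval:   {printed:.3f}")
print(f"midpoint of corrected interval: {corrected:.3f}")
print(f"distance from MD (printed):     {abs(md - printed):.3f}")
print(f"distance from MD (corrected):   {abs(md - corrected):.3f}")
```

The corrected interval is centered within rounding error of the reported MD, whereas the printed interval is not, which supports reading the upper bound as −1.75.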

The fourth RCT (N=72), rated moderate risk of bias, reported similar rates of patient-reported complete or partial pain improvement (unvalidated measure) for anterior approaches (ACD and ACDF) versus PCF at day 1 postoperatively (100% vs. 100%, RR 1.00), at 2 months (98% vs. 100%, RR 0.98, 95% CI 0.94 to 1.02, p=0.32), and at approximately 60 months postoperatively (96.5% vs. 100%, RR 0.96, 95% CI 0.90 to 1.03, p=0.32).34

Findings for pain from two NRSIs were consistent with those of the RCTs. The larger study (N=688) found no difference in mean scores for VAS arm pain (0-10 scale) at 3 months (4.20 vs. 3.82, MD 0.38, p>0.05), 12 months (4.06 vs. 4.07, MD −0.01, p>0.05) or 24 months (3.85 vs. 4.48, MD −0.63, p>0.05).38 In the smaller NRSI (N=70) rated high risk of bias, there were no differences between ACDF versus PCF in VAS score (0-10 scale, not specified for arm or neck pain, 2.6 vs. 3.0, MD −0.4, p=0.04) at 12 months.36 Reported estimates appear to be unadjusted.

3.6.3.1.3. Function
3.6.3.1.3.1. Neurologic Function

There was inadequate evidence to determine benefits and harms of anterior versus posterior approaches on neurologic function for all time points (SOE: Insufficient).

One RCT (N=175) rated high risk of bias33 reported similar mean NASS neurology scores (0-6 scale) for ACDF and PCF and that no patient had deterioration of symptoms. Means were consistent at 3, 6, 12, and 24 months (range MD −0.2 to 0.2). Statistical tests were not reported and data were inadequate to calculate confidence intervals, but the authors noted that the clinical results were the same in both groups.

3.6.3.1.3.2. General Function

There was no difference in general function between anterior and posterior procedures based on NDI or Odom’s criteria at 12 months in RCTs (SOE: Low).

One moderate-quality RCT (N=243) reported that the proportion of responders was comparable for ACDF and PCF based on NDI (defined as ≥17.3% improvement, 0-100 scale; 63% vs. 66%; data were insufficient to calculate RR).35 There was also no difference in mean change scores on NDI at 12 months (MD −1.2, 95% CI −5.8 to 3.5).

There was no difference in function between anterior approaches (ACF or ACDF) and PCF at 12 months across two RCTs (N=273)32,35 based on Odom’s criteria rating of excellent or good (68.3% vs. 74.6%, RR 0.95, 95% CI 0.81 to 1.12, I2=0%) (Appendix F Figure F-3). In the larger trial, an analysis of complete cases (N=204) at 1 year suggested that slightly fewer ACDF patients had excellent or good function, but the effect size is below the threshold for a small effect (76% vs. 88%, RR 0.87, 95% CI 0.76 to 0.99).35

One NRSI (N=688) reported no difference between ACDF and PCF on the Core Outcome Measures Index-neck (COMI-neck, 0-10 scale), which has items for pain, function, symptom-specific well-being, quality of life and disability.38 Mean changes in COMI-neck scores (0-10 scale) were similar at 3 months (−2.38 vs. −2.31, p=0.88) and 6 months (−2.94 vs. −2.67, p=0.55); at 24 months the mean COMI-neck scores were also similar (4.16 vs. 4.72, p>0.05; mean change not reported). The proportion of patients who achieved minimum clinically important difference on the COMI-neck score (decrease ≥2 points) was also similar at 3 months (50% vs. 56%, RR 0.89, 95% CI 0.65 to 1.24), 12 months (59% vs. 58%, RR 1.02, 95% CI 0.76 to 1.36), and 24 months (57% vs. 50%, RR 1.14, 95% CI 0.71 to 1.83). One NRSI (N=70) rated high risk of bias found no difference between ACDF versus PCF in Pain Disability Questionnaire functional status subscale scores (0 to 90 scale, 31.3 vs. 43.2, MD −11.9, p=0.30) or Pain Disability Questionnaire total score (52.8 vs. 69.6, p=0.50).36 One RCT (N=175) rated high risk of bias reported Hilibrand criteria ratings (Poor, Satisfactory, Good, Excellent, measure not validated) for ACDF versus PCF at 3, 6, 12 and 24 months.33 Data were not available to calculate effect sizes, but the authors noted that the clinical results were the same in both groups at all timepoints: Excellent (84% vs. 83% at 3 months, and 76% vs. 79% at 24 months).
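The risk ratios quoted for the COMI-neck responder analysis can be reproduced from the reported percentages alone. The following is a minimal sketch: the function name `risk_ratio` and the dictionary layout are illustrative, and because the inputs are rounded percentages rather than event counts, agreement is only expected to two decimal places.

```python
# (first-group %, second-group %, reported RR) for patients reaching the
# COMI-neck minimum clinically important difference (decrease >= 2 points).
responders = {
    "3 months": (50, 56, 0.89),
    "12 months": (59, 58, 1.02),
    "24 months": (57, 50, 1.14),
}


def risk_ratio(p_first: float, p_second: float) -> float:
    """Ratio of two proportions (or percentages on the same scale)."""
    return p_first / p_second


for timepoint, (p1, p2, reported) in responders.items():
    computed = round(risk_ratio(p1, p2), 2)
    print(f"{timepoint}: computed RR {computed:.2f}, reported RR {reported:.2f}")
```

Confidence intervals cannot be reproduced this way because group sizes at each timepoint are not given in the text.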

3.6.3.1.4. Quality of Life

There was no difference in EuroQOL-5 Dimensions (EQ-5D, scale 0-1) at 12 months between ACDF and PCF in one RCT (SOE: Low).35

The RCT (N=243) found no difference on EQ-5D between ACDF and PCF in either the proportion of patients meeting a clinically important difference of 0.24 improvement (38% vs. 38%) or in change scores at 12 months (MD −0.01, 95% CI −0.06 to 0.10).35 Similarly, one NRSI (N=70) rated high risk of bias found no difference in EuroQOL-5 Dimensions (EQ-5D, scale 0-1) at 12 months for ACDF (mean 0.69, 95% CI 0.61 to 0.77) versus PCF (mean 0.72, 95% CI 0.64 to 0.80, p=0.60).36

3.6.3.1.5. Reoperation

There was no difference in the likelihood of reoperation between anterior and posterior procedures across four RCTs32–35 (2 of which were rated high risk of bias) or in one retrospective NRSI (N=328)37 (Figure 3) (SOE: Low). Exclusion of the high risk of bias RCTs did not substantially change the estimate (2 RCTs, RR 0.70, 95% CI 0.30 to 1.61, I2=0%).34,35

Figure 3 is a forest plot. Risk ratios were reported or calculated for four randomized controlled trials comparing anterior cervical foraminotomy, anterior cervical decompression without fusion, or anterior cervical discectomy and fusion with posterior cervical foraminotomy, with a pooled risk ratio of 0.71 (95% confidence interval 0.39 to 1.32) and an I-squared value of 0%. A risk ratio was reported or calculated for one retrospective nonrandomized study of interventions comparing anterior cervical discectomy and fusion with posterior cervical foraminotomy, with a risk ratio of 0.74 (95% confidence interval 0.30 to 1.83) and I-squared value of 0%.

Figure 3

Reoperation: anterior versus posterior cervical foraminotomy. ACF = anterior cervical foraminotomy; ACD = anterior cervical decompression without fusion; ACDF = anterior cervical discectomy and fusion; CI = confidence interval; PL = profile likelihood.

3.6.3.1.6. Harms

There were no differences in neurologic deficits between anterior and posterior approaches, although results were reported inconsistently (SOE: Low); reporting of other adverse events was limited (SOE: Insufficient).

Description and reporting of serious adverse events was limited. One RCT (N=243) reported similar rates of surgery-related adverse events for ACDF and PCF (6% in both groups).35 Serious adverse events (not specified as surgery related) included post-operative events (anaphylactic reaction to antibiotics, n=1, wound hematoma not requiring surgery, n=1, pulmonary embolism, n=1) and events requiring hospitalization (wound problems 0.8% vs. 1.7%, cardio-thoracic problems 0.8% vs. 2.5%). Slight cage subsidence was reported in one ACDF patient but there were no complaints and no reoperation was required.

Three RCTs (2 of which were rated high risk of bias) reported that no serious adverse events occurred for any patients.32–34 One RCT (N=72) that compared ACD and ACDF to PCF reported zero deaths.34 One propensity score matched NRSI (N=46,598) reported higher 30-day mortality with ACDF versus PCF (MD 1 event per 10,000 cases, 95% CI 0.0 to 2.0 per 10,000 cases, p=0.012).39 Although the MD is statistically significant, it is small, corresponding to between 0 and 2 additional deaths per 10,000 cases. Given that administrative data are subject to misclassification and potential for inadequate adjustment for confounders, this finding should be interpreted cautiously.

Neurologic deficits were reported inconsistently across studies. In one RCT (N=243) there was no difference between ACDF and PCF recipients in new radicular symptoms (3.2% vs. 0.8%, RR 3.84, 95% CI 0.43 to 33.85) or in persistent radicular symptoms (1.6% vs. 6.7%, RR 0.24, 95% CI 0.05 to 1.11); estimates are imprecise.35 One RCT (N=72) found no difference in anterior versus posterior approaches for new weakness (8% vs. 14%, RR 0.59, 95% CI 0.14 to 2.40, p=0.46) or new numbness (6% vs. 9%, RR 0.66, 95% CI 0.12 to 3.68, p=0.63).34 The other two RCTs reported specific neurologic deficits: in one small trial (N=30) no patients in either group developed Horner’s syndrome;32 the other trial (N=175) reported that no patients experienced damage to myelin resulting in paralysis of any degree.33 One NRSI (N=70) reported that one patient who underwent PCF experienced C6 nerve injury, but did not provide data for patients who underwent ACDF.36 Central nervous system complications at 30 days postoperatively were similar between anterior and posterior procedures in a large NRSI (N=46,598, MD 4 per 10,000, 95% CI −14 to 22 per 10,000).39

Dysphagia was reported inconsistently across studies. One RCT (N=243) reported one case of unresolved dysphagia at 12 months in the ACDF group.35 One RCT (N=175) reported transient difficulty swallowing for three patients who underwent ACDF and no patients who underwent PCF.33 In a propensity score matched NRSI (N=46,598), ACDF was associated with higher rates of dysphagia/dysphonia at 30 days versus PCF (MD 14.5 per 1,000 cases, 95% CI 12.6 to 16.4 per 1,000, p<0.001).39 Neither study provided information on severity of dysphagia or need for intervention.

One large NRSI (N=46,598) reported that the following were rare but more common with ACDF versus PCF within 30 days after surgery: vascular injury (MD 2 per 10,000 cases, 95% CI 1 to 3 per 10,000 cases, p=0.001), cerebrospinal fluid leak (MD 2 per 10,000 cases, 95% CI 1 to 3 per 10,000 cases, p=0.002), and deep venous thrombosis (MD 9 per 10,000 cases, 95% CI 2 to 16 per 10,000 cases, p=0.01). There were no differences between anterior and posterior approaches for pulmonary embolism (MD 2 per 10,000 cases, 95% CI −9 to 12 per 10,000 cases).39

3.7. Key Question 6: In patients with cervical degenerative disease, what are the comparative effectiveness and harms of posterior versus anterior surgery in patients with greater than or equal to three level disease?

3.7.1. Key Findings

  • There was low-strength evidence of no difference in neck pain, neurologic function, and general function in the intermediate term (12 to 15 months) for ACDF versus posterior cervical decompression and fusion (PCDF) or laminoplasty for three or more levels (SOE: Low).
  • The evidence for fusion, neck pain (short term), arm pain, neurologic function (short term) and quality of life was inadequate to draw conclusions (SOE: Insufficient).
  • There was inadequate evidence to draw conclusions on reoperation rates between ACDF and posterior procedures (SOE: Insufficient).
  • There was low-strength evidence that mortality and severe dysphagia did not differ between ACDF and laminoplasty or PCDF (SOE: Low).
  • Rates of new neurologic complications and serious adverse events were inconsistently reported across studies and rare in general; there was low-strength evidence that posterior approaches were more commonly associated with a moderate to large increase in the odds of experiencing a neurologic adverse event and serious adverse event compared with ACDF (SOE: Low).

3.7.2. Description of Included Studies

One RCT40 and nine NRSIs41–49 compared anterior (i.e., ACDF) versus posterior surgery (i.e., laminoplasty, PCDF) at three or more levels for treatment of CDD (Appendixes C and D).

The RCT (N=34)40 compared ACDF with posterior laminoplasty for participants with cervical spondylotic myelopathy (CSM) (71%) or ossification of the posterior longitudinal ligament (OPLL) (29%) involving three (71%) or four (29%) levels. Fewer participants randomized to ACDF were diagnosed with OPLL (24% vs. 35%), had four-level disease (18% vs. 41%) or were smokers (12% vs. 41%). Mean participant age was 62 years and 26 percent were female.40 Race/ethnicity was not reported. Average followup time was 41 months. This trial was conducted in China and was rated high risk of bias.

Across the nine NRSIs, one prospective44 and eight retrospective,41–43,45–49 sample sizes ranged from 245 to 13,884 (total N=41,982). The average study patient age was 61 years (range 54 to 63 years) and 43 percent were female (range 31% to 52%). Three studies reported race/ethnicity (White: range 65.5% to 82.3%; Black: 12.3% to 17.0%; Hispanic: 0.5%; Other: 17.7% to 19.1%).41,47,48 The anterior approach was ACDF (with or without corpectomy) in all nine studies41–49 and also included anterior cervical corpectomy and fusion in one study.43 The posterior approach was PCDF in six studies,41,42,44–46,48 laminectomy and fusion in two studies,43,47 and laminoplasty in two studies.47,49 Two studies included three treatment groups: one with two anterior arms43 and one with two posterior arms.47 The number of involved levels varied across the studies but most included three to five levels; one study included only three levels48 and another only four levels.45 One NRSI was rated low risk of bias46 and the remainder were rated moderate risk of bias.41–45,47–49 Given the potential for confounding by indication and differences in patient population between those receiving posterior versus anterior procedures, particularly in the NRSIs, results should be interpreted cautiously.

Evidence was insufficient for fusion, pain (short and long term), neurologic function (short term), quality of life, and reoperation based on a combination of two or more of the following: high risk of bias, inconsistent findings, and lack of precision (Appendix G).

3.7.3. Detailed Analysis

3.7.3.1. Fusion

There was inadequate evidence to determine the benefits and harms of anterior versus posterior surgical approaches on fusion in participants with three or more level disease (SOE: Insufficient).

One retrospective NRSI that used propensity score matching (N=12,248) found that PCDF was associated with substantially higher odds of pseudarthrosis at 12 months compared with ACDF (odds ratio [OR] 2.43, 95% CI 1.96 to 3.01) at three levels.48 The RCT did not report fusion.

3.7.3.2. Pain

There was low-strength evidence of no difference in neck pain in the intermediate term (SOE: Low); there was inadequate evidence for neck pain in the short term and arm pain in the intermediate term in participants with three or more level disease (SOE: Insufficient).

One RCT (N=32) rated high risk of bias reported no differences between 3- or 4-level ACDF and laminoplasty in neck pain scores (VAS, 0-10 scale) at 3 months (MD −0.10, 95% CI −0.46 to 0.26) and 6 months (MD 0, 95% CI −0.18 to 0.18) or at 12 months (MD 0.10, 95% CI −0.23 to 0.43) and 15 months (MD −0.10, 95% CI −0.44 to 0.24).40 Similarly, there were no differences between ACDF (with and without corpectomy) and PCDF at three to five levels for NRS (0-10) neck pain scores (median 2 vs. 2, adjusted OR 0.67, 95% CI 0.37 to 1.21) or arm pain scores (median 1 vs. 0.5, adjusted OR 0.99, 95% CI 0.51 to 1.93) at 12 months in one retrospective NRSI (N=245).41

3.7.3.3. Function

3.7.3.3.1. Neurologic Function

There was low-strength evidence of no difference in neurologic function between anterior and posterior approaches in participants with three or more level disease in the intermediate term (SOE: Low); there was inadequate evidence for determining the benefits and harms on neurologic function in the short term (SOE: Insufficient).

There was no difference in neurologic function at intermediate term (12 months) in one small RCT rated high risk of bias (N=32, MD 0.28, 95% CI −0.41 to 0.98, Japanese Orthopaedic Association Scale [JOA] scores, 0-18 scale)40 and two NRSIs rated moderate risk of bias (N=506, MD 0.15, 95% CI −0.29 to 0.60, I2=74.0%, mJOA scores, 0-18 scale)40,41,44 that compared ACDF with posterior laminoplasty (RCT) or PCDF (NRSIs) for 3- to 5-level disease (Figure 4) (SOE: Low). There was also no difference between groups in JOA scores short term in the RCT (N=32): 3 months (MD −0.40, 95% CI −1.76 to 0.96) and 6 months (MD 0.20, 95% CI −1.14 to 1.54).40 The pooled estimate across the two NRSIs had substantial heterogeneity (Figure 4), which may be due in part to different study designs, variables controlled for in multivariate analyses, and types of posterior procedures used. The prospective NRSI44 showed no difference between groups and included patients who underwent laminoplasty (14%) (all others had PCDF); it was unclear which baseline confounders were controlled for in this study. The retrospective NRSI41 showed a large improvement with ACDF versus PCDF approaches; multivariate logistic regression models controlled for 19 different baseline variables.
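To illustrate how a pooled estimate and an I2 value of this kind arise, the sketch below pools two study-level mean differences with a DerSimonian-Laird random-effects model. This is a simpler, swapped-in method chosen for illustration; the review's forest plots refer to a profile likelihood approach, so the numbers here are not expected to reproduce Figure 4. The two input estimates and standard errors are hypothetical, since the individual study results are not given in the text, and the function name `random_effects_pool` is illustrative.

```python
import math


def random_effects_pool(estimates, std_errors):
    """DerSimonian-Laird pooled estimate, 95% CI, and I^2 (percent).

    Illustrative only: the review itself does not describe using this
    estimator for its pooled results.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2


# Hypothetical mean differences and standard errors for two studies.
pooled, lower, upper, i2 = random_effects_pool([-0.1, 0.5], [0.2, 0.2])
print(f"pooled MD {pooled:.2f}, 95% CI {lower:.2f} to {upper:.2f}, I2 {i2:.1f}%")
```

With only two studies, a moderate gap between estimates relative to their standard errors is enough to produce a high I2, which is one reason heterogeneity statistics from small numbers of studies are interpreted with caution.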

Figure 4 is a forest plot. Mean differences were reported or calculated for two nonrandomized studies of interventions comparing anterior cervical discectomy and fusion versus posterior decompression and fusion at 12 months in patients with cervical spondylotic myelopathy, with a pooled mean difference of 0.15 (95% confidence interval −0.29 to 0.60) and an I-squared value of 74%. A mean difference was reported or calculated for one randomized controlled trial comparing anterior cervical discectomy and fusion versus laminoplasty at 12 months in patients with cervical spondylotic myelopathy or ossification of the posterior longitudinal ligament, with a mean difference of 0.28 (95% confidence interval −0.41 to 0.98).

Figure 4

Neurologic function (JOA or mJOA scores): anterior versus posterior approaches for ≥3 levels. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; CSM = cervical spondylotic myelopathy; JOA = Japanese Orthopaedic Association.

One prospective NRSI (N=264) assessed neurologic function with the Nurick score (0-5 scale) and found no difference between 3- to 5-level ACDF and posterior approaches (laminectomy and fusion [86%] or laminoplasty [14%]) in mean change from baseline to 12 months after adjusting for baseline characteristics (MD in change scores 0.19, 95% CI −0.20 to 0.58).44

3.7.3.3.2. General Function

There were no differences between anterior and posterior surgery for 3- to 5-level disease at intermediate term (12 months) for any function measure reported across two NRSIs (N=509)41,44 (SOE: Low). One prospective NRSI (N=264) compared ACDF with laminectomy and fusion (86%) or laminoplasty (14%) and reported the change in NDI scores compared with baseline (MD in change scores −0.97, 95% CI −7.15 to 5.21, scale unclear), SF-36 PCS scores (MD in change scores −1.90, 95% CI −5.30 to 1.50, 0-100 scale) and SF-36 MCS scores (MD in change scores 0.42, 95% CI −2.30 to 3.14, 0-100 scale).44 One retrospective NRSI (N=245) compared ACDF (with and without corpectomy) with PCDF and reported median NDI scores (16 vs. 17, adjusted OR 0.76, 95% CI 0.42 to 1.37).41

3.7.3.4. Quality of Life

There was inadequate evidence to determine the benefits and harms of anterior versus posterior approaches on quality of life in participants with three or more level disease (SOE: Insufficient).

One retrospective cohort study (N=245) found no difference between 3- to 5-level ACDF (with and without corpectomy) and PCDF in EQ-5D scores intermediate term at 12 months (adjusted odds ratio 1.36, 95% CI 0.76 to 2.44, referent = ACDF) after adjusting for a number of baseline variables.41

3.7.3.5. Reoperation

There was inadequate evidence to draw conclusions on reoperation rates between ACDF and posterior procedures (SOE: Insufficient).

Seven NRSIs (N=27,579) that compared ACDF with posterior procedures at three or more levels reported reoperation/revision rates.41,43,45–49 In pooled analysis at any timepoint based on longest followup (range 1 to 60 months), there were no differences between ACDF versus laminoplasty (2 NRSIs, N=3,406, 5.4% vs. 6.2%, RR 0.87, 95% CI 0.59 to 1.79, I2=0%)47,49 or PCDF (6 NRSIs, N=24,355, 10.1% vs. 11.8%, RR 0.79, 95% CI 0.47 to 1.35, I2=96.5%);41,43,45–48 however, heterogeneity was substantial for the latter comparison (Figure 5). Exclusion of one outlier study45 at 60 months that included patients with both myelopathy and radiculopathy reduced heterogeneity slightly and resulted in a moderate reduction in the likelihood of reoperation for ACDF compared with PCDF at any timepoint (1-18 months, 5 NRSIs, N=20,641, 7.4% vs. 10.4%, RR 0.59, 95% CI 0.42 to 0.95, I2=82.4%).41,43,46–48 These results were driven by two large administrative database studies.43,48 There was no difference between ACDF and PCDF at 1 to 3 months (2 NRSIs, N=736, RR 0.82, 95% CI 0.32 to 2.08, I2=0%).46,47 ACDF was associated with a higher risk of reoperation compared with PCDF (N=3,714, RR 1.44, 95% CI 1.27 to 1.62) in one study at 60 months.45 It is challenging to draw firm conclusions from these data as definitions of reoperation and revision varied or were not specified across the studies, there were differences in posterior approach used, and the pooled estimates were mainly driven by two large administrative data studies.

Figure 5 is a forest plot. Risk ratios were reported or calculated for two nonrandomized studies of interventions comparing anterior cervical discectomy and fusion versus laminoplasty at 1 and 12 months in patients with cervical spondylotic myelopathy, with a pooled risk ratio of 0.87 (95% confidence interval 0.59 to 1.79) and an I-squared value of 0%. Risk ratios were reported or calculated for six nonrandomized studies of interventions comparing anterior cervical discectomy and fusion versus posterior cervical decompression and fusion over 1 to 60 months in patients with cervical spondylotic myelopathy, radiculopathy, or ossification of the posterior longitudinal ligament, with a pooled risk ratio of 0.79 (95% confidence interval 0.47 to 1.35) and an I-squared value of 96.5%. An overall risk ratio was calculated across all eight studies with a pooled risk ratio of 0.83 (95% confidence interval 0.56 to 1.28) and an overall I-squared value of 95.2%.

Figure 5

Reoperation: anterior versus posterior approaches for ≥3 levels. ACCF = anterior cervical corpectomy and fusion; ACDF = anterior cervical discectomy and fusion; CI = confidence interval; CSM = cervical spondylotic myelopathy; OPLL = ossification of the posterior longitudinal ligament.

One large NRSI (N=12,248) that used administrative data and propensity score matching reported reoperation outcomes that could not be included in the meta-analysis.48 PCDF was associated with substantially higher odds of wound-specific revision surgery at 1 month (1.2% vs. 0.4%, OR 3.02, 95% CI 2.56 to 3.49) and moderately lower odds of additional anterior or posterior fusion at 12 months (4.3% vs. 7.0%, OR 0.60, 95% CI 0.44 to 0.76) compared with ACDF at three levels.

3.7.3.6. Harms

3.7.3.6.1. Neurologic Deficits

There was low-strength evidence that posterior approaches were more likely associated with a moderate to large increase in the odds of experiencing a neurologic adverse event compared with ACDF (SOE: Low). Reporting of neurological events varied across one RCT (N=32)40 and six NRSIs (total N=37,095, range 245 to 13,884).41–44,48,49 The RCT reported no cases of postoperative worsening of myelopathy or C5 root palsy with either 3- or 4-level ACDF or posterior laminoplasty.40 Central nervous system complications (not further defined) were rare through 90 days after ACDF (<0.7%) and posterior laminoplasty (0.9%) at three or more levels in one NRSI (N=3,042).49 Two NRSIs reported that PCDF was associated with moderately higher odds of “neurological complications” compared with ACDF at three or more levels but did not provide further details: 0.59% vs. 0.35% (adjusted OR 1.7, 95% CI 1.0 to 2.8) immediately postoperative in one study (N=13,884)42 and 1.8% vs. 1.1% (OR 1.6, 95% CI 1.08 to 2.38) at 1 month in another (N=7,412).43 Two other NRSIs reported no difference between ACDF and PCDF at three to five levels in new neurological deficits (N=264, 4.1% vs. 3.2%, RR 1.31, 95% CI 0.35 to 4.95)44 or new motor deficits (N=245, 2% vs. 0%)41 at 12 months. One large NRSI (N=12,248) reported no difference between PCDF and ACDF in the incidence of postoperative coma (0.4% vs. 0.6%, OR 1.26, 95% CI 0.75 to 1.77).48

3.7.3.6.2. Mortality

There was low-strength evidence that mortality did not differ between ACDF and laminoplasty or PCDF (SOE: Low).

Three NRSIs (total N=15,057, range 546 to 13,884) that compared anterior with posterior approaches at three or more levels found no difference in short-term mortality after ACDF versus posterior laminoplasty at 1 month (1 NRSI, N=364, 0% vs. 0.5%, RR 0.33, 95% CI 0.01 to 8.13)47 and ACDF versus PCDF at hospital discharge to 1 month (3 NRSIs, N=14,875, 0.3% vs. 0.3%, RR 0.96, 95% CI 0.25 to 1.81, I2=17.8%)42,46,47 (Figure 6). One NRSI (N=12,248) reported no deaths in either arm (ACDF vs. PCDF) and was unable to be included in the pooled analysis.48

Figure 6 is a forest plot. A risk ratio was reported or calculated for one nonrandomized study of interventions comparing anterior cervical discectomy and fusion versus laminoplasty at 1 month in patients with cervical spondylotic myelopathy, with a risk ratio of 0.33 (95% confidence interval 0.01 to 8.13) and an I-squared value of 0%. Risk ratios were reported or calculated for three nonrandomized studies of interventions comparing anterior cervical discectomy and fusion versus posterior cervical decompression and fusion postsurgery to 1 month in patients with cervical spondylotic myelopathy or ossification of the posterior longitudinal ligament, with a pooled risk ratio of 0.96 (95% confidence interval 0.25 to 1.81) and an I-squared value of 17.8%. An overall risk ratio was calculated across all four studies with a pooled risk ratio of 0.93 (95% confidence interval 0.25 to 1.68) and an I-squared value of 0%.

Figure 6

Mortality: anterior versus posterior approaches for ≥3 levels. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; CSM = cervical spondylotic myelopathy; OPLL = ossification of the posterior longitudinal ligament; PCDF = posterior cervical decompression and fusion.

3.7.3.6.3. Dysphagia

There was low-strength evidence that the likelihood of experiencing severe dysphagia did not differ between ACDF and laminoplasty or PCDF (SOE: Low).

Severe dysphagia was rare across two NRSIs that compared ACDF with PCDF or posterior laminoplasty. There were two cases (1%) requiring a nasogastric tube in one study (N=245)41 and one case (0.5%) requiring an unplanned readmission 11 days post surgery in the other (N=364);47 all three cases occurred in the ACDF arms.

One RCT (N=32)40 and seven NRSIs (total N=41,172, range 245 to 13,884)41–43,45,46,48,49 also reported dysphagia but did not report the severity. Frequencies ranged from 2.7 to 14.0 percent after ACDF and from 0 to 3.6 percent after PCDF across six NRSIs (N=38,130),41–43,45,46,48 most of which reported a substantial to moderate decrease in the odds/risk of dysphagia with PCDF (OR range 0.20 to 0.61). Frequencies ranged from <0.7 to 5.9 percent versus 0 to <0.7 percent in the ACDF versus laminoplasty arms, respectively, across one small RCT (N=32)40 and one large NRSI (N=3,042), with no differences between treatments.49

3.7.3.6.4. Serious Adverse Events

There was low-strength evidence that posterior approaches were more likely associated with a moderate to large increase in the odds of experiencing a serious adverse event compared with ACDF (SOE: Low).

One RCT (N=32) reported that intraoperative dural tear occurred in 5.9 percent of ACDF versus 11.8 percent of laminoplasty patients (RR 0.50, 95% CI 0.05 to 5.01) and that there were no cases of instrumentation failure or malposition, infection or hematoma.40
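The width of the interval around the dural tear estimate reflects how few events occurred. As a minimal sketch, the risk ratio and a Wald-type 95% confidence interval on the log scale can be computed from event counts. The counts used here (1 of 17 versus 2 of 17) are an assumption inferred from the reported percentages and the 34 randomized participants; the review does not state the counts or which interval method was used. The function name `risk_ratio_ci` is illustrative.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio with a Wald confidence interval on the log scale.

    Requires at least one event in each arm; zero-event arms need a
    continuity correction, which is not handled in this sketch.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Assumed counts: 1/17 (5.9%) in the ACDF arm, 2/17 (11.8%) in the comparison arm.
rr, lower, upper = risk_ratio_ci(1, 17, 2, 17)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
# prints: RR 0.50, 95% CI 0.05 to 5.01
```

Under these assumed counts the result matches the reported estimate, and the hundredfold span between the bounds shows why a single additional event in either arm would change the conclusion.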

Across the NRSIs, reporting of serious adverse events varied; adverse events generally occurred more often with posterior approaches versus ACDF.

Thromboembolic events were rare across eight NRSIs (total N=41,718, range 245 to 13,884) with followup immediately postoperative to 12 months.41–43,45–49 The frequency of deep vein thrombosis (DVT) or pulmonary embolism ranged from 0 to 2.3 percent (ACDF) versus 0 to 4.3 percent (PCDF or posterior laminoplasty). Four of the studies (N=37,258) reported that posterior approaches were associated with moderate to large increases in the odds of experiencing a thromboembolic event compared with ACDF (range of ORs 1.75 to 3.7).42,43,45,48

Stroke/cerebrovascular events occurred variably across three NRSIs with short-term followup (1 to 3 months); one study (N=546) reported no events in either arm (ACDF vs. PCDF or posterior laminoplasty),47 one study (N=627) reported more events after ACDF (1.8% vs. 0% PCDF, p=0.016),46 while the third found that PCDF was associated with a large increase in the odds of stroke compared with ACDF (N=12,248, 4.2% vs. 2.5%, OR 1.68, 95% CI 1.48 to 1.89).48

Sepsis was rare across three NRSIs (total N=7,302, range 546 to 3,714).45,47,49 One study reported substantially higher odds of having sepsis within 3 months after PCDF compared with ACDF (N=3,714, 2.5% vs. 0.7%, adjusted OR 3.56, 95% CI 1.96 to 6.91),45 while the other two studies (N=3,588) reported similar rates between groups (ACDF, range <0.7% to 1.1% vs. PCDF/posterior laminoplasty, range <0.7% to 1.7%).47,49

Surgical site infection was reported by four NRSIs. Three studies (N=22,702)43,48,49 reported that posterior approaches (PCDF or laminoplasty) were associated with a large increase in the odds of surgical site infection compared with ACDF at 1 to 3 months (frequency range 2.4% to 4.7% vs. 0.8% to 1.0%, OR range 3.1 to 3.7) and the fourth (N=245) found no difference between groups (1% each).41

Wound dehiscence was infrequent across four NRSIs, two of which reported that PCDF was associated with a substantial increase in the odds of experiencing this complication compared with ACDF (N=19,660, frequency range 1.3% to 2.7% vs. 0.1% to 0.5%, range of ORs 5.6 to 10.8)43,48 and two of which found no difference between groups (1% each, N=245, 1 NRSI)41 and (0% each, N=264, 1 NRSI).44

Dural tear/durotomy occurred more often with ACDF versus PCDF in one study (N=627, 9.4% vs. 3.2%, RR 3.02, 95% CI 1.50 to 6.10)46 while no events were reported in either group in another study (N=264).44

One NRSI found that PCDF was associated with a large increase in the odds of having any severe adverse event through 3 months compared with ACDF (N=3,714, 13% vs. 6.1%, OR 2.31, 95% CI 1.83 to 2.93).45
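As a rough plausibility check, a crude odds ratio can be computed directly from the two reported percentages. The sketch below is illustrative: the function name is hypothetical, the inputs are the rounded percentages from the text, and the study's own estimate may reflect adjustment or unrounded counts, so only approximate agreement should be expected.

```python
def odds_ratio_from_proportions(p_exposed, p_reference):
    """Crude (unadjusted) odds ratio from two event proportions."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_reference = p_reference / (1 - p_reference)
    return odds_exposed / odds_reference

# Reported frequencies: 13% with PCDF versus 6.1% with ACDF
crude_or = odds_ratio_from_proportions(0.13, 0.061)
print(f"Crude OR {crude_or:.2f}")
```

The crude value is about 2.30, close to the reported OR of 2.31.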

A variety of other serious adverse events were reported across five NRSIs (total N=21,813, range 546 to 13,884);42,45–47,49 event rates ranged from 0.04 to 4.5 percent in the ACDF arms and from 0 to 7.7 percent in the posterior arms (PCDF or laminoplasty) and included kidney injury (4 studies),45–47,49 cardiac complications (4 studies),42,46,47,49 transfusion (3 studies),45–47 respiratory complications (3 studies),42,46,49 and arterial injury and hardware/instrumentation failure or malposition (1 study).42 Excluding perioperative blood transfusion in one study, which had the highest frequency of events across all these complications (N=627, 4.5% with ACDF vs. 7.7% with a posterior approach),46 the range across treatment arms was 0 to 3.7 percent (ACDF) versus 0.06 to 3.6 percent (posterior approach). There were no cases of myocardial infarction or vocal cord paralysis in one NRSI (N=245).41

3.8. Key Question 7: In patients with cervical spondylotic myelopathy due to cervical degenerative disease, what are the comparative effectiveness and harms of cervical laminectomy and fusion compared to cervical laminoplasty?

3.8.1. Key Findings

  • Evidence was inadequate to determine the effect of laminectomy and fusion versus laminoplasty on neck, shoulder, or arm pain (SOE: Insufficient).
  • There was moderate-strength evidence of little difference between laminectomy and fusion versus laminoplasty on neurologic function (SOE: Moderate) and low-strength evidence of no difference between laminectomy and fusion versus laminoplasty on general function (SOE: Low).
  • There was moderate-strength evidence of no difference in reoperation rates between laminectomy and fusion compared with laminoplasty (SOE: Moderate).
  • There was low-strength evidence of fewer complications with laminoplasty compared with laminectomy and fusion (SOE: Low).

3.8.2. Description of Included Studies

Two RCTs (N=46)50,51 and six NRSIs (N=15,523)52–57 compared cervical laminectomy and fusion with cervical laminoplasty (Appendix C). The followup duration was 1 year in both of the RCTs and ranged from 1 year to 5 years in the nonrandomized studies. Trials were conducted in the United States and Egypt, with NRSIs conducted in the United States (3 studies), Japan, China, and a multinational setting.

The mean age of participants was 58 years in one trial and not reported in the other (most participants in the second trial ranged from 50 to 59 years); mean ages in the nonrandomized studies ranged from 54 to 64 years. The average proportion of females in the trials was 30 and 58 percent; the proportion of females in the NRSI studies ranged from 21 to 55 percent. Race and ethnicity were not reported in any of the studies. One trial enrolled patients with at least 3 levels of spinal cord compression,50 while the other did not report the number of disease levels.51 Two nonrandomized studies enrolled patients with 3 or more levels of spinal cord compression,54,57 whereas the number of disease levels was not specified in the other NRSI studies.

One RCT was rated high risk of bias50 and the other was rated as moderate risk of bias.51 All of the observational studies were rated moderate risk of bias (Appendix D). The evidence comparing laminectomy and fusion with laminoplasty for neck, shoulder, and arm pain was rated insufficient due to limited and conflicting evidence (Appendix G).

3.8.3. Detailed Analysis

3.8.3.1. Fusion

No study reported fusion outcomes in the laminectomy fusion arm only.

3.8.3.2. Pain

There was inadequate evidence to determine the benefits and harms of laminectomy and fusion compared with laminoplasty on neck, shoulder, or arm pain (SOE: Insufficient).

One RCT (N=30) found a moderate benefit in neck pain with laminectomy and fusion compared with laminoplasty at 1 year (MD −1.33, p<0.05) but no difference in limb pain (MD 0.4, p>0.05).50 The other RCT (N=16) reported improvement in neck and arm pain from baseline only in patients who underwent laminoplasty (surgical approaches not directly compared, numeric values not reported, p<0.05, both outcomes).51

Among the nonrandomized studies assessing neck52,54 or shoulder52 pain, two (N=148) reported no differences in VAS scores between laminectomy and fusion and laminoplasty at 1 or 3 years.52,54 Another observational study (N=121) reported no differences in improved pain (74% vs. 60%; p=0.141) for posterior laminectomy and fusion versus laminoplasty.57

3.8.3.3. Function

3.8.3.3.1. Neurologic Function

There was moderate-strength evidence of no difference between laminectomy and fusion versus laminoplasty on neurologic function (SOE: Moderate).

Two head-to-head RCTs (N=46) assessed neurologic function with the mJOA and the Nurick Classification Scale for Spinal Cord Compression (i.e., Nurick’s grade 0 to 5) at 1 year post-operative.50,51 Pooled analysis of the two trials found no difference in function between cervical laminectomy and fusion versus laminoplasty using the mJOA (N=46, MD −0.03, 95% CI −0.68 to 0.74, I2=76%).50,51 One trial reported no significant difference between laminectomy and fusion compared with laminoplasty in Nurick grade (1.40 vs. 1.67; p=0.23),50 while the other trial reported a significant pre-post difference for laminoplasty only (numeric values not reported; p<0.05).51

Four nonrandomized studies reported neurologic function using the mJOA or JOA score; three reported no difference between laminectomy and fusion versus laminoplasty52,54,57 and one reported a significant benefit of laminoplasty over laminectomy and fusion (mean mJOA at 2 years: 3.49, 95% CI 2.84 to 4.13 vs. 2.39, 95% CI 1.91 to 2.86; p=0.0069).53 However, this study reported no significant difference in Nurick’s grade at 2 years (mean 1.57, 95% CI 1.23 to 1.90 vs. 1.18, 95% CI 0.92 to 1.44; p=0.077).

3.8.3.3.2. General Function

There was low-strength evidence of little difference between laminectomy and fusion versus laminoplasty on general function (SOE: Low).

Neck disability scores on the NDI were not different between laminectomy and fusion versus laminoplasty 1 year postoperatively (1 RCT, N=30, MD 3.86, p=0.2)50 and only improved with laminoplasty in the other trial (N=16, surgical approaches not directly compared, numeric values not reported, p=0.05).51 The same trial (N=16) reported improvement from baseline on the SF-36 with laminoplasty only (numeric values not reported, p<0.05).51,52,54 Two NRSIs reported no differences on the NDI,52,53 and three reported no differences between surgical approaches in SF-12 or SF-36 PCS or MCS scores.52–54 Another observational study reported no differences in improved gait (71% vs. 68%; p=0.674) as assessed on a 5-point NRS.57

3.8.3.4. Quality of Life

No study reported quality of life outcomes.

3.8.3.5. Harms

There was moderate-strength evidence of no difference between laminectomy and fusion compared with laminoplasty in reoperation rates (SOE: Moderate) and low-strength evidence of fewer complications overall with laminoplasty compared with laminectomy and fusion (SOE: Low).

Both trials reported no significant differences in harms, though event rates were low.50,51 Likewise, four NRSIs (N=582) found no differences in infection, device failure, or reoperation rates.52–54,57 A large database study (PearlDiver Mariner Database, N=11,860; matched sample size not reported)55 reported similar revision rates for laminoplasty and laminectomy with fusion (5.63% vs. 5.90%, p=0.62) at 1 year but fewer surgical site infections (matched OR 0.60; p=0.002), wound complications (matched OR 0.67, p=0.002) and dysphagia (matched OR 0.77; p=0.01) with laminoplasty compared with laminectomy and fusion.55 Also reported in this study were reduced rates of spinal cord injury (matched OR 0.6, p=0.02), limb paralysis (matched OR 0.67, p<0.001), respiratory failure (matched OR 0.74, p=0.01), renal failure (matched OR 0.84, p=0.04), and sepsis (matched OR 0.85, p=0.04) with laminoplasty versus laminectomy and fusion. No complication was reported as more likely with laminoplasty. An earlier propensity-matched analysis of patients from this same database (N=928) found lower revision rates at 1 year with laminoplasty versus laminectomy and fusion (2.4% vs. 7.1%; p<0.001).56 The dissimilar findings may be due to a larger sample size in the later study (an assumption, as its matched sample size was not reported) or to changes in surgical methods and/or surgeon skill over time. Two additional NRSIs reported no differences in dysphagia between groups.53,57

3.9. Key Question 8: In patients with cervical spondylotic radiculopathy or myelopathy at one or two levels, what are the comparative effectiveness and harms of cervical arthroplasty compared to anterior cervical discectomy and fusion?

3.9.1. Key Findings

  • In participants receiving single-level interventions:
    • There was moderate-strength evidence of no difference between cervical arthroplasty and ACDF in likelihood of success (response) for any pain or function measure at short, intermediate, and long term (SOE: Moderate).
    • There was also moderate-strength evidence of no differences between cervical arthroplasty and ACDF in pain or function at short, intermediate, or long term: neck or arm pain, neurologic status or general function (SOE: Moderate).
    • There was high-strength evidence that cervical arthroplasty was associated with substantially lower likelihood of reoperation at the index level versus ACDF (SOE: High).
    • There was low-strength evidence that cervical arthroplasty was associated with slightly lower likelihood of any serious adverse event at short term versus ACDF, but there were no differences at times >24 months and serious adverse events were variably defined (SOE: Low for all times).
    • There was low-strength evidence of no differences in neurological events or deficits between cervical arthroplasty and ACDF at short, intermediate, or long term (SOE: Low).
    • There was inadequate evidence on the likelihood of mortality between cervical arthroplasty and ACDF (SOE: Insufficient).
  • In participants receiving 2-level interventions:
    • There was moderate-strength evidence of no differences between cervical arthroplasty and ACDF on pain (neck or arm), neurologic function and general function at short, intermediate, and long term (SOE: Moderate).
    • Reoperation at the index level was substantially less likely with cervical arthroplasty at all times reported (24 to >60 months) (SOE: Low).
    • Cervical arthroplasty was associated with slightly lower likelihood of serious adverse events compared with ACDF at 24 months, but there was no difference between procedures at 120 months for World Health Organization (WHO) Grade 3 or 4 (scale 0-4, 4 most serious) adverse events (SOE: Low).
    • Evidence for neurological deficits or events and for mortality was inadequate to draw conclusions (SOE: Insufficient).
  • In participants receiving 1-, 2- or 3-level interventions:
    • There was no difference between cervical arthroplasty and ACDF in VAS neck pain scores at intermediate term (SOE: Low).
    • Evidence was inadequate to draw conclusions for neurologic and general function and harms (SOE: Insufficient).

3.9.2. Description of Included Studies

Twenty-two RCTs in 45 publications (N=4,120) compared cervical arthroplasty with ACDF (Appendix C).58–102 The average followup duration was 56 months (range 6 to 108 months). Eight trials each were conducted in the United States65,72,75,76,86,87,93,98 and in China;61–63,79,91,99–101 two trials in Germany;89,90 and one trial each in India,74 the Netherlands,103 Spain,64 and Turkey.82

The average study mean age of participants was 45 years (range 37 to 50 years); the average proportion of females in studies was 47 percent (range 20% to 63%). Five trials reported race, four enrolling mostly White participants (range 89% to 93%)72,76,93,98 and the other enrolling Han (Chinese) participants.63 One trial reported ethnicity, enrolling mostly non-Hispanic participants (94%).65

Studies enrolled participants with clinical and/or radiological evidence of cervical radiculopathy and/or myelopathy, although only three trials reported baseline values.64,74,89 Participants had 1-level disease in 15 trials (N=3,036),61,75,76,79,82,86,87,89–91,93,98,100,101,103 2-level disease in four trials (N=872),63,65,72,99 and mixed-level (1, 2 or 3) disease in three trials (N=196).62,64,74 Of the single-level trials, six (in 23 publications) were US Food and Drug Administration (FDA) Investigational Device Exemption (IDE) trials58–60,67,68,70,75–78,80,81,84–87,92,93,95–98,102 and of the 2-level trials, two (in 9 publications) were IDE trials.65,66,71–73,80,83,94,95

Six trials were rated low risk of bias,65,76,79,86,87,93 six trials were rated high risk of bias,61,64,82,90,91,101 and the remainder were rated moderate risk of bias62,63,72,74,75,89,98–100,103 (Appendix D). Methodological limitations included unclear randomization techniques, unclear blinding, and high attrition.

Two prospective, multicenter NRSIs (N=349 and N=352) of recently completed FDA IDE trials compared newer cervical arthroplasty devices (M6-C and Simplify discs) with historic ACDF controls (Appendix C).104,105 Propensity score matching was done to facilitate baseline comparability between groups. Followup was 24 months in both studies. One study enrolled participants with clinical and radiological evidence of cervical radiculopathy with or without myelopathy at 1 level105 and the other study enrolled participants with cervical radiculopathy and/or myelopathy at 2 levels.104 The study mean ages of participants were 45 years and 48 years and the proportions of females were 50 and 52 percent. Race/ethnicity was not reported by either study. The study mean body mass indexes were 27.5 and 28.9. Both studies were conducted in the United States and were rated moderate risk of bias (Appendix D).

Eight non-IDE NRSIs were included for the evaluation of harms only and included seven large database/registry studies106–112 and one post-hoc analysis of an FDA IDE trial113 (Appendix C). Sample sizes ranged from 342 to 143,060 (total N=206,887). The average study mean age of patients was 50 years (range 46 to 54 years) and the proportion of females was 51 percent (range 50% to 52%). Across three studies most patients were White (82%; range 81% to 85%); one study reported 94 percent of patients were non-Hispanic113 and four studies did not report race/ethnicity.109–112 Two studies107,113 enrolled patients with radiculopathy and/or myelopathy; three studies106,111,112 specifically excluded patients with myelopathy and the remaining three studies108–110 only stated that patients had CDD. Followup ranged from 30 days to 84 months. One study took place in Germany,110 and all others in the United States. Four studies were rated moderate risk of bias107,111–113 and four high risk of bias106,108–110 (Appendix D).

For the FDA IDE trials, an attempt was made to reconcile conflicting information among multiple reports presenting the same data and, when necessary, we used the data from the FDA Summary of Safety and Effectiveness Data (SSED): 1-level114–120 and 2-level indications.121–123 For measures of success, we focused on the FDA required definition and reported alternative definitions as applicable. Only FDA approved devices are included for this Key Question.

In the results below for benefits, we report outcomes according to the following timeframes: short term (<12 months), intermediate term (12 to 60 months) and long term (>60 months).
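The timeframe definitions above amount to a simple classification rule for followup duration. The sketch below expresses that rule; the function name is hypothetical and is shown only to make the boundary handling explicit (12 and 60 months both fall in the intermediate term).

```python
def followup_category(months):
    """Classify a followup duration (in months) using the report's timeframes."""
    if months < 0:
        raise ValueError("followup duration cannot be negative")
    if months < 12:
        return "short term"          # <12 months
    if months <= 60:
        return "intermediate term"   # 12 to 60 months
    return "long term"               # >60 months

for m in (6, 12, 24, 60, 84):
    print(m, followup_category(m))
```

For example, the 24-month followup of the IDE NRSIs is intermediate term, and an 84-month report is long term.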

Evidence was insufficient for mortality (all levels), neurologic deficit/events (2-levels and mixed 1-, 2- or 3-levels), and neurologic function, general function, reoperation and serious adverse events (mixed 1-, 2- or 3-levels) based on a combination of two or more of the following: high risk of bias, inconsistent findings, and lack of precision (Appendix G).

3.9.3. Detailed Analysis

3.9.3.1. Single-Level Cervical Arthroplasty Versus ACDF

Fifteen trials (N=3,036) (in 33 publications) compared single-level cervical arthroplasty and ACDF, including six FDA IDE trials (in 23 publications)58–60,67,68,70,75–78,80,81,84–87,92,93,95–98,102 and nine non-IDE trials (in 10 publications),61,79,82,89–91,100,101,103 as did one FDA IDE NRSI.105 Six additional NRSIs compared harms for single-level cervical arthroplasty and ACDF.106–110,113

3.9.3.1.1. Fusion

Seven RCTs (across 15 publications) (N=2,382) that compared single-level cervical arthroplasty and ACDF reported fusion success in their ACDF arms.59,60,68,75–78,86,87,92–95,98,101 One trial (N=56) reported short-term fusion success in 89.3 percent of participants,101 seven RCTs (N=853) reported intermediate-term fusion success in 93.9 percent (range 89.1% to 98.2%) of participants59,68,75,78,92,98,101 and two RCTs (N=181) reported long-term fusion success in 96.5 percent (range 95.5% to 96.9%) of participants.60,95 One RCT reported successful fusion in the cervical arthroplasty arm as well, but this may be attributed to participant crossover after initial randomization.92,93

3.9.3.1.2. Pain
3.9.3.1.2.1. Neck Pain

There was moderate-strength evidence of no differences between cervical arthroplasty and ACDF in neck pain or likelihood of success (response) for neck pain at short, intermediate, and long-term (SOE: Moderate).

Four RCTs (N=1,230) (in 5 publications)92,97,114,118,119 that compared single-level cervical arthroplasty versus ACDF reported neck pain success (response) defined as postoperative ≥20-point improvement on VAS. There were no differences in likelihood of neck pain success between cervical arthroplasty and ACDF at short term (2 RCTs, N=482, 79% vs. 75.0%, RR 1.04, 95% CI 0.93 to 1.17, I2=0%),114,119 intermediate term (4 RCTs, N=948, 76.4% vs. 74.1%, RR 1.03, 95% CI 0.95 to 1.12, I2=0%)92,114,118,119 or long term (1 RCT, N=232, 85.7% vs. 78.3%, RR 1.09, 95% CI 0.97 to 1.24)97 (Figure 7). In one prospective NRSI IDE study using propensity-matched historical controls, more cervical arthroplasty participants had ≥20-point improvement on VAS neck pain versus ACDF at 24 months (N=301, 91.2% vs. 77.9%, p=0.013).120

One of the above trials reported neck pain success at 84 months using an alternative definition, a ≥10-point improvement on VAS, and was not included in the meta-analysis at long term; there was no difference between cervical arthroplasty and ACDF using this criterion (N=191, 87.5% vs. 83.3%, RR 1.05, 95% CI 0.93 to 1.20).95

Figure 7 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.04 (95% confidence interval 0.93 to 1.17) and an overall I-squared value of 0%. Risk ratios were reported or calculated for four intermediate-term trials, with a pooled risk ratio of 1.03 (95% confidence interval 0.95 to 1.12) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.09 (95% confidence interval 0.97 to 1.24) and an overall I-squared value of 0%.

Figure 7

Neck pain success (≥20-point improvement on VAS): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).
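To see why a pooled risk ratio such as 1.04 (95% CI 0.93 to 1.17) is read as no difference, the standard error can be back-calculated from the interval and used for an approximate z-test. This is a minimal sketch that assumes the interval is roughly symmetric on the log scale; the pooling method used in the review may not produce exactly symmetric intervals, and the function names are hypothetical.

```python
import math

def se_from_ratio_ci(lower, upper, z=1.96):
    """Approximate SE of log(ratio) from a 95% CI, assuming log-scale symmetry."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def two_sided_p(estimate, lower, upper):
    """Approximate two-sided p-value for a ratio estimate against a ratio of 1."""
    se = se_from_ratio_ci(lower, upper)
    z_stat = math.log(estimate) / se
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z_stat) / math.sqrt(2))

# Pooled short-term neck pain success as reported: RR 1.04 (95% CI 0.93 to 1.17)
se = se_from_ratio_ci(0.93, 1.17)
p = two_sided_p(1.04, 0.93, 1.17)
print(f"SE(log RR) = {se:.3f}, p = {p:.2f}")
```

The approximate p-value is far above 0.05, consistent with an interval that spans 1.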

Eleven RCTs (N=2,696) (in 19 publications)60,61,67,69,75,78,79,81,84,86,88,89,92,95–98,100,116 contributed to evaluation of mean differences in neck pain scores at various times. There were no differences between cervical arthroplasty and ACDF in VAS neck pain scores (0-100 scale) as estimates were below the threshold for a small effect at short term (8 RCTs, N=1,789, MD −3.02, 95% CI −5.53 to −0.40, I2=15.5%),61,69,75,78,86,89,98,116 intermediate term (11 RCTs, N=1,898, MD −3.39, 95% CI −6.14 to −1.23, I2=63.4%),60,61,67,69,78,79,88,92,96,98,100 and long term (5 RCTs, N=1,195, MD −4.77, 95% CI −7.62 to −1.72, I2=0%)60,81,84,95,97 (Figure 8). Exclusion of one small (N=60) trial rated high risk of bias61 did not substantially change effect estimates but did slightly increase heterogeneity in the short term (7 RCTs, N=1,729, MD −3.11, 95% CI −5.92 to −0.15, I2=26.6%)69,75,78,86,89,98,116 and intermediate term (10 RCTs, N=1,838, MD −3.55, 95% CI −6.48 to −1.30, I2=67.1%).60,67,69,78,79,88,92,96,98,100 Exclusion of one trial69 that did not specify if neck or arm pain was evaluated also did not substantially change effect estimates at short term (7 RCTs, N=1,714, MD −3.24, 95% CI −5.95 to −0.77, I2=12.2%)61,75,78,86,89,98,116 or intermediate term (10 RCTs, N=1,879, MD −3.51, 95% CI −6.35 to −1.33, I2=66.4%).60,61,67,78,79,88,92,96,98,100 Although funnel plot analysis and Egger’s test (p=0.035) may suggest publication/small study bias for neck pain scores at intermediate term, most trials found no effect, leading to less concern regarding publication bias (Appendix F, Figure F-4).

Figure 8 is a forest plot. Mean differences were reported or calculated for eight short-term trials, with a pooled mean difference of −3.02 (95% confidence interval −5.53 to −0.40) and an overall I-squared value of 15.5%. Mean differences were reported or calculated for 11 intermediate-term trials, with a pooled mean difference of −3.39 (95% confidence interval −6.14 to −1.23) and an overall I-squared value of 63.4%. Mean differences were reported or calculated for five long-term trials, with a pooled mean difference of −4.77 (95% confidence interval −7.62 to −1.76) and an overall I-squared value of 0%.

Figure 8

Neck pain VAS scores (0-100 scale): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.1.2.2. Arm Pain

There was moderate-strength evidence of no differences between cervical arthroplasty and ACDF in arm pain or likelihood of success (response) for arm pain at short, intermediate, and long-term (SOE: Moderate).

Four RCTs (N=1,148) (in 5 publications)92,97,114,118,119 that compared cervical arthroplasty with ACDF for single-level disease reported arm pain success (response) defined as postoperative ≥20-point improvement on VAS (0–100). Some studies reported arm pain success in both arms. Conservative estimates, using the lower risk ratio for studies reporting VAS for both arms, revealed no difference in likelihood of arm pain success between cervical arthroplasty and ACDF at short term (2 RCTs, N=482, 49.5% vs. 46.6%, RR 1.02, 95% CI 0.81 to 1.29, I2=0%),114,119 intermediate term (4 RCTs, N=948, 61.1% vs. 62.6%, RR 1.00, 95% CI 0.85 to 1.14, I2=37.9%),92,114,118,119 or long term (1 RCT, N=232, 85.7% vs. 75.5%, RR 1.14, 95% CI 1.00 to 1.29, I2=0%)97 (Figure 9). Estimates based on higher risk ratios for studies reporting VAS for both arms were similar and led to the same conclusion of no difference between cervical arthroplasty and ACDF for all time points. In one prospective NRSI IDE study using propensity-matched historical controls, more cervical arthroplasty participants experienced ≥20-point improvement on VAS arm pain (worst side) versus ACDF at 24 months (N=301, 90.5% vs. 79.9%, p=0.001).120

Figure 9 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.02 (95% confidence interval 0.81 to 1.29) and an overall I-squared value of 0%. Risk ratios were reported or calculated for four intermediate-term trials, with a pooled risk ratio of 1.00 (95% confidence interval 0.85 to 1.14) and an overall I-squared value of 37.9%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.14 (95% confidence interval 1.00 to 1.29) and an overall I-squared value of 0%.

Figure 9

Arm pain success (≥20-point improvement on VAS): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Nine RCTs (N=2,460) (in 17 publications)60,67,75,78,81,84,86,88–90,92,95–98,100,116 assessed arm pain at various times. Three publications reported pain scores for both arms. Using a conservative estimate with the smaller effect estimate of the two arms, there was no difference between cervical arthroplasty and ACDF in VAS arm pain scores (0-100 scale) at short term (6 RCTs, N=1,761, MD −0.66, 95% CI −2.93 to 1.43, I2=0%),75,78,86,89,98,116 intermediate term (9 RCTs, N=1,741, MD −1.86, 95% CI −4.03 to −0.60, I2=0%),60,67,78,88,90,92,96,98,100 or long term (5 RCTs, N=1,195, MD −4.55, 95% CI −7.62 to −1.68, I2=0%)60,81,84,95,97 (Figure 10). Exclusion of one small (N=20) trial rated high risk of bias90 did not impact the effect size. Using the larger effect estimate when both arms were measured slightly increased the estimate at short term but did not change the conclusion of no difference between treatments (MD −1.11, 95% CI −3.56 to 1.02); estimates at intermediate and long term were similar to the conservative estimates.

Figure 10 is a forest plot. Mean differences were reported or calculated for six short-term trials, with a pooled mean difference of −0.66 (95% confidence interval −2.93 to 1.43) and an overall I-squared value of 0%. Mean differences were reported or calculated for nine intermediate-term trials, with a pooled mean difference of −1.86 (95% confidence interval −4.03 to −0.60) and an overall I-squared value of 0%. Mean differences were reported or calculated for five long-term trials, with a pooled mean difference of −4.55 (95% confidence interval −7.62 to −1.68) and an overall I-squared value of 0%.

Figure 10

Arm pain VAS scores (0-100): comparison of cervical arthroplasty with ACDF (1-level). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval; FDA = Food and Drug Administration.

3.9.3.1.3. Function
3.9.3.1.3.1. Neurologic Function

There was moderate-strength evidence of no differences between cervical arthroplasty and ACDF in neurologic function at short, intermediate, and long term (SOE: Moderate).

Six RCTs (N=2,271) (in 15 publications)60,78,81,84,86,92,95–98,102,114,116,118,119 that compared single-level cervical arthroplasty and ACDF reported neurologic success (response) defined as maintenance or improvement (compared with preoperative status) in all three of the following areas: motor function, sensory function and deep tendon reflexes. There were no differences between cervical arthroplasty and ACDF in the likelihood of neurological success at short term (5 RCTs, N=1,493, 95.2% vs. 90.5%, RR 1.04, 95% CI 1.01 to 1.08, I2=0%),86,114,116,118,119 intermediate term (6 RCTs, N=1,574, 93.3% vs. 89.5%, RR 1.03, 95% CI 1.00 to 1.06, I2=0%),60,78,92,96,98,102 or long term (5 RCTs, N=1,180, 89.9% vs. 86.6%, RR 1.02, 95% CI 0.97 to 1.09, I2=43.3%)60,81,84,95,97 (Figure 11). One prospective NRSI IDE study that used propensity-matched ACDF historical controls reported that neurological success, defined as maintenance or improvement compared with baseline, was similar for cervical arthroplasty and ACDF at 24 months (N=314, 99.3% vs. 98.8%).120

Figure 11 is a forest plot. Risk ratios were reported or calculated for five short-term trials, with a pooled risk ratio of 1.04 (95% confidence interval 1.01 to 1.08) and an overall I-squared value of 0%. Risk ratios were reported or calculated for six intermediate-term trials, with a pooled risk ratio of 1.03 (95% confidence interval 1.00 to 1.06) and an overall I-squared value of 0%. Risk ratios were reported or calculated for five long-term trials, with a pooled risk ratio of 1.02 (95% confidence interval 0.97 to 1.09) and an overall I-squared value of 43.3%.

Figure 11

Neurological success: comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Four RCTs (N=354), three rated high risk of bias63,91,101 and one low risk of bias,79 reported JOA scores (0-17). There were no differences between cervical arthroplasty and ACDF in pooled analysis at intermediate term (4 RCTs, N=354, MD 0.60, 95% CI −0.007 to 0.97, I2=1.9%) or in one short-term trial rated high risk of bias (1 RCT, N=60, MD 0.25, 95% CI −0.25 to 0.75).63

One trial reported the proportion of participants who had the same or an improved Nurick grade at 60 months compared with baseline; there were no differences (i.e., point estimate below the threshold for a small effect) between cervical arthroplasty and ACDF (N=285, 99.4% vs. 96.9%, RR 1.03, 95% CI 0.99 to 1.06).93

3.9.3.1.3.2. General Function

There was moderate-strength evidence of no differences between cervical arthroplasty and ACDF in general function at short, intermediate, and long term (SOE: Moderate).

3.9.3.1.3.2.1. NDI

Six RCTs (N=2,271) (in 14 publications)60,78,84,86,87,92,95–98,114,116,118,119 that compared cervical arthroplasty with ACDF for single-level disease reported NDI success (response) defined as postoperative NDI score improvement of ≥15 points from the baseline score (FDA definition). There were no differences between cervical arthroplasty and ACDF in the likelihood of NDI success at short term (6 RCTs, N=1,900, 85.2% vs. 79.0%, RR 1.07, 95% CI 1.01 to 1.13, I2=31.6%),86,96,114,116,118,119 intermediate term (6 RCTs, N=1,678, 82.9% vs. 78.2%, RR 1.07, 95% CI 1.01 to 1.14, I2=8.4%),60,78,87,92,96,98 or long term (4 RCTs, N=1,047, 86.4% vs. 80.8%, RR 1.06, 95% CI 0.99 to 1.15, I2=35.5%)60,84,95,97 (Figure 12). In one prospective NRSI IDE study that used propensity-matched historical controls, there was no difference in NDI success (≥15-point NDI improvement) following cervical arthroplasty versus ACDF at 24 months (N=301, 90.5% vs. 85.1%, p=0.372).120

Figure 12 is a forest plot. Risk ratios were reported or calculated for six short-term trials, with a pooled risk ratio of 1.07 (95% confidence interval 1.01 to 1.13) and an overall I-squared value of 31.6%. Risk ratios were reported or calculated for six intermediate-term trials, with a pooled risk ratio of 1.07 (95% confidence interval 1.01 to 1.14) and an overall I-squared value of 8.4%. Risk ratios were reported or calculated for four long-term trials, with a pooled risk ratio of 1.06 (95% confidence interval 0.99 to 1.15) and an overall I-squared value of 35.5%.

Figure 12

NDI success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Twelve RCTs (N=2,800) (in 19 publications)60,61,67,69,75,78,79,81,82,84,86,92,93,95–98,100,101 that compared cervical arthroplasty with ACDF reported NDI scores (0-100 scale). There were no differences between cervical arthroplasty and ACDF in NDI scores as estimates were below the threshold for a small effect at short term (8 RCTs, N=2,125, MD −3.13, 95% CI −4.29 to −1.99, I2=0%),61,67,69,75,78,86,93,97 intermediate term (12 RCTs, N=2,027, MD −2.10, 95% CI −3.94 to −0.35, I2=49.3%),60,61,67,69,78,79,82,92,96,98,100,101 or long term (6 RCTs, N=1,291, MD −3.30, 95% CI −5.13 to −1.02, I2=0%)60,69,81,84,95,97 (Figure 13). Exclusion of trials rated high risk of bias61,82,101 had no impact on effect estimates or statistical heterogeneity in the short term (7 RCTs, N=2,065, MD −3.14, 95% CI −4.30 to −1.99, I2=0%)67,69,75,78,86,93,97 and slightly increased effect size and increased heterogeneity at intermediate term (9 RCTs, N=1,814, MD −2.45, 95% CI −4.70 to −0.35, I2=62.5%).60,67,69,78,79,92,96,98,100 Exclusion of a trial rated moderate risk of bias69 with unclear sample sizes resulted in a small increase in effect size long term (5 RCTs, N=1,288, MD −3.78, 95% CI −5.74 to −1.54).60,81,84,95,97 There was no indication of publication/small study bias for NDI scores at intermediate term based on funnel plot analysis (Egger’s test, p=0.416) (Appendix F, Figure F-5).
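
To make the pooled mean differences and I2 values above easier to interpret, the sketch below shows one simple way such quantities can be computed: an inverse-variance fixed-effect pooled estimate, Cochran's Q, and I2 derived from Q. This is an illustrative simplification and not the review's model. The function name `pool_fixed_effect` and the example inputs are assumptions, and the review's own pooled estimates may come from a different (for example, random-effects) method.

```python
def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I-squared (%).

    effects: per-trial mean differences (hypothetical values in the example below)
    std_errors: per-trial standard errors of those mean differences
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect, and at least one trial")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations of trial effects from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical example: two trials with equal precision but different effects.
pooled, q, i_squared = pool_fixed_effect([-1.0, -5.0], [1.0, 1.0])
print(f"pooled MD {pooled:.2f}, Q {q:.2f}, I2 {i_squared:.1f}%")
```

An I2 of 0% therefore means that Q did not exceed its degrees of freedom, not that the trials reported identical results.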

Figure 13 is a forest plot. Mean differences were reported or calculated for eight short-term trials, with a pooled mean difference of −3.13 (95% confidence interval −4.29 to −1.99) and an overall I-squared value of 0%. Mean differences were reported or calculated for 12 intermediate-term trials, with a pooled mean difference of −2.10 (95% confidence interval −3.94 to −0.35) and an overall I-squared value of 49.3%. Mean differences were reported or calculated for six long-term trials, with a pooled mean difference of −3.30 (95% confidence interval −5.13 to −1.02) and an overall I-squared value of 0%.

Figure 13

NDI scores (0-100): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.1.3.2.2. SF-36 and SF-12 PCS and MCS

Four RCTs (N=1,148) (in 6 publications)92,97,98,114,118,119 that compared cervical arthroplasty with ACDF for single-level disease reported SF-36 and SF-12 PCS and MCS (0-100 scale). Success for these component scores was defined as postoperative score improvement of ≥15 points from baseline scores. The likelihood of PCS success was similar for cervical arthroplasty and ACDF short term (2 RCTs, N=466, 81.7% vs. 75.9%, RR 1.08, 95% CI 0.96 to 1.23, I2=0%),114,119 intermediate term (4 RCTs, N=939, RR 1.16, 95% CI 1.00 to 1.41, I2=61.2%),92,98,114,118 and long term (1 RCT, N=231, 72.0% vs. 74.5%, RR 0.97, 95% CI 0.83 to 1.13)97 (Figure 14). Exclusion of one outlier trial118 at intermediate term resulted in a slightly attenuated effect estimate but did not reduce heterogeneity or change the conclusion (3 RCTs, N=750, RR 1.12, 95% CI 0.96 to 1.34, I2=59.8%).92,98,114 In one prospective NRSI IDE study using propensity-matched historical controls, more cervical arthroplasty participants maintained or improved PCS score versus ACDF at 24 months (N=301, 97.3% vs. 89.2%, p=0.023).120 The likelihood of MCS success was also similar for cervical arthroplasty and ACDF at all time points: short term (2 RCTs, N=466, 49.1% vs. 42.8%, RR 1.13, 95% CI 0.86 to 1.50, I2=0%),114,119 intermediate term (4 RCTs, N=939, 47.3% vs. 48%, RR 0.97, 95% CI 0.80 to 1.16, I2=27.5%),92,98,114,118 and long term (1 RCT, N=231, 47.2% vs. 43.4%, RR 1.09, 95% CI 0.82 to 1.45)97 (Figure 15). In the prospective NRSI IDE study, there was no difference in MCS maintenance or improvement between procedures at 24 months (N=301, 77.6% vs. 77.0%).120

Figure 14 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.08 (95% confidence interval 0.96 to 1.23) and an overall I-squared value of 0%. Risk ratios were reported or calculated for four intermediate-term trials, with a pooled risk ratio of 1.16 (95% confidence interval 1.00 to 1.41) and an overall I-squared value of 61.2%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 0.97 (95% confidence interval 0.83 to 1.13) and an overall I-squared value of 0%.

Figure 14

SF-36 or SF-12 PCS success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Figure 15 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.13 (95% confidence interval 0.86 to 1.50) and an overall I-squared value of 0%. Risk ratios were reported or calculated for four intermediate-term trials, with a pooled risk ratio of 0.97 (95% confidence interval 0.80 to 1.16) and an overall I-squared value of 27.5%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.09 (95% confidence interval 0.82 to 1.45) and an overall I-squared value of 0%.

Figure 15

SF-36 or SF-12 MCS success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Seven RCTs (N=2,368) (in 14 publications)60,69,75,77,78,81,84,86,92,95–98,116 that compared cervical arthroplasty with ACDF reported SF-36/12 PCS and MCS scores (0-100 scale). There were no differences between cervical arthroplasty and ACDF in PCS scores (Figure 16) as estimates were below the threshold for a small effect in the short term (6 RCTs, N=1,779, MD 1.67, 95% CI 0.59 to 2.87, I2=0%), intermediate term (7 RCTs, N=1,684, MD 2.13, 95% CI 0.77 to 3.33, I2=0%), or long term (5 RCTs, N=1,191, MD 1.76, 95% CI 0.44 to 3.07, I2=0%). Similarly, there were no differences between cervical arthroplasty and ACDF in MCS scores (Figure 17) as estimates were below the threshold for a small effect in the short term (6 RCTs, N=1,779, MD 1.14, 95% CI −0.14 to 2.17, I2=0%), intermediate term (7 RCTs, N=1,814, MD 0.83, 95% CI −0.75 to 2.41, I2=32.2%), and long term (3 RCTs, N=574, MD 0.64, 95% CI −1.47 to 2.82, I2=0%). Effect estimates for PCS and MCS did not differ following the exclusion of one trial with unclear sample sizes.69 No studies were rated high risk of bias.

Figure 16 is a forest plot. Mean differences were reported or calculated for six short-term trials, with a pooled mean difference of 1.67 (95% confidence interval 0.59 to 2.87) and an overall I-squared value of 0%. Mean differences were reported or calculated for seven intermediate-term trials, with a pooled mean difference of 2.13 (95% confidence interval 0.77 to 3.33) and an overall I-squared value of 0%. Mean differences were reported or calculated for five long-term trials, with a pooled mean difference of 1.76 (95% confidence interval 0.44 to 3.07) and an overall I-squared value of 0%.

Figure 16

SF-36 or SF-12 PCS scores (0-100): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Figure 17 is a forest plot. Mean differences were reported or calculated for six short-term trials, with a pooled mean difference of 1.14 (95% confidence interval −0.14 to 2.17) and an overall I-squared value of 0%. Mean differences were reported or calculated for seven intermediate-term trials, with a pooled mean difference of 0.83 (95% confidence interval −0.75 to 2.41) and an overall I-squared value of 32.2%. Mean differences were reported or calculated for three long-term trials, with a pooled mean difference of 0.64 (95% confidence interval −1.47 to 2.82) and an overall I-squared value of 0%.

Figure 17

SF-36 or SF-12 MCS scores (0-100): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.1.3.2.3. Odom’s Criteria

Four RCTs (N=553)61,82,91,93 used Odom’s criteria to categorize overall improvement as excellent (i.e., all pre-operative symptoms relieved, abnormal findings improved), good (i.e., minimal persistence of symptoms, abnormal findings unchanged or improved), fair (i.e., definite relief of some symptoms, others unchanged or slightly improved) or poor (i.e., symptoms and signs unchanged or exacerbated). There were no differences between single-level cervical arthroplasty and ACDF in the likelihood of having excellent or good results based on Odom’s criteria (4 RCTs, N=847, 48.3% vs. 46.8%, RR 1.01, 95% CI 0.92 to 1.12, I2=0%) at intermediate term.61,82,91,93 However, three of the RCTs (all small) were rated high risk of bias,61,82,91 while the one large RCT was rated moderate risk of bias.93 Based on the highest quality trial, there was no difference between procedures in the likelihood of having excellent or good improvement (1 RCT, N=682, 45.7% vs. 43.1%)93 (Figure 18). In one prospective NRSI IDE study using propensity-matched historical controls, there was no difference between cervical arthroplasty and ACDF in the likelihood of having excellent or good results using Odom’s criteria at 24 months (N=301, 90.5% vs. 79.9%).120

Figure 18 is a forest plot. Risk ratios were reported or calculated for four excellent or good criteria studies, with a pooled risk ratio of 1.01 (95% confidence interval 0.92 to 1.12) and an overall I-squared value of 0%. Risk ratios were reported or calculated for three fair criteria studies, with a pooled risk ratio of 0.88 (95% confidence interval 0.47 to 1.87) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two poor criteria studies, with a pooled risk ratio of 0.31 (95% confidence interval 0.03 to 5.38) and an overall I-squared value of 36.2%.

Figure 18

Odom’s criteria: comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval; mos. = months.

3.9.3.1.3.3. Overall Success (Composite)

The FDA IDE trials were required to report overall success, a composite outcome for six RCTs (N=2,271) (in 11 publications)60,75,78,84,86,87,93,96,98,118,119 that included a threshold of ≥15-point NDI improvement (0-50 scale) from baseline, improvement or maintenance of neurologic status, no serious adverse events and no additional surgical procedures that might be considered “failure” (e.g., removal, revision, supplemental fixation). In participants with single-level interventions, effect estimates were below the threshold for a small effect and classified as no difference in overall success comparing cervical arthroplasty with ACDF in the short term (4 RCTs, N=1,361, 79.9% vs. 71.7%, RR 1.11, 95% CI 1.04 to 1.18, I2=0%)75,86,118,119 and intermediate term (6 RCTs, N=1,717, 76.1% vs. 67.7%, RR 1.14, 95% CI 1.07 to 1.20, I2=0%);60,78,87,93,96,98 but a slightly increased likelihood of overall success favoring cervical arthroplasty was seen long term (3 RCTs, N=878, 76.1% vs. 67.7%, RR 1.21, 95% CI 1.11 to 1.32, I2=0%)60,84,98 (Figure 19). In one prospective NRSI IDE study using propensity-matched historical controls, there was no difference between cervical arthroplasty and ACDF in overall response (same definition as in RCTs) at 24 months (N=301, 86.8% vs. 79.3%, p=0.265).120
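
The composite definition above can be read as a logical AND of four conditions. The sketch below encodes that reading for a single participant. The class and field names are hypothetical, and it assumes that a lower NDI score means less disability, so that improvement is baseline minus followup. How individual trials adjudicated each component is not described here.

```python
from dataclasses import dataclass


@dataclass
class ParticipantOutcome:
    """Illustrative per-participant record; all field names are assumptions."""
    ndi_baseline: float
    ndi_followup: float
    neuro_maintained_or_improved: bool
    serious_adverse_event: bool
    additional_failure_surgery: bool  # e.g., removal, revision, supplemental fixation


def overall_success(p: ParticipantOutcome, ndi_threshold: float = 15.0) -> bool:
    """Return True only if every component of the composite outcome is met."""
    ndi_improvement = p.ndi_baseline - p.ndi_followup  # lower NDI = less disability
    return (
        ndi_improvement >= ndi_threshold
        and p.neuro_maintained_or_improved
        and not p.serious_adverse_event
        and not p.additional_failure_surgery
    )


# Hypothetical participant: 20-point NDI improvement, no failures on other components.
print(overall_success(ParticipantOutcome(40, 20, True, False, False)))
```

Because every component must be met, overall success rates cannot exceed the NDI success rates measured in the same participants at the same time point.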

One of the above trials reported overall success at 84 months using a different criterion for NDI (improvement in NDI score ≥30 points if preoperative score ≥60 or improvement of ≥50% if preoperative score <60) and included an additional requirement for radiographic success, and was not included in the meta-analysis at long term; there was no difference between cervical arthroplasty and ACDF using these criteria (N=166, 55.2% vs. 50.0%, RR 1.10, 95% CI 0.80 to 1.52).95
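
The alternative NDI criterion in this trial depends on the preoperative score, which a short sketch may make easier to follow. The function name is hypothetical, improvement is assumed to be preoperative minus postoperative score, and the trial's additional radiographic requirement is not modeled.

```python
def ndi_success_alternate(pre_score: float, post_score: float) -> bool:
    """NDI success under the baseline-dependent criterion described above.

    Preoperative score >= 60: requires an improvement of at least 30 points.
    Preoperative score < 60: requires an improvement of at least 50% of baseline.
    """
    improvement = pre_score - post_score
    if pre_score >= 60:
        return improvement >= 30
    if pre_score <= 0:
        return False  # no measurable baseline disability to improve on
    return improvement / pre_score >= 0.5


# Hypothetical scores: a 30-point drop from 70, and a 50% drop from 40.
print(ndi_success_alternate(70, 40), ndi_success_alternate(40, 20))
```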

Figure 19 is a forest plot. Risk ratios were reported or calculated for four short-term trials, with a pooled risk ratio of 1.11 (95% confidence interval 1.04 to 1.18) and an overall I-squared value of 0%. Risk ratios were reported or calculated for six intermediate-term trials, with a pooled risk ratio of 1.14 (95% confidence interval 1.07 to 1.20) and an overall I-squared value of 0%. Risk ratios were reported or calculated for three long-term trials, with a pooled risk ratio of 1.21 (95% confidence interval 1.11 to 1.32) and an overall I-squared value of 0%.

Figure 19

Overall success: comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.1.3.4. Quality of Life

None of the included studies reported on quality-of-life measures.

3.9.3.1.3.5. Reoperation and Subsequent Surgery

There was high-strength evidence that cervical arthroplasty was associated with substantially lower likelihood of reoperation that included the index level versus ACDF (SOE: High). Rates of reoperation for ACDF at the index level may be influenced by the need to remove an existing plate to treat adjacent segment disease (ASD), rather than the indication for reoperation being driven by an issue at the index procedure. This may artificially inflate the reported reoperation rate at the index procedure level for ACDF when compared with cervical arthroplasty. Studies were not consistently clear in the indication for reoperation. The clinical relevance of removing the plate as a part of a procedure addressing ASD is minimal.

Reoperation including any additional procedure at the index level was substantially less frequent with cervical arthroplasty versus ACDF for single-level disease at all time points reported in RCTs including short term up to 24 months (9 RCTs, N=2,323, 2.9% vs. 6.2%, RR 0.49, 95% CI 0.28 to 0.80, I2=16.2%)60,76,82,87,90,96,98,100,116 and long term from 84 to 120 months (7 RCTs, N=1,992, 5.2% vs. 12.5%, RR 0.44, 95% CI 0.29 to 0.60, I2=0%)60,69,81,85,92,95,97 (Figure 20).

Figure 20 is a forest plot. Risk ratios were reported or calculated for nine trials with 24-month follow-up, with a pooled risk ratio of 0.49 (95% confidence interval 0.28 to 0.80) and an overall I-squared value of 16.2%. Risk ratios were reported or calculated for three trials with 36-to-48-month follow-up, with a pooled risk ratio of 0.50 (95% confidence interval 0.22 to 0.98) and an overall I-squared value of 0%. Risk ratios were reported or calculated for four trials with 60-month follow-up, with a pooled risk ratio of 0.39 (95% confidence interval 0.15 to 0.71) and an overall I-squared value of 37.4%. Risk ratios were reported or calculated for seven trials with more than 60-month follow-up, with a pooled risk ratio of 0.44 (95% confidence interval 0.29 to 0.60) and an overall I-squared value of 0%.

Figure 20

Reoperation involving the index level: comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

One prospective NRSI IDE study of cervical arthroplasty using historical ACDF controls found no difference in index-level reoperation up to 24 months (N=349, 1.9% vs. 4.8%, RR 0.39, 95% CI 0.11 to 1.43).120

Reoperation across two NRSIs was less common than that reported in RCTs. No difference in 30-day reoperation was seen in one NRSI (1.2% vs. 0.4%, adjusted OR 0.60, 95% CI 0.14 to 2.56).109 Another NRSI reported that reoperation was less common following cervical arthroplasty within 90 days of index surgery compared with ACDF (2.04% vs. 3.35%, adjusted OR 0.63, 95% CI 0.44 to 0.92) but found no difference between cervical arthroplasty and ACDF longer term, up to 5 years (adjusted hazard ratio 0.86, 95% CI 0.60 to 1.23).108 While overall reoperation rates were lower in these database NRSIs, it is possible that the RCTs, particularly IDE trials, may provide more accurate detail regarding specific indications.

Subsequent surgery rates at adjacent levels were similar between cervical arthroplasty and ACDF at up to 24 months82,86,98,100,114–116,118 and between 36 and 48 months (including after exclusion of one trial rated high risk of bias101),89,96,101,116 but subsequent surgery was substantially less likely with cervical arthroplasty versus ACDF at 60 months (3 RCTs, N=1,010, 2.5% vs. 6.2%, RR 0.39, 95% CI 0.15 to 0.84, I2=8.7%)59,68,80 and at the longest followups from 84 to 120 months (6 RCTs, N=1,606, 5.0% vs. 13.5%, RR 0.39, 95% CI 0.25 to 0.56, I2=1.5%).60,69,81,85,95,97 However, estimates were somewhat imprecise (Figure 21). Also, across trials, indications for operation at adjacent levels were not consistently described.

Figure 21 is a forest plot. Risk ratios were reported or calculated for eight trials with 24-month follow-up, with a pooled risk ratio of 0.61 (95% confidence interval 0.28 to 1.12) and an overall I-squared value of 1.7%. Risk ratios were reported or calculated for four trials with 36-to-48-month follow-up, with a pooled risk ratio of 0.61 (95% confidence interval 0.22 to 1.19) and an overall I-squared value of 0%. Risk ratios were reported or calculated for three trials with 60-month follow-up, with a pooled risk ratio of 0.39 (95% confidence interval 0.15 to 0.84) and an overall I-squared value of 8.7%. Risk ratios were reported or calculated for six trials with more than 60-month follow-up, with a pooled risk ratio of 0.39 (95% confidence interval 0.25 to 0.56) and an overall I-squared value of 1.5%.

Figure 21

Subsequent surgery at adjacent levels: comparison of cervical arthroplasty versus ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.1.3.6. Harms

All 15 RCTs that evaluated cervical arthroplasty and ACDF for single-level disease provided information on adverse events and harms up to 120 months followup.60,61,69,76,79,82,87,89,90,92,96,98,100,101,116 Information on harms from four NRSIs was used to complement that from RCTs.106,108,109,120

3.9.3.1.3.6.1. Neurologic Deficit

There was low-strength evidence of no differences in the likelihood of neurological events or deficits between cervical arthroplasty and ACDF in the short, intermediate, or long term (SOE: Low).

Reporting of neurological events varied across RCT publications. Three publications assessed events from the Bryan IDE trial at different times;58,85,96 one IDE trial evaluated Mobi-C.95 One trial58 described specific, observed neurological events as acute neurological changes, while other trials used various general terms to describe neurologic events (e.g., new deficit, neurological failure, neurological adverse event). The timing of events following surgery was also not clearly reported. Thus, reported proportions of participants who experienced neurological events varied substantially across RCTs; however, there were no differences between cervical arthroplasty and ACDF at 0 to 24 months (3.3% vs. 3.2%),58 between 24 and 48 months (0% vs. 1.0%, WHO grade 3 or 4),96 up to 84 months (11.4% vs. 11.5%),95 or up to 120 months (any: 43.1% vs. 43.8%; WHO grade 3 or 4: 4.5% vs. 6.9%).85 One prospective NRSI IDE study of cervical arthroplasty that used propensity-matched historical ACDF controls reported no differences in serious device- or procedure-related neurological adverse events between cervical arthroplasty and ACDF (1.3% vs. 1.6%) through 24 months.120 The same study also reported that fewer cervical arthroplasty participants experienced neurological decrease from baseline versus ACDF (6.7% vs. 12.8%, RR 0.52, 95% CI 0.25 to 1.07), but results were imprecise.

3.9.3.1.3.6.2. Mortality

There was inadequate evidence to draw conclusions on the likelihood of death in participants undergoing cervical arthroplasty versus ACDF (SOE: Insufficient).

Death was uncommon (<3%) in RCTs and NRSIs, with no reported differences between cervical arthroplasty and ACDF. Across RCTs, no deaths were directly attributed to either procedure; however, cause of death was not reported in many trials. For cervical arthroplasty from 0 to 24 months, three of the four deaths were attributed to myocardial infarction or cardiac arrest in one trial;60 the cause of the fourth death was not reported in another trial.98 No deaths were observed in one trial.76 At followup from 0 to 36 months, one cervical arthroplasty participant died of a severe subarachnoid hemorrhage at 6 weeks (relationship to procedures was not stated),89 and one death in the ACDF group, attributed to a motor vehicle accident, was observed in another trial.58 There was no difference in mortality between procedures at 84 months (1 RCT, N=541, 0.9% vs. 2.2%, RR 0.38, 95% CI 0.08 to 1.96)60 or at 120 months (1 RCT, N=232, 1.4% vs. 2.4%, RR 0.54, 95% CI 0.09 to 3.18);85 however, estimates were imprecise. Findings from one large administrative data NRSI108 reinforce that death was rare for cervical arthroplasty (0%) and ACDF (0.18%) and that there was no difference between procedures in the likelihood of mortality. One death occurred in the cervical arthroplasty group in one NRSI IDE study using historical controls up to 24 months120 (Appendix C).

3.9.3.1.3.6.3. Serious Adverse Events

There was low-strength evidence that cervical arthroplasty was associated with a slightly lower likelihood of any serious adverse event in the short term versus ACDF (SOE: Low); there was also low-strength evidence of no difference in the likelihood of experiencing a serious adverse event at greater than 24 months (SOE: Low).

Serious adverse event definitions and types of events varied across RCTs, but often included events that were life threatening, required medical intervention, or resulted in a permanent disability or death. Timing of events was not reported. Events related to participant factors such as comorbidities (e.g., underlying cardiovascular disease) would likely not be different between procedures. Cervical arthroplasty was associated with a slightly lower likelihood of experiencing a serious adverse event up to 24 months across IDE trials (5 RCTs, N=1,611, 24.6% vs. 30.6%, RR 0.83, 95% CI 0.64 to 0.97, I2=24.6%)58,76,87,93,98 compared with ACDF; however, across fewer trials at other times, no difference between procedures was seen (Figure 22). No difference in the likelihood of experiencing a serious adverse event was seen between cervical arthroplasty and ACDF (N=349, 9.4% vs. 14.8%, RR 1.97, 95% CI 0.88 to 4.37) in one NRSI IDE study using historical controls up to 24 months.

Figure 22 is a forest plot. Risk ratios were reported or calculated for five trials with 24-month follow-up, with a pooled risk ratio of 0.83 (95% confidence interval 0.64 to 0.97) and an overall I-squared value of 24.6%. Risk ratios were reported or calculated for two trials with 36-to-48-month follow-up, with a pooled risk ratio of 0.93 (95% confidence interval 0.71 to 1.24) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one trial with 60 month follow-up, with a risk ratio of 1.21 (95% confidence interval 0.81 to 1.81) and an overall I-squared value of 100%. Risk ratios were reported or calculated for two trials with more than 60 month follow-up, with a pooled risk ratio of 0.90 (95% confidence interval 0.73 to 1.06) and an overall I-squared value of 0%.

Figure 22

Any serious adverse events (author defined): comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Dysphagia was reported by six RCTs (N=1,965) (in 8 publications),58,60,68,69,76,81,85,98 but the severity was unclear in most cases. One trial (N=463) reported no cases of WHO grade 3 or 4 dysphagia in any participant through 24 months followup.58

NRSIs based on administrative data suggest that serious adverse events are rare and not different between cervical arthroplasty and ACDF. Thromboembolic event rates (DVT and/or pulmonary embolism) were similar between cervical arthroplasty (range 0.07% to 0.19%) and ACDF (0.10% to 0.11%) as reported by two large NRSIs.106,108 One NRSI108 reported rates of vertebral artery injury and dural tear of less than 1 percent for each procedure. One NRSI reported low risk of dysphagia (0% vs. 0.13%)109 but did not report dysphagia severity. In one prospective NRSI IDE study using historical ACDF controls, dysphagia was more common in cervical arthroplasty participants than in ACDF participants (9.4% vs. 6.3%), but severity was not described.120

3.9.3.1.3.6.4. Heterotopic Ossification

Grade 3 or 4 heterotopic ossification (HO), considered clinically relevant HO by most of the trials, may be of concern with cervical arthroplasty. Across five RCTs (N=525 for cervical arthroplasty arm, range 30 to 182), 9.5 percent of participants (range, 1.8% to 12.8%) developed Grade 3 or 4 HO across 24 to 84 months followup.61,92,95,97,100 In addition, one FDA IDE NRSI (n=150 in cervical arthroplasty arm) reported rates of grade 3 or 4 HO at 24 months (11.3%; 0.7%, grade 4).105 Rates of Grade 1 or 2 (or unclear grades) of HO ranged from 0 to 32.7 percent across seven trials (N range for cervical arthroplasty arms, 51 to 201) over 12 to 84 months followup60,61,76,79,81,100,101 and was 44 percent at 24 months in the NRSI.105

3.9.3.1.3.6.5. Device-Related Adverse Events

Device-related adverse event definitions, types of events and adjudication varied across RCTs. Some trials included a range of events such as adjacent-level degenerative joint changes, headache as well as neurological events. Some device-related events may only occur with cervical arthroplasty, others may only occur with ACDF (e.g., nonunion). Some events may not be persistent or serious (e.g., superficial wound infection, dysphagia). Cervical arthroplasty was associated with substantially lower likelihood of device-related events at 24 months (6 RCTs, N=2,167, 4.9% vs. 11%, RR 0.46, 95% CI 0.31 to 0.63, I2=0%).77,115–119 No difference was seen across two trials at 60 months,78,102 but results across three trials at >60 months81,85,97 were inconsistent (Figure 23).

Figure 23 is a forest plot. Risk ratios were reported or calculated for six trials with 24-month follow-up, with a pooled risk ratio of 0.46 (95% confidence interval 0.31 to 0.63) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two trials with 60-month follow-up, with a pooled risk ratio of 1.06 (95% confidence interval 0.14 to 4.39) and an overall I-squared value of 21.1%. Risk ratios were reported or calculated for three trials with more than 60-month follow-up, with a pooled risk ratio of 0.87 (95% confidence interval 0.21 to 3.59) and an overall I-squared value of 88.5%.

Figure 23

Device-related adverse events: comparison of cervical arthroplasty with ACDF (1-level interventions). ACDF = anterior cervical discectomy and fusion; CI = confidence interval; mos. = months; PL = profile likelihood.

3.9.3.1.3.6.6. Differential Effectiveness (Heterogeneity of Treatment Effect [HTE])

None of the included trials that compared single-level cervical arthroplasty and ACDF interventions reported differential effectiveness based on patient or other characteristics.

3.9.3.2. Two-Level Cervical Arthroplasty Versus ACDF

Four RCTs (N=872) (in 11 publications)63,65,66,71–73,80,83,94,95,99 compared two-level cervical arthroplasty and ACDF, including two FDA IDE trials (in 9 publications)65,66,71–73,80,83,94,95 and two non-IDE trials.63,99 One FDA IDE NRSI104 compared a novel polyetheretherketone (PEEK)-on-ceramic cervical arthroplasty with propensity score-matched historical ACDF controls (structural allograft and plate) from a multicenter RCT initiated in the mid-2000s that was not referenced.

3.9.3.2.1. Fusion

Two RCTs (N=727) (across 4 publications) that compared two-level cervical arthroplasty and ACDF procedures reported fusion success in their ACDF arms.71,83,94,95 No trials reported short-term fusion success. Two RCTs (N=243) reported intermediate-term fusion success in 92.5 percent (range: 90.5% to 94.0%) of participants.83,94 Two RCTs (N=196) reported long-term fusion success in 92.6 percent (range: 90.9% to 93.8%) of participants.71,95 One IDE NRSI104 comparing a novel cervical arthroplasty versus historical ACDF controls reported pseudarthrosis in 6.5 percent of the ACDF group.

3.9.3.2.2. Pain

3.9.3.2.2.1. Neck Pain

There was moderate-strength evidence of no difference between cervical arthroplasty and ACDF on neck pain (SOE: Moderate).

Two RCTs (N=727)121,122 that compared cervical arthroplasty with ACDF reported neck pain success (response) defined as postoperative ≥20-point improvement on VAS (0-100 scale). In participants having two-level interventions there were no differences in likelihood of neck pain success between cervical arthroplasty and ACDF in the short term (2 RCTs, N=692, 88% vs. 80.7%, RR 1.10, 95% CI 1.01 to 1.23, I2=0.8%),121,122 intermediate term (2 RCTs, N=678, 86.9% vs. 83.3%, RR 1.06, 95% CI 0.98 to 1.15, I2=0%),121,122 and long term (1 RCT, N=221, 91.2% vs. 81.3%, RR 1.12, 95% CI 1.01 to 1.25)122 as estimates were below the threshold for a small effect (Figure 24). There was also no difference long term between cervical arthroplasty and ACDF in the trial using a threshold of ≥10-point improvement for neck pain success that was not included in the meta-analysis (1 RCT, N=269, 86% vs. 77.7%, RR 1.11, 95% CI 0.97 to 1.32).95

Figure 24 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.10 (95% confidence interval 1.01 to 1.23) and an overall I-squared value of 0.8%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 1.06 (95% confidence interval 0.98 to 1.15) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.12 (95% confidence interval 1.01 to 1.25) and an overall I-squared value of 0%.

Figure 24

Neck pain success (≥20-point improvement on VAS): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

There was no difference in VAS neck pain scores (0-100 scale) between cervical arthroplasty and ACDF short term (3 RCTs, N=764, MD −5.83, 95% CI −12.28 to 0.61, I2=50.3%).72,94,99 Cervical arthroplasty was associated with a small pain improvement versus ACDF in the intermediate term (4 RCTs, N=707, MD −8.21, 95% CI −13.83 to −4.25, I2=23%)63,71,94,99 and long term (3 RCTs N=615, MD −8.13, 95% CI −15.18 to −2.97, I2=55.9%)71,95,99 (Figure 25). One IDE NRSI that compared a novel cervical arthroplasty versus historical ACDF controls reported no differences in mean VAS neck pain intensity at short- or intermediate term (N=352, 1.8 vs. 2.5 at both times, p>0.10).104

Figure 25 is a forest plot. Mean differences were reported or calculated for three short-term trials, with a pooled mean difference of −5.83 (95% confidence interval −12.28 to 0.61) and an overall I-squared value of 50.3%. Mean differences were reported or calculated for four intermediate-term trials, with a pooled mean difference of −8.21 (95% confidence interval −13.83 to −4.25) and an overall I-squared value of 23%. Mean differences were reported or calculated for three long-term trials, with a pooled mean difference of −8.13 (95% confidence interval −15.18 to −2.97) and an overall I-squared value of 55.9%.

Figure 25

Neck pain scores (0-100): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.2.2.2. Arm Pain

There was moderate-strength evidence of no difference between cervical arthroplasty and ACDF on arm pain (SOE: Moderate).

Two RCTs (N=727)121,122 that compared cervical arthroplasty with ACDF reported arm pain success (response) defined as postoperative ≥20-point improvement on VAS (0-100 scale). Some studies reported arm pain success in both arms. Using conservative estimates (the lower risk ratio), there were no differences in likelihood of arm pain success between cervical arthroplasty and ACDF at short term (2 RCTs, N=692, 70.6% vs. 74.1%, RR 1.00, 95% CI 0.90 to 1.14, I2=0%),121,122 intermediate term (2 RCTs, N=678, 71.9% vs. 74.0%, RR 1.02, 95% CI 0.92 to 1.14, I2=0%),121,122 or long term (1 RCT, N=220, RR 0.94, 95% CI 0.84 to 1.05)122 (Figure 26). Estimates and conclusions using the higher risk ratios from the other arm were similar.

Figure 26 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.00 (95% confidence interval 0.90 to 1.14) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 1.02 (95% confidence interval 0.92 to 1.14) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 0.94 (95% confidence interval 0.84 to 1.05) and an overall I-squared value of 0%.

Figure 26

Arm pain success (≥20-point improvement on VAS): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Three RCTs (N=792) (in 5 publications)63,71,72,94,95 reported arm pain scores (0-100). Some trials reported arm pain scores in both arms. Conservative estimates (using the smaller mean differences) are reported here. There was no difference in VAS arm pain scores (0-100 scale) between cervical arthroplasty and ACDF in the short term (2 RCTs, N=692, MD −3.72, 95% CI −9.53 to 1.62, I2=0%).72,94 Cervical arthroplasty was associated with a small pain improvement versus ACDF at intermediate term (3 RCTs, N=627, MD −9.95, 95% CI −15.10 to −5.15, I2=0%)63,71,94 but not long term (2 RCTs N=535, MD −5.08, 95% CI −11.73 to 1.70, I2=1.4%)71,95 (Figure 27). One IDE NRSI (N=352) that compared a novel cervical arthroplasty versus ACDF using historical controls reported no differences in mean VAS arm pain intensity at short (1.6 vs. 1.7) or intermediate term (1.8 vs. 1.6).104

Figure 27 is a forest plot. Mean differences were reported or calculated for two short-term trials, with a pooled mean difference of −3.72 (95% confidence interval −9.53 to 1.62) and an overall I-squared value of 0%. Mean differences were reported or calculated for three intermediate-term trials, with a pooled mean difference of −9.95 (95% confidence interval −15.10 to −5.15) and an overall I-squared value of 0%. Mean differences were reported or calculated for two long-term trials, with a pooled mean difference of −5.08 (95% confidence interval −11.73 to 1.70) and an overall I-squared value of 1.4%.

Figure 27

Arm pain scores (0-100): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.2.3. Function
3.9.3.2.3.1. Neurologic Function

There was moderate-strength evidence of no difference between cervical arthroplasty and ACDF on neurologic function (SOE: Moderate).

Two IDE RCTs (N=727) (in 5 publications)71,94,95,121,122 that compared cervical arthroplasty with ACDF reported neurologic success (response), defined as maintenance or improvement (compared with preoperative status) in motor function, sensory function, and deep tendon reflexes. In participants with two-level interventions, there was no difference in likelihood of neurologic success between cervical arthroplasty and ACDF at short term (2 RCTs, N=692, 91.0% vs. 87.9%, RR 1.03, 95% CI 0.96 to 1.10, I2= 0%),121,122 intermediate term (2 RCTs, N=604, 91.4% vs. 90.6%, RR 0.99, 95% CI 0.93 to 1.07, I2=12.9%)71,94 or long term (2 RCTs, N=535, 93.2% vs. 84.8%, RR 1.10, 95% CI 1.01 to 1.20, I2=0%; point estimate below the threshold for a small effect)71,95 (Figure 28). The likelihood of neurological success, based on motor, sensory, and myelopathic gait assessments, was similar for cervical arthroplasty and ACDF in one IDE NRSI (N=352, 100% vs. 97.7%).104

Figure 28 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.03 (95% confidence interval 0.96 to 1.10) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 0.99 (95% confidence interval 0.93 to 1.07) and an overall I-squared value of 12.9%. Risk ratios were reported or calculated for two long-term trials, with a pooled risk ratio of 1.10 (95% confidence interval 1.01 to 1.20) and an overall I-squared value of 0%.

Figure 28

Neurologic success: comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Mean JOA scores (0-17 scale) were similar following cervical arthroplasty and ACDF at short term (6 months, 15.2 vs. 14.9, p>0.05), intermediate term (15.4 vs. 15.3, p>0.05), and long term (81 months, 15.4 vs. 15.2, p>0.05) in one RCT (N=96).99

3.9.3.2.3.2. General Function

There was moderate-strength evidence of no difference between cervical arthroplasty and ACDF on general function (SOE: Moderate).

3.9.3.2.3.2.1. NDI

Two IDE RCTs (N=727) (in 4 publications)71,95,121,122 and one IDE NRSI (N=352)104 that compared cervical arthroplasty with ACDF reported NDI success defined as postoperative NDI score improvement of ≥15 points from baseline. One trial defined NDI success as improvement of ≥30 points from baseline and was not included in the meta-analysis.66 Based on the threshold of ≥15 points from baseline, there were no differences between cervical arthroplasty and ACDF (i.e., although statistically significant, the differences between treatments were below the threshold for a small effect) at short term (2 RCTs, N=692, 89.3% vs. 80.0%, RR 1.12, 95% CI 1.04 to 1.22, I2=0%),121,122 intermediate term (1 RCT, N=307, 89.2% vs. 77.9%, RR 1.15, 95% CI 1.03 to 1.27),71 and long term (2 RCTs, N=535, 84.3% vs. 73.6%, RR 1.16, 95% CI 1.04 to 1.30, I2=0%)71,95 (Figure 29). There was no difference in the likelihood of NDI success between cervical arthroplasty and ACDF in one IDE NRSI (N=352, 92.3% vs. 85.5%, p>0.05).104

Figure 29 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.12 (95% confidence interval 1.04 to 1.22) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one intermediate-term trial, with a risk ratio of 1.15 (95% confidence interval 1.03 to 1.27) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two long-term trials, with a pooled risk ratio of 1.16 (95% confidence interval 1.04 to 1.30) and an overall I-squared value of 0%.

Figure 29

NDI success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

One RCT that defined NDI success as improvement of ≥30 points from baseline found a moderately higher likelihood of NDI success following cervical arthroplasty versus ACDF at intermediate term (1 RCT, N=359, 79.3% vs. 53.4%, RR 1.50, 95% CI 1.21 to 1.86).66

Four RCTs (N=872) (in 6 publications)63,71,72,94,95,99 that compared cervical arthroplasty with ACDF reported NDI scores (0-100; higher scores indicate more limitations). Cervical arthroplasty was associated with a small improvement in function based on NDI scores at short (3 RCTs, N=772, MD −5.79, 95% CI −8.44 to −3.21, I2=0%),72,94,99 intermediate (4 RCTs, N=707, MD −7.69, 95% CI −10.30 to −5.10, I2=0%),63,71,94,99 and long term (3 RCTs, N=615, MD −7.63, 95% CI −10.64 to −4.52, I2=0%)71,95,99 (Figure 30).

Figure 30 is a forest plot. Mean differences were reported or calculated for three short-term trials, with a pooled mean difference of −5.79 (95% confidence interval −8.44 to −3.21) and an overall I-squared value of 0%. Mean differences were reported or calculated for four intermediate-term trials, with a pooled mean difference of −7.69 (95% confidence interval −10.30 to −5.10) and an overall I-squared value of 0%. Mean differences were reported or calculated for three long-term trials, with a pooled mean difference of −7.63 (95% confidence interval −10.64 to −4.52) and an overall I-squared value of 0%.

Figure 30

NDI scores (0-100): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

One IDE NRSI (N=352) that compared a novel cervical arthroplasty versus historical ACDF controls found that cervical arthroplasty was associated with a small improvement in function based on the NDI short term (MD 5.7, means 15.1 vs. 20.8, p<0.05); this was not sustained to intermediate term (MD 2.9, means 14.3 vs. 17.2, p>0.05).104

3.9.3.2.3.2.2. SF-36 PCS and MCS

Two IDE RCTs (N=727) (in 3 publications)72,121,122 compared two-level interventions with cervical arthroplasty and ACDF and reported SF-36 PCS and MCS scores (0-100 scale). Success for these component scores was defined as postoperative score improvement of ≥15 points from baseline scores. There was no difference between cervical arthroplasty and ACDF in the likelihood of improved function based on PCS success short term (2 RCTs, N=657, 76.5% vs. 69.3%, RR 1.11, 95% CI 0.88 to 1.46, I2=72.7%),121,122 intermediate term (2 RCTs, N=639, 83.7% vs. 79.1%, RR 1.06, 95% CI 0.92 to 1.36, I2=69.7%),72,121 and long term (1 RCT, N=216, 76.4% vs. 71.0%, RR 1.08, 95% CI 0.91 to 1.27)122 (Figure 31). Similarly, there were no differences between cervical arthroplasty and ACDF in the likelihood of MCS success at short term (2 RCTs, N=657, 50.3% vs. 45.2%, RR 1.08, 95% CI 0.82 to 1.41, I2=43.9%),121,122 intermediate term (2 RCTs, N=639, 62.3% vs. 65.3%, RR 0.98, 95% CI 0.85 to 1.18, I2=0%),72,121 and long term (1 RCT, N=216, 53.7% vs. 52.7%, RR 1.02, 95% CI 0.79 to 1.31)122 (Figure 32).

Figure 31 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.11 (95% confidence interval 0.88 to 1.46) and an overall I-squared value of 72.7%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 1.06 (95% confidence interval 0.92 to 1.36) and an overall I-squared value of 69.7%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.08 (95% confidence interval 0.91 to 1.27) and an overall I-squared value of 0%.

Figure 31

SF-36 or SF-12 PCS success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Figure 32 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.08 (95% confidence interval 0.82 to 1.41) and an overall I-squared value of 43.9%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 0.98 (95% confidence interval 0.85 to 1.18) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.02 (95% confidence interval 0.79 to 1.31) and an overall I-squared value of 0%.

Figure 32

SF-36 or SF-12 MCS success (≥15-point improvement): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty).

Three RCTs (N=792) (in 5 publications)63,71,72,94,95 that compared two-level interventions with cervical arthroplasty and ACDF reported SF-36 PCS and MCS scores (0-100 scale). Differences in mean PCS scores did not meet the threshold for a small improvement and were classified as no difference between cervical arthroplasty versus ACDF at short term (2 RCTs, N=692, MD 3.29, 95% CI 0.63 to 6.19, I2=36.6%),72,94 intermediate term (3 RCTs, N=627, MD 4.80, 95% CI 2.74 to 6.87, I2=0%),63,71,94 and long term (2 RCTs, N=535, MD 2.32, 95% CI −0.03 to 4.71, I2=0%);71,95 however, estimates were imprecise (Figure 33). Two RCTs (N=757) reported mean MCS scores which were also not different between groups at short term (1 RCT, N=380, MD 1.00, 95% CI −1.37 to 3.37),72 intermediate term (2 RCTs, N=665, MD 1.12, 95% CI −1.07 to 3.29, I2=0%),66,72 or long term (1 RCT, N=269, MD 2.90, 95% CI −0.25 to 6.05)95 (Figure 34). One IDE NRSI (N=352) that compared a novel cervical arthroplasty versus matched historical ACDF controls found no difference in mean SF-36 PCS at short (49.2 vs. 46.4, p<0.05) or intermediate term (49.2 vs. 47.9).104

Figure 33 is a forest plot. Mean differences were reported or calculated for two short-term trials, with a pooled mean difference of 3.29 (95% confidence interval 0.63 to 6.19) and an overall I-squared value of 36.6%. Mean differences were reported or calculated for three intermediate-term trials, with a pooled mean difference of 4.80 (95% confidence interval 2.74 to 6.87) and an overall I-squared value of 0%. Mean differences were reported or calculated for two long-term trials, with a pooled mean difference of 2.32 (95% confidence interval −0.03 to 4.71) and an overall I-squared value of 0%.

Figure 33

SF-36 or SF-12 PCS scores (0-100 scale): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Figure 34 is a forest plot. Mean differences were reported or calculated for one short-term trial, with a mean difference of 1.00 (95% confidence interval −1.37 to 3.37) and an overall I-squared value of 0%. Mean differences were reported or calculated for two intermediate-term trials, with a pooled mean difference of 1.12 (95% confidence interval −1.07 to 3.29) and an overall I-squared value of 0%. Mean differences were reported or calculated for one long-term trial, with a mean difference of 2.90 (95% confidence interval −0.25 to 6.05) and an overall I-squared value of 100%.

Figure 34

SF-36 or SF-12 MCS scores (0-100 scale): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.2.3.2.3. Odom’s Criteria

There was no difference between cervical arthroplasty and ACDF for the likelihood of scoring excellent or good on Odom’s criteria at intermediate term in one RCT (N=62, 96.7% vs. 84.4%, RR 1.15, 95% CI 0.97 to 1.34).63

3.9.3.2.4. Overall Success (Composite)

The FDA IDE trials were required to report on overall success, a composite outcome that included a threshold of ≥15-point NDI improvement from baseline, improvement or maintenance of neurologic status, no serious adverse events, and no additional surgical procedures that might be considered “failure” (e.g., removal, revision, supplemental fixation). Cervical arthroplasty was associated with a slightly higher likelihood of overall success short term (2 RCTs, N=693, 73.2% vs. 62.7%, RR 1.19, 95% CI 1.02 to 1.56, I2=56.2%)94,122 and long term (1 RCT, N=267, 80.4% vs. 62.2%, RR 1.29, 95% CI 1.10 to 1.52).71 At intermediate term, cervical arthroplasty was also associated with slightly greater likelihood of overall success in two RCTs individually (1 RCT, N=297, 60.1% vs. 31.2%, RR 1.95, 95% CI 1.41 to 2.69 and 1 RCT, N=307, RR 1.21, 95% CI 1.05 to 1.40)71,94 (Figure 35).

Figure 35 is a forest plot. Risk ratios were reported or calculated for two short-term trials, with a pooled risk ratio of 1.19 (95% confidence interval 1.02 to 1.55) and an overall I-squared value of 56.2%. Risk ratios were reported or calculated for two intermediate-term trials, with a pooled risk ratio of 1.47 (95% confidence interval 0.87 to 2.69) and an overall I-squared value of 85.8%. Risk ratios were reported or calculated for one long-term trial, with a risk ratio of 1.29 (95% confidence interval 1.10 to 1.52) and an overall I-squared value of 0%.

Figure 35

Overall success (composite): comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

One IDE RCT defined overall success with different NDI success criteria (improvement from baseline of ≥30 points if baseline score was ≥60 or ≥50% if baseline score was <60), required adjudication of adverse events, and added radiographic success to the criteria listed for the other IDE trials. Cervical arthroplasty was associated with slightly higher likelihood of overall success long term versus ACDF (1 RCT, N=249, 60.8% vs. 34.6%, RR 1.76, 95% CI 1.27 to 2.44).95 One IDE NRSI104 that compared a novel cervical arthroplasty versus historical ACDF controls defined overall success as ≥15-point NDI improvement, maintenance or improvement in neurological status, no serious adverse event (any implant-associated or implant/surgical procedure–associated), and no additional index-level surgical procedure. Authors reported that overall success was more common in cervical arthroplasty participants versus ACDF (N=352, 86.7% vs. 77.1%, p<0.05) based on multiple imputation modeling (numerators not reported; effect estimate could not be calculated).
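The composite definitions described above can be expressed as simple per-participant rules. The following Python sketch is illustrative only: the `Participant` record and its field names are hypothetical, the "≥50%" criterion is interpreted here as a 50 percent improvement relative to the baseline NDI score, and the additional requirements of the alternate trial (adjudicated adverse events, radiographic success) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical per-participant record; field names are illustrative."""
    ndi_baseline: float                  # NDI before surgery (0-100, higher = worse)
    ndi_followup: float                  # NDI at the follow-up visit
    neuro_maintained_or_improved: bool   # neurologic status vs. preoperative
    serious_adverse_event: bool
    additional_surgery: bool             # e.g., removal, revision, supplemental fixation


def ndi_success_standard(p: Participant) -> bool:
    """NDI success as an improvement of at least 15 points from baseline."""
    return (p.ndi_baseline - p.ndi_followup) >= 15


def ndi_success_alternate(p: Participant) -> bool:
    """At least 30 points if baseline >= 60, otherwise at least 50% of baseline.

    The percentage reading of the second criterion is an assumption.
    """
    improvement = p.ndi_baseline - p.ndi_followup
    if p.ndi_baseline >= 60:
        return improvement >= 30
    return p.ndi_baseline > 0 and improvement / p.ndi_baseline >= 0.5


def overall_success(p: Participant, ndi_rule=ndi_success_standard) -> bool:
    """Composite success: every component must be met."""
    return (
        ndi_rule(p)
        and p.neuro_maintained_or_improved
        and not p.serious_adverse_event
        and not p.additional_surgery
    )


example = Participant(40, 22, True, False, False)
print(overall_success(example))                         # 18-point gain meets the 15-point rule
print(overall_success(example, ndi_success_alternate))  # 45% gain misses the 50% rule
```

Because the same participant can count as a success under one NDI rule and a failure under the other, composite success rates from trials using different criteria are not directly comparable.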

3.9.3.2.5. Quality of Life

None of the included studies reported quality-of-life measures.

3.9.3.2.6. Reoperation

There was low-strength evidence that reoperation is substantially less likely with cervical arthroplasty compared with ACDF at all time points from 24 months and beyond (SOE: Low). Rates of reoperation for ACDF at the index level may be influenced by removal of an existing plate to treat ASD, rather than the indication for reoperation being driven by an issue at the index procedure. This may artificially inflate the reported reoperation rate at the index procedure level for ACDF versus cervical arthroplasty. The clinical relevance of removing the plate as a part of a procedure addressing ASD is minimal.

Reoperation included any additional procedure that involved the index level and was substantially less likely with cervical arthroplasty at all times reported across IDE trials; however, estimates were imprecise. Effect estimates were consistent across reported times: up to 24 months (2 RCTs, N=727, 2.8% vs. 9.2%, RR 0.28, 95% CI 0.13 to 0.61, I2=0%),65,72 36 to 48 months (1 RCT, N=330, 4.0% vs. 15.2%, RR 0.26, 95% CI 0.12 to 0.57),66 60 months (1 RCT, N=330, 4.7% vs. 18.1%, RR 0.26, 95% CI 0.13 to 0.53),80 and >60 months (2 RCTs, N=727, 4.4% vs. 15.0%, RR 0.29, 95% CI 0.16 to 0.52, I2=0%)71,95 (Figure 36). One IDE NRSI that compared a novel cervical arthroplasty versus historical ACDF controls also reported that secondary surgical interventions were less common with cervical arthroplasty (N=352, 2.2% vs. 8.8%).104

Figure 36 is a forest plot. Risk ratios were reported or calculated for two trials with 24-month follow-up, with a pooled risk ratio of 0.28 (95% confidence interval 0.13 to 0.61) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one trial with 36-to-48-month follow-up, with a risk ratio of 0.26 (95% confidence interval 0.12 to 0.57) and an overall I-squared value of 0%. Risk ratios were reported or calculated for one trial with 60-month follow-up, with a risk ratio of 0.26 (95% confidence interval 0.13 to 0.53) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two trials with more than 60-month follow-up, with a pooled risk ratio of 0.29 (95% confidence interval 0.16 to 0.52) and an overall I-squared value of 0%.

Figure 36

Reoperation at the index level: comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Subsequent surgery rates at adjacent levels were similar between cervical arthroplasty and ACDF at 24 months (2 RCTs, N= 727, 1.6% vs. 3.4%, RR 0.51, 95% CI 0.10 to 1.84, I2=19.8%),72,121 but substantially less common with cervical arthroplasty versus ACDF at 60 months (1 RCT, N=339, 3.4% vs. 11.4%, RR 0.30, 95% CI 0.13 to 0.71)80 and >60 months (2 RCTs, N=642, 6.5% vs. 15.1%, RR 0.46, 95% CI 0.25 to 0.80, I2= 0%).71,95 Across trials, indications for operation at adjacent levels were not consistently described (Figure 37).

Figure 37 is a forest plot. Risk ratios were reported or calculated for two trials with 24-month follow-up, with a pooled risk ratio of 0.51 (95% confidence interval 0.10 to 1.84) and an overall I-squared value of 19.8%. Risk ratios were reported or calculated for one trial with 60-month follow-up, with a risk ratio of 0.30 (95% confidence interval 0.13 to 0.71) and an overall I-squared value of 0%. Risk ratios were reported or calculated for two trials with more than 60-month follow-up, with a pooled risk ratio of 0.46 (95% confidence interval 0.25 to 0.80) and an overall I-squared value of 0%.

Figure 37

Subsequent surgery at adjacent level: comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

3.9.3.2.7. Harms

Cervical arthroplasty was associated with a slightly lower likelihood of experiencing any adverse event at 24 months based on low-strength evidence (SOE: Low), but there was no difference between procedures at 120 months for WHO Grade 3 or 4 adverse events (SOE: Low). There was insufficient evidence for neurological deficits or events and for mortality (SOE: Insufficient).

All IDE RCTs and one IDE NRSI provided information on adverse events and harms.

3.9.3.2.7.1. Neurologic Deficit

Two RCTs (N=395) in 3 publications63,66,95 reported neurologic events using varied terminology. One RCT (N=65)63 reported that no neurologic complications occurred with cervical arthroplasty or ACDF through 24 months. There was no difference between neurologic deterioration at 48 months (6.2% vs. 7.6%, RR 0.82, 95% CI 0.35 to 1.89) in one IDE trial66 but a subsequent publication of the trial reported substantially lower incidence of neurological failure, defined as a decrease in sensory, reflex or motor function from preoperative status, with cervical arthroplasty versus ACDF (6.4% vs. 17.1%, RR 0.36, 95% CI 0.19 to 0.70) at 84 months.95

3.9.3.2.7.2. Mortality

Cumulative mortality was similar between two-level cervical arthroplasty (2 deaths) and ACDF (3 deaths) through 120 months in one IDE trial, but authors did not provide cause of death (N=397, 1.0% vs. 1.6%; RR 0.60, 95% CI 0.10 to 3.55);71 there was one death in both groups by 12 months (0.5% vs. 0.5%)72 and two deaths in both groups by 84 months (1.0% vs. 1.1%).83

3.9.3.2.7.3. Serious Adverse Events

Serious adverse events were reported for two IDE trials (N=727) of different devices (five publications)65,66,71,72,83 but were defined differently across reports. One trial’s initial report found events were common and that fewer cervical arthroplasty (Mobi-C) participants experienced one or more serious adverse events (23.9% vs. 32.4%)65 up to 24 months but included events unrelated to the device, surgery, or cervical spine as well as those that may not have required additional medical intervention. In a subsequent report of this trial, following adjudication of events by a clinical events committee, fewer events were considered serious and they continued to be less common with cervical arthroplasty versus ACDF, but effect estimates were imprecise (1 RCT, N=330, 4.0% vs. 7.6%, RR 0.75, 95% CI 0.53 to 1.08) at 24 months.66 The IDE trial of another device (Prestige-LP) also included a broad range of events and reported fewer Grade 3 or 4 adverse events with cervical arthroplasty at 24 months versus ACDF (1 RCT, N=397, 34.4% vs. 47.9%).72 Cervical arthroplasty was associated with slightly lower likelihood of serious adverse events across the two trials at 24 months (2 RCTs, N=727, 29.3% vs. 42.3%, RR 0.73, 95% CI 0.58 to 0.93, I2=0%)65,72 using the broad definition of events. There was no difference between groups in the frequency of WHO Grade 3 or 4 adverse events at 120 months in one IDE trial (N=397, 66.7% vs. 70.9%, RR 0.93, 95% CI 0.80 to 1.09).71

3.9.3.2.7.4. Device-Related Adverse Events

Device-related adverse event definitions, types of events, and adjudication varied across RCTs. One trial included a range of events such as anatomy/technical difficulty, trauma, and neurological events, while others did not provide specifics. Some device-related events may only occur with cervical arthroplasty; others may only occur with ACDF (e.g., nonunion). Some events may not be persistent or serious (e.g., dysphagia or dysphonia). Two-level cervical arthroplasty was associated with a moderately lower likelihood of device-related events at 24 months compared with ACDF (2 RCTs, N=727, 16.6% vs. 25.6%, RR 0.61, 95% CI 0.38 to 1.01, I2=49.1%)65,72 but there was no difference between groups at 120 months in one of these trials (N=397, 26.3% vs. 23.4%, RR 1.12, 95% CI 0.80 to 1.59)71 (Figure 38). When only serious device-related adverse events were considered, as adjudicated by committee or as WHO Grade 3 or 4 events, cervical arthroplasty was associated with a substantially lower likelihood of such serious events compared with ACDF at 24 months in one trial (N=397, 1.9% vs. 5.9%, RR 0.33, 95% CI 0.11 to 1.01)72 but there was no difference between groups at 120 months in this same trial (3.8% vs. 8.1%, RR 0.48, 95% CI 0.21 to 1.11)71 or at 60 months in a second trial (N=330, 4.4% vs. 8.6%, RR 0.52, 95% CI 0.22 to 1.24);94 however, the estimates were very imprecise.

Figure 38 is a forest plot. Risk ratios were reported or calculated for two trials with 24-month follow-up, with a pooled risk ratio of 0.61 (95% confidence interval 0.38 to 1.01) and an overall I-squared value of 49.1%. Risk ratios were reported or calculated for one trial with more than 60-month follow-up, with a risk ratio of 1.12 (95% confidence interval 0.80 to 1.59) and an overall I-squared value of 0%.

Figure 38

Device-related adverse events: comparison of cervical arthroplasty with ACDF (2-level interventions). ACDF = anterior cervical discectomy and fusion; C-ADR = cervical artificial disc replacement (cervical arthroplasty); CI = confidence interval.

Device-related adverse events were similar for cervical arthroplasty and ACDF in one IDE NRSI (3.8% vs. 3.5%).104

3.9.3.2.7.5. Dysphagia

Dysphagia was reported by several RCT publications (N=475), but the severity was unclear in most cases.63,94,99 Dysphagia rate ranges were broad for cervical arthroplasty (0% to 24%) and for ACDF (0% to 38%) across these publications. One IDE trial (N=397) reported low rates of Grade 3 or 4 dysphagia that differed slightly across two post-FDA approval study publications, possibly reflecting different analytic methods. Rates did not differ by procedure at 84 months (1.3% vs. 0%)83 or 120 months (0.6% vs. 0.7%).71

3.9.3.2.7.6. Heterotopic Ossification

Grade 3 or 4 HO, considered clinically relevant HO, may be of concern with cervical arthroplasty. Across two IDE RCTs evaluating 2-level interventions (N=337, cervical arthroplasty arms), 35.4 percent of participants developed Grade 3 or 4 HO (29.7% at 60 months in 1 RCT and 42.4% at 84 months in 1 RCT).83,94 One of these trials (N=186, cervical arthroplasty arm)94 reported Grade 4 HO separately which occurred in 9.7 percent of cervical arthroplasty participants by 60 months.94 The FDA IDE NRSI (N=182, cervical arthroplasty arm) also reported HO; at the superior index level, grade 3 and 4 HO occurred in 8 participants (5%) each and at the inferior index level, in 17 (10%) and five (3%) participants, respectively, at 24 months.104 The frequency of Grade 1 or 2 HO was not consistently reported and ranged from 0 to 28.9 percent across three trials (N=278, cervical arthroplasty arms, range 31 to 209) evaluating 2-level interventions.63,72,99

3.9.3.2.8. Differential Effectiveness (HTE)

One IDE trial that compared 2-level cervical arthroplasty and ACDF provided subgroup analysis on the presence of radiculopathy alone (N=287) and myelopathy alone or myelopathy with radiculopathy (N=110) for pain, function, and adverse events at 24 and 84 months but did not formally test for interaction.73 Visual inspection of effect estimates and overlap in estimate variability and subgroup estimates suggest no differential effectiveness or harms, although the study may have been underpowered to evaluate this.

3.9.3.3. Mixed 1-, 2-, or 3-Level Cervical Arthroplasty Versus ACDF

Three RCTs compared 1-, 2-, or 3-level cervical arthroplasty and ACDF (i.e., mixed levels).62,64,74 Sample sizes ranged from 53 to 83 (total N=196). Across two trials,62,64 54 to 83 percent of participants had single-level procedures, 17 to 37 percent had 2-level procedures, and in one of these trials62 8 percent had 3-level procedures; one trial used the Bryan® disc and the other used the Prestige-II® disc, which are both FDA-approved for single-level indications only. The third trial enrolled participants who underwent 1- or 2-level procedures but did not provide the proportions for each.74 The RCTs were conducted in China, India, and Spain. Four additional NRSIs compared harms for mixed-level cervical arthroplasty and ACDF.107,110-112

3.9.3.3.1. Fusion

One RCT (N=42) reported intermediate-term fusion success in 90.5 percent of participants in the ACDF arm.62 This RCT also reported fusion in the cervical arthroplasty arm, but this can be attributed to participant crossover after initial randomization.

3.9.3.3.2. Pain

There was low-strength evidence of no difference between treatment with cervical arthroplasty and ACDF on neck pain (SOE: Low).

There was no difference in median VAS (0 to 10) neck pain scores at 60 months between cervical arthroplasty (median 3.6, interquartile range [IQR] 3.2 to 4.1) and ACDF (median 3.9, IQR 3.0 to 4.4) in one trial (N=50, p=0.203).74 No other pain measures were reported.

3.9.3.3.3. Function
3.9.3.3.3.1. Neurologic Function

There was inadequate evidence to determine the effect of cervical arthroplasty versus ACDF on neurologic function (SOE: Insufficient).

Participants who received cervical arthroplasty had higher mean JOA scores (0-17) at 36 months compared with ACDF in one RCT (N=81; 15.4 vs. 14.7 [estimated from graphs in article]; p=0.016).62

3.9.3.3.3.2. General Function

There was inadequate evidence to determine the effect of cervical arthroplasty versus ACDF on general function (SOE: Insufficient).

One RCT (N=81) reported three different measures of general function at 36 months.62 Participants who received cervical arthroplasty had better (i.e., lower) mean NDI scores (12 vs. 18 [estimated from graphs], on a 0 to 50 scale, p<0.001) and better (i.e., higher) mean SF-36 PCS scores (50.5 vs. 44.5 [estimated from graphs], on a 0 to 100 scale, p<0.05) compared with ACDF, but there were no differences between treatments in the proportion of participants who achieved an excellent (58.5% vs. 58.5%, RR 1.02, 95% CI 0.70 to 1.47) or good (34.1% vs. 25%, RR 1.37, 95% CI 0.69 to 2.71) result according to Odom’s criteria. A second RCT (N=50) reported no difference between groups in NDI scores (median 7, IQR 6 to 8, for both groups) at 60 months.74

3.9.3.3.4. Quality of Life

None of the included studies reported on quality-of-life measures.

3.9.3.3.5. Harms

There was inadequate evidence to determine the effect of cervical arthroplasty versus ACDF on harms or adverse events (SOE: Insufficient).

Two RCTs62,64 and four NRSIs107,110-112 reported harms and adverse events.

3.9.3.3.5.1. Neurological Complications

One RCT (N=53) reported one case of transient recurrent nerve paralysis in both groups (cervical arthroplasty 4% vs. ACDF 3.6%, RR 1.12, 95% CI 0.07 to 16.98) that resolved within 3-4 weeks and one case of postoperative worsening of arm pain and neurological deficit in the ACDF group (3.6%).64 A second trial (N=83) reported that no intraoperative neurologic complications occurred in either group.62 One large NRSI based on administrative data reported no difference between cervical arthroplasty and ACDF in the frequency of neurological complications (cervical arthroplasty 1.6% vs. ACDF 1.7%, adjusted OR 1.18, 95% CI 0.38 to 3.72); however, specific types or timing of neurological events were not reported.107 Another large NRSI (N=1,014) that conducted a propensity score matched analysis reported no differences between treatment arms in the frequency of limb paralysis through 30 days (2.4% vs. 2.4%) and 12 months (8.9% vs. 7.5%); no other details were provided.111 This same study reported spinal complications (0% vs. 0.4% at 30 days; 0% vs. 1.0% at 12 months), neurological complications (0% at 30 days; 0.4% vs. 0.2% at 12 months), and nerve root complications (none at any time), but again no specifics were given.
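The risk ratio and confidence interval reported above for transient recurrent nerve paralysis can be checked with a short calculation. The sketch below is illustrative only: it assumes arm sizes of 25 (cervical arthroplasty) and 28 (ACDF), inferred from the reported percentages and the total N=53, and it assumes a Wald interval on the log scale, which the report does not state was the method used. The function name is hypothetical.

```python
import math

def risk_ratio_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio with a Wald confidence interval computed on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Assumed counts: 1 of 25 (arthroplasty) vs. 1 of 28 (ACDF)
rr, lo, hi = risk_ratio_wald(1, 25, 1, 28)
print(f"RR {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")  # RR 1.12, 95% CI 0.07 to 16.98
```

With a single event per arm the interval spans more than two orders of magnitude, which illustrates why sparse adverse event data of this kind yield insufficient evidence. Under the same assumed arm sizes, the index-level reoperation counts from this trial (1 of 25 vs. 2 of 28) give RR 0.56 (95% CI 0.05 to 5.81).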

3.9.3.3.5.2. Mortality

One RCT (N=83) reported that no deaths occurred in either group through 90 months.62 Mortality was rare for both cervical arthroplasty and ACDF across two large NRSIs based on administrative data and there was no difference between procedures: 0.5 and 2.2 percent, respectively, (OR 0.56, 95% CI 0.08 to 4.11) in one NRSI (N=143,060)107 and 0.6 versus 0 percent through 12 months postoperative in the other (N=1,014 after matching).111

3.9.3.3.5.3. Serious Adverse Events

One RCT (N=83) reported one case of DVT (2.4%) in the cervical arthroplasty group.62 There were no differences between cervical arthroplasty and ACDF in the frequency of pulmonary embolism (0.5% vs. 0.8%, OR 1.43, 95% CI 0.19 to 10.7) or DVT (2.2% vs. 2.4%, OR 1.07, 95% CI 0.33 to 3.40) in one large NRSI (N=143,060).107 Similarly, there were no differences between cervical arthroplasty and ACDF in the risk of thromboembolic events across two large NRSIs that performed propensity-score matching (N=1,014 and 1,368): pulmonary embolism at 30 days postoperative (0% vs. 0.2%-0.3%, respectively) in both studies111,112 and through 12 months in one study (1.0% vs. 0.8%)111 and DVT at 30 days postoperative in one study (0% vs. 0.3%).112

One RCT (N=83) reported that no cerebrospinal fluid leakage occurred.62 Cerebrospinal fluid leak was rare for both cervical arthroplasty (0.5%) and ACDF (0.2%) and there was no difference between procedures (OR 2.19, 95% CI 0.29 to 16.3) in one large NRSI based on administrative data.107

In one RCT (N=53), one participant (3.6%) who underwent 2-level ACDF developed a wound hematoma that needed urgent evacuation;64 another RCT reported that there were no cases of wound hematoma.62 One of these trials reported that three ACDF participants (10.7%, N=28) had recurrent cervical pain between 3 and 6 months which required local infiltration (not further explained).64 There were no cases of wound dehiscence at 30 days in one NRSI (N=1,368 after matching)112 and similar frequencies of wound complications for cervical arthroplasty and ACDF through 12 months in a second NRSI (N=1,014 after matching),111 but the severity was unclear.

One case (2.4%, N=41) of heterotopic ossification (i.e., spontaneous fusion/bridging bone) was reported in the cervical arthroplasty group in another RCT.62

Although dysphagia was reported in one RCT62 and two NRSIs,107,111 the severity of dysphagia was unclear. A number of other serious or potentially serious adverse events were reported across the two large NRSIs that conducted propensity score matched analyses (N=2,382). These events were rare and occurred with similar frequency in the cervical arthroplasty and ACDF groups, respectively, through 30 days: cerebrovascular accident (0% vs. 0%-0.6%), sepsis or septic shock (0% vs. 0% to 0.2%), myocardial infarction (0% to 0.1%, both groups), mechanical ventilation (0%; 1 NRSI),112 unplanned intubation (0.3% vs. 0%; 1 NRSI),112 deep infection (0%; 1 NRSI),112 cellulitis (0% vs. 0.2%; 1 NRSI)111 and dural tear (0.2% vs. 0%; 1 NRSI).111 One of these studies reported events through 12 months, with more cerebrovascular accidents reported in the ACDF group (0% vs. 2.4%, p<0.001); there were no differences between groups for all other longer-term adverse events (dural tear, 0.6% vs. 0%; myocardial infarction, 0.4% vs. 0.6%; sepsis, 0.6% vs. 1.0%; cellulitis, 2.0% vs. 2.2%).

3.9.3.3.6. Reoperation and Subsequent Surgery

One RCT (N=53) reported reoperation at the index level in one (4%) cervical arthroplasty and two (7.1%) ACDF participants between 12 and 36 months (RR 0.56, 95% CI 0.05 to 5.81).64 A second trial (N=83) reported that no participants in either group required reoperation at the index level through 36 months.62 One NRSI did not provide adjusted effect estimates but reported the proportions of cervical arthroplasty and ACDF patients who required reoperation at the index level at 12 months (1.7% vs. 2.4%) and 24 months (0% vs. 3.6%) and subsequent surgery at adjacent levels at 12 months (1.7% vs. 2.4%) and 24 months (3.3% vs. 5.1%).110 Across the two NRSIs that did attempt to control for confounding (propensity score adjusted analyses), over the first 30 postoperative days, 0.4 percent of cervical arthroplasty versus 1.0 percent of ACDF patients underwent any reoperation (not further specified) in one study (N=1,368);112 in the other study (N=1,014), 2.8 versus 1.0 percent had a revision surgery, 0.4 versus 0.2 percent had a drainage/evacuation, and no patient had a hardware removal.111 At 12 months in the latter study, the proportion of patients requiring revision surgery rose to 10.7 versus 7.1 percent; the need for drainage/evacuation (0.8% for both) and hardware removal (0.2% for both) remained low.

3.9.3.3.7. Differential Effectiveness (HTE)

None of the included trials that compared 1-, 2-, or 3-level cervical arthroplasty and ACDF interventions reported differential effectiveness based on patient or other characteristics.

3.10. Key Question 9: In patients undergoing anterior cervical discectomy and fusion, what are the comparative effectiveness and harms of surgery based on interbody graft material or device type?

3.10.1. Standalone Cage Versus Plate and Cage

3.10.1.1. Key Findings

  • There was moderate-strength evidence of no difference in fusion rates between standalone cages versus plate and cage (SOE: Moderate).
  • There was low-strength evidence of no differences between standalone cages versus plate and cage on arm pain, function, and quality of life (SOE: Low); there was inadequate evidence for neck pain (SOE: Insufficient).
  • There was low-strength evidence of no difference between standalone cage versus plate and cage on adjacent-level ossification (SOE: Low); evidence was inadequate for subsidence (sinking of vertebral endplates around the graft) and other adverse events (SOE: Insufficient).

3.10.1.2. Description of Included Studies

Nine RCTs (N=619)124-132 compared a standalone device with a traditional plate and cage (Appendix C). The average mean followup duration was 21 months (range immediately postoperative to 36 months). Six trials were conducted in China, two in the United States, and one each in Germany and Japan.

The average study mean age of participants was 52 years (range 41 years to 63 years); the average proportion of females was 42 percent (range 9% to 54%). Few trials reported exact proportions of patients with radiculopathy, myelopathy, or myeloradiculopathy. One trial enrolled only participants with radiculopathy without myelopathy130 and two trials enrolled only participants with myelopathy but did not report the proportion of participants with radiculopathy.127,129 Most trials enrolled participants with 1-level disease,126,128,130 1- to 2-level disease,131,132 or 2-level disease.125 One trial each treated participants with 1- to 3-level disease,124 3-level disease,127 and 2- to 4-level disease.129

All studies were rated moderate risk of bias with the exception of one trial that was rated high risk of bias (Appendix D).126 Methodological limitations included unclear randomization techniques, unclear blinding, and unclear attrition. Evidence for neck pain in standalone devices versus traditional plate and cage was rated insufficient due to conflicting findings. Evidence for harms other than adjacent-level ossification was rated insufficient due to the infrequency of adverse events (Appendix G).

3.10.1.3. Detailed Analysis

3.10.1.3.1. Fusion

There was moderate-strength evidence of no difference in fusion rates between standalone cages versus plate and cage in participants undergoing ACDF (SOE: Moderate).

Almost all participants who underwent ACDF with either a standalone cage or with a traditional plate and cage (N=515) experienced fusion at 12 months (4 RCTs, N=178, 94% vs. 97%, RR 0.99, 95% CI 0.92 to 1.06, I2=0%), 24 months (2 RCTs, N=150, 95% vs. 95%, RR 1.00, 95% CI 0.93 to 1.08, I2=0%) and 36 months (2 RCTs, N=187, 100% vs. 100%, RR 1.00, 95% CI 0.97 to 1.03, I2=0%) (Figure 39). This was true when fusion was limited to one level or involved multilevel fusion. One trial did not report fusion as an outcome.131 (SOE: Moderate)

Figure 39 is a forest plot. Risk ratios were reported or calculated for eight randomized controlled trials comparing effects of standalone cage with traditional plate and cage on fusion rates after ACDF surgery. At 12 months, with 4 randomized controlled trials, the pooled risk ratio was 0.99 (95% confidence interval 0.92 to 1.06) and an I-squared value of 0%. At 24 months, with two randomized controlled trials, the pooled risk ratio was 1.00 (95% confidence interval 0.93 to 1.08) and an I-squared value of 0%. At 36 months, with two randomized controlled trials, the pooled risk ratio was 1.00 (95% CI 0.97 to 1.03), with an I-squared of 0%.

Figure 39

Fusion, standalone cage versus traditional plate and cage. CSLP = cervical spine locking plate; CI = confidence interval; PEEK = polyetheretherketone; PL = profile likelihood; ROI-C = ROI-C implant system; Zero-P = zero-profile

3.10.1.3.2. Pain

There was low-strength evidence of no difference between standalone cages versus plate and cage on arm pain (SOE: Low), with inadequate evidence to determine the benefits and harms of the two approaches on neck pain (SOE: Insufficient).

Four RCTs (N=230) reported changes in overall pain (pain location not specified) or neck pain using a visual analogue scale (VAS: 0-10 or 0-100) across various followup times ranging from less than 3 months to 24 months (Figure 40). Although neck pain was moderately, though not statistically significantly, greater at less than 3 months (MD −0.90, 95% CI −1.29 to 0.73) with a plate and cage compared with a standalone cage, the opposite was true at 6 months (MD 0.64, 95% CI −0.66 to 2.17). When pooled analysis was limited to trials of single-level disease, there were no differences in neck pain between standalone cage and plate and cage (Appendix F, Figure F-6).

Four RCTs (N=186) reported changes in arm pain using a visual analogue scale (VAS: 0-10 or 0-100) across various followup times. There were no differences in arm pain after ACDF between use of a standalone cage and a plate and cage at any time point from less than 3 months (MD −0.24, 95% CI −1.55 to 1.12) to 24 months (MD 0.20, 95% CI −0.09 to 0.49) (Figure 41). When analyses were limited to trials of single-level disease, there remained no difference in arm pain between fusion methods (Appendix F, Figure F-7).

Figure 40 is a forest plot. Mean differences were reported or calculated for four randomized controlled trials comparing standalone cage with traditional plate and cage on improvement in neck or unspecified pain after ACDF surgery. At less than three months, with two randomized controlled trials, the pooled mean difference was −0.90 (95% confidence interval −1.29 to 0.73) and an I-squared value of 79.6%. At three months, with one randomized controlled trial, the mean difference was 0.20 (95% confidence interval −0.35 to 0.75). At 6 months, with three randomized controlled trials, the pooled mean difference was 0.64 (95% confidence interval −0.66 to 2.17), with an I-squared of 73%. At 12 months, with three randomized controlled trials, the pooled mean difference was 0.30 (95% confidence interval −0.54 to 1.43), with an I-squared of 64.4%. At 24 months, with one randomized controlled trial, the mean difference was −0.20 (95% confidence interval −0.63 to 0.23) with an I-squared of 0%.

Figure 40

Neck/unspecified pain after ACDF. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; PEEK = polyetheretherketone; PL = profile likelihood; ROI-C = ROI-C implant system; SD = standard deviation; Zero-P = zero-profile

Figure 41 is a forest plot. Mean differences were reported or calculated for four randomized controlled trials comparing standalone cage with traditional plate and cage on improvement in arm pain after ACDF surgery. At less than three months, with two randomized controlled trials, the pooled mean difference was −0.24 (95% confidence interval −1.55 to 1.12) and an I-squared value of 0%. At three months, with two randomized controlled trials, the pooled mean difference was 0.06 (95% confidence interval −0.57 to 0.58) with an I-squared of 0%. At 6 months, with four randomized controlled trials, the pooled mean difference was −0.15 (95% confidence interval −0.56 to 0.14), with an I-squared of 14.6%. At 12 months, with three randomized controlled trials, the pooled mean difference was −0.11 (95% confidence interval −0.55 to 0.29), with an I-squared of 0%. At 24 months, with one randomized controlled trial, the mean difference was 0.20 (95% confidence interval −0.09 to 0.49), with an I-squared of 0%.

Figure 41

Arm pain following ACDF. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; CSLP = cervical spine locking plate; PEEK = polyetheretherketone; PL = profile likelihood; ROI-C = ROI-C implant system; SD = standard deviation; Zero-P (more...)

3.10.1.3.3. Function
3.10.1.3.3.1. Neurologic Function

There was low-strength evidence of no difference between standalone cages versus plate and cage in neurologic function (SOE: Low).

Five RCTs (N=424) reported changes on the JOA (lower score = worse disability, score 0 to 17) after ACDF using a standalone cage or a plate and cage (Figure 42). At less than 3 months, pooled analysis of two trials indicated moderately greater, although not statistically significant, JOA scores with a standalone cage versus a plate and cage (MD 2.63, 95% CI −3.86 to 9.29); this effect was driven by one of the two trials, whereas the other trial found no effect. At longer followup times, there were no differences between treatments on JOA scores.

Figure 42 is a forest plot. Mean differences were reported or calculated for four randomized controlled trials comparing standalone cage with traditional plate and cage on improvement in JOA scores after ACDF surgery. At less than three months, with two randomized controlled trials, the pooled mean difference was 2.63 (95% confidence interval −3.86 to 9.29) and an I-squared value of 98.1%. At three months, with one randomized controlled trial, the mean difference was 0.00 (95% confidence interval −1.70 to 1.70). At 6 months, with three randomized controlled trials, the pooled mean difference was −0.08 (95% confidence interval −0.70 to 0.59), with an I-squared of 0%. At 12 months, with three randomized controlled trials, the pooled mean difference was −0.08 (95% confidence interval −0.56 to 0.46), with an I-squared of 0%. At 24 months, with two randomized controlled trials, the pooled mean difference was 0.00 (95% confidence interval −0.69 to 0.69), with an I-squared of 0%. At 36 months, with two randomized controlled trials, the pooled mean difference was −0.13 (95% confidence interval −1.03 to 0.81), with an I-squared of 0%.

Figure 42

JOA scores following ACDF. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; CSLP = cervical spine locking plate; JOA = Japanese Orthopaedic Association Scale; PEEK = polyetheretherketone; PL = profile likelihood; ROI-C = ROI-C (more...)

3.10.1.3.3.2. General Function

There was low-strength evidence of no difference between standalone cages versus plate and cage in general function (SOE: Low).

Six RCTs (N=472) reported changes on the NDI (higher score = worse disability, 0-50 raw score or 0% to 100%) following ACDF with either a standalone cage or a plate and cage (Figure 43). With the exception of the less-than-3-months timepoint, there were no differences between ACDF with a standalone cage or plate and cage on NDI scores. At less than 3 months, study findings varied, and although the pooled estimate slightly favors the standalone cage (MD −5.39, 95% CI −9.91 to 5.19), it is driven by the largest of the three studies and should be interpreted with caution.

Figure 43 is a forest plot. Mean differences were reported or calculated for six randomized controlled trials comparing standalone cage with traditional plate and cage on improvement in NDI scores after ACDF surgery. At less than three months, with three randomized controlled trials, the pooled mean difference was −5.39 (95% confidence interval −9.91 to 5.19) and an I-squared value of 73.7%. At three months, with three randomized controlled trials, the pooled mean difference was −0.14 (95% confidence interval −3.14 to 2.16) with an I-squared of 0%. At 6 months, with four randomized controlled trials, the pooled mean difference was −0.08 (95% confidence interval −3.25 to 4.70), with an I-squared of 53.1%. At 12 months, with four randomized controlled trials, the pooled mean difference was −0.13 (95% confidence interval −2.31 to 1.59), with an I-squared of 0%. At 24 months, with two randomized controlled trials, the pooled mean difference was −0.13 (95% confidence interval −2.41 to 2.04), with an I-squared of 0%. At 36 months, with two randomized controlled trials, the pooled mean difference was 0.15 (95% confidence interval −2.73 to 2.88), with an I-squared of 0%.

Figure 43

NDI scores following ACDF. ACDF = anterior cervical discectomy and fusion; CI = confidence interval; NDI = Neck Disability Index; PEEK = polyetheretherketone; PL = profile likelihood; ROI-C = ROI-C implant system; SD = standard deviation; Zero-P = zero-profile

Additionally, one trial (N=41) reported no difference at 24 months between a standalone zero-profile device (Zero-P) and a plate and cage on the German version of the Neck Pain Disability Index (25.8% vs. 22.2%, p-value not reported).125

One RCT (N=46) reported no difference between a standalone cage and plate and cage at 24 months on the Odom’s criteria (Excellent: 46% vs. 55%; Good: 54% vs. 45%; Fair: 0% vs. 0%; Bad: 0% vs. 0%),130 while another trial (N=41) reported the mean Odom’s Criteria at 24 months was 3.2 with a standalone cage compared with 3.5 with plate and cage (p-value not reported).125 A third trial (N=115) reported there were no differences between standalone cage versus plate and cage in ratings of “excellent” and “good” overall patient satisfaction (Excellent: 44% vs. 47%, p=0.763; Good: 33% vs. 29%, p=0.835; Fair: 23% vs. 24%, p=0.692; Poor: 0% vs. 0%, p=1.0) at 36 months.124

3.10.1.3.4. Quality of Life

There was low-strength evidence of no difference between standalone cages versus plate and cage in quality of life (SOE: Low).

One RCT (N=40) reported no differences in quality of life as assessed with the Veteran’s RAND 12-Item Health Survey between treatment with a standalone cage versus a plate and cage at 6 weeks and at 12 months, although participants treated with a standalone cage reported better scores at 6 months postoperatively (38.38 vs. 26.27, p=0.033).132

Five RCTs (N=253) assessed swallowing before and after treatment with a standalone cage versus a plate and cage with mixed results.125-128,132 Two trials used the Swallowing Quality of Life questionnaire,127,132 two trials rated severity of dysphagia symptoms as “None”, “Mild”, “Moderate”, and “Severe”125,128 and one trial used the Eating Assessment Tool.126 No trial reported differences in dysphagia scores between treatments beyond 3 months postoperatively. One trial reported worse dysphagia scores with plate and cage immediately postoperatively, at 1 month, and at 3 months but no difference at 12 months.128 Another trial reported worse scores with plate and cage at 6 weeks but no differences at 6 and 12 months.132 There were no differences between dysphagia scores at any time from the postoperative period to 12 months in one RCT126 and no differences at 36 months (only time reported) in another trial.127 One trial reported no patient rated dysphagia as “moderate” or “severe” with either treatment125 and no study reported that dysphagia required medical intervention (e.g., return to the operating room, percutaneous endoscopic gastrostomy tube placement).

One RCT (N=54) rated high risk of bias found no differences on the Voice Handicap Index between treatment with a standalone cage versus plate and cage from discharge to 12 months.126

3.10.1.3.5. Harms

There was low-strength evidence of no difference between standalone cage versus plate and cage on adjacent-level ossification (SOE: Low), while evidence for subsidence and other adverse events was inadequate (SOE: Insufficient).

Seven RCTs (N=518) reported adverse events.124,127-132 Three trials reported substantially less adjacent-level ossification development with a standalone cage than with plate and cage (N=239, 8% vs. 27%, RR 0.25, 95% CI 0.12 to 0.52, I2=8%). The change in adjacent-level ossification development severity grade (0=no ossification, 3=severe ossification) was reported in one study and favored treatment with the standalone cage (0.208 vs. 0.818, p=0.001).130 (SOE: Low) However, no patient required reoperation at 36 months in two trials;124,127 reoperation rates were not reported in the third trial.130

One RCT (N=46) reported a small, but not statistically significant difference in subsidence (loss of disc height) rates with a standalone cage compared with a plate and cage at 12 months (12.5% vs. 9.1%, RR 1.38, 95% CI 0.25 to 7.48) and at 24 months (16.7% vs. 13.6%, RR 1.22, 95% CI 0.31 to 4.87).130

One trial (N=104) reported few total complications (N=11) in 24 months that included one nerve injury (2%) and no cerebrospinal fluid leaks (0%) with the standalone cage compared with two nerve injuries (4%) and one cerebrospinal fluid leak (2%) with the plate and cage (p=0.999; p=1.00, respectively).129 One trial (N=90) reported one (2%) incident of loosening of the internally fixed implant with the standalone cage versus three (7%) with plate and cage (p=0.333).131 Another trial (N=40) reported that a participant treated with a standalone cage experienced screw loosening, interbody subsidence, and C-5 fracture, with revision surgery under consideration at trial publication.132 The same trial also reported that one participant treated with a plate and cage experienced screw fracture and pseudarthrosis and underwent posterior fusion and decompression 14 months after the primary surgery.

3.10.2. Titanium Versus PEEK Cages

3.10.2.1. Key Findings

  • There was low-strength evidence of a greater likelihood of fusion with a PEEK cage compared with a titanium or titanium-coated PEEK cage (SOE: Low).
  • There was low-strength evidence of greater likelihood of improved general function with a PEEK cage versus a titanium cage (SOE: Low); evidence for neurologic function was inadequate (SOE: Insufficient).
  • Evidence for subsidence and other adverse events was inadequate (SOE: Insufficient).

3.10.2.2. Description of Included Studies

Three RCTs (N=217) compared ACDF using a titanium cage or titanium-covered PEEK cage versus a PEEK cage (Appendix C).133-135 The average study mean duration of followup was 45 months (range 12 months to 99.7 months). One study each was conducted in China, Taiwan, and Poland.

The average study mean age of participants was 50 years (range 46 years to 52 years); the proportion of female participants was 49 and 45 percent in two trials, with one trial reporting that 72 percent of 170 disc spaces belonged to women. Two RCTs reported radiculopathy was experienced by 3 and 75 percent, myelopathy by 11 and 57 percent, and myeloradiculopathy by 13 and 40 percent.133,134 The third trial did not report myeloradicular symptoms. One trial each enrolled participants with 1-level (66%) or 2-level (34%) disease,134 3-level disease,133 or disease at 1 or more levels.135

All studies were rated moderate risk of bias (Appendix D). Methodological limitations included unclear randomization techniques, unclear blinding, and lack of intention to treat analysis. No funds were received in one trial133 and funding was not reported in the other two. Evidence for neurologic function was rated insufficient due to limited evidence from one small trial. Evidence for subsidence was rated insufficient due to conflicting findings, while evidence for other harms was insufficient due to few adverse events (Appendix G).

3.10.2.3. Detailed Analysis

3.10.2.3.1. Fusion

There was low-strength evidence of a greater likelihood of fusion with a PEEK cage compared with a titanium or titanium-coated PEEK cage (SOE: Low).

Three RCTs (N=217) reported ACDF fusion rates at different followup times that were not different between titanium and PEEK cages or that favored PEEK cages.

One trial reported that at a mean of 99.7 months (range 86 to 116 months) all participants (N=60) achieved fusion of their 3-level disease with both the titanium cage and with the PEEK cage (87/87 levels vs. 93/93 levels).133 However, followup was not available for 25 percent of the original participants. A second trial (N=53) reported a lower likelihood of fusion with the titanium cage (32/37 levels, 86.5%) versus the PEEK cages (34/34 levels, 100%, p=0.0335) after 24 months.134 The third RCT (N=104) reported a large difference in the likelihood of complete fusion that favored the PEEK cage with complete fusion achieved in 26 of 59 titanium-covered PEEK cages implanted (44.1%) compared with 75 of 85 PEEK cages implanted (88.2%) at 12 months (p<0.001).135 Partial fusion was achieved by 55.9 percent of participants with titanium-covered PEEK cages and 11.76 percent of participants with PEEK cages.135 There were no instances of an absence of fusion.135
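The 24-month comparison above (32 of 37 levels fused with titanium vs. 34 of 34 with PEEK) can be examined with an exact hypergeometric calculation. The sketch below is an illustration, not the trial authors' analysis: the report does not name the test behind p=0.0335, and the calculation treats fused levels as independent units, which is an assumption. The function name is hypothetical. Because the observed table is the most extreme possible in its direction (all five non-fused levels in the titanium arm), the one-sided Fisher exact tail reduces to a single term.

```python
from math import comb

def prob_all_failures_in_one_arm(n_arm, n_other, failures):
    """Hypergeometric probability that every failure falls in the arm of size n_arm."""
    return comb(n_arm, failures) / comb(n_arm + n_other, failures)

# 37 titanium levels, 34 PEEK levels, 5 non-fused levels (all in the titanium arm)
p_one_sided = prob_all_failures_in_one_arm(37, 34, 5)
print(f"{p_one_sided:.4f}")  # 0.0335
```

The result agrees with the reported p-value to four decimal places, which is consistent with, but does not confirm, a one-sided exact test having been used.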

3.10.2.3.2. Function
3.10.2.3.2.1. Neurologic Function

There was inadequate evidence of the benefits and harms of PEEK cage versus titanium cage on neurologic function (SOE: Insufficient).

One RCT (N=60) found JOA scores improved from baseline (baseline: 9.6 vs. 9.8) with both a titanium implant and a PEEK implant, but improvement was moderately greater with the PEEK implant (12.8 vs. 14.2, endpoint difference: −1.4, 95% CI −2.33 to −0.47).133

3.10.2.3.2.2. General Function

There was low-strength evidence of improved general function with a PEEK cage compared to a titanium cage (SOE: Low).

The same trial above (N=60) also found moderately improved NDI scores from baseline (baseline: 36.2 vs. 35.4) with both the titanium and the PEEK implant, but improvement was greater with the PEEK implant (21.6 vs. 15.2, endpoint difference: 6.4, 95% CI 5.13 to 7.67).133
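When only an estimate and its 95% confidence interval are reported, as for the NDI endpoint difference above, the implied standard error can be recovered from the interval width. The sketch below assumes the interval was built from a normal approximation (estimate ± 1.96 × SE), which the report does not state; the function and variable names are illustrative.

```python
def se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric, normal-approximation confidence interval."""
    return (upper - lower) / (2 * z)

# NDI endpoint difference, titanium vs. PEEK: 6.4 (95% CI 5.13 to 7.67)
ndi_diff, ndi_lower, ndi_upper = 6.4, 5.13, 7.67
se = se_from_ci(ndi_lower, ndi_upper)
z_stat = ndi_diff / se
print(f"SE {se:.2f}, z {z_stat:.1f}")  # SE 0.65, z 9.9
```

The interval is symmetric about the estimate (6.4 ± 1.27), so the normal-approximation assumption is at least plausible here.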

Two RCTs (N=113) reported results on Odom’s criteria that favored PEEK cages, although differences were not statistically significant in one trial.133,134 One trial (N=60) reported moderately worse clinical status according to Odom’s criteria with the titanium cage versus the PEEK cage (Excellent: 24% vs. 35%; Good: 31% vs. 39%; Fair: 28% vs. 16%; Bad: 17% vs. 10%, p<0.05).133 One trial (N=53) reported no difference between treatments on clinical status (Excellent: 21% vs. 28%; Good: 54% vs. 52%; Fair: 14% vs. 8%; Poor: 11% vs. 12%; or successful treatment: 75% vs. 80%, p=0.6642).134 In the trial where enrollment was limited to individuals with 3-level disease, treatment with the PEEK cage was associated with better clinical status, whereas in the trial of 1- and 2-level disease, there were no differences between cage materials on perceived improvement. Additionally, the followup times differed greatly between trials (99.7 months vs. 24 months), with the longer followup time associated with better ratings.

3.10.2.3.3. Quality of Life

No studies reported quality of life outcomes.

3.10.2.3.4. Harms

Evidence was inadequate to determine the effect of a PEEK cage versus a titanium cage on subsidence or other adverse events (SOE: Insufficient).

One RCT (N=104) found no difference between a titanium-coated PEEK implant and a PEEK implant on the incidence of subsidence in 166 levels (20.6% vs. 21.4%, p=0.875).135 However, subsidence was reported with 34.5% of titanium cages (87 levels) compared with 5.4% of PEEK cages (93 levels) in a second RCT (N=60, p<0.05)133 and 16.2% of 37 levels versus 0% of 34 levels in a third RCT (N=53, p<0.001).134 All three trials defined subsidence similarly (≥ 3 mm of interspace collapse). The reason for the difference in study findings is unclear; possibilities include the cage materials (a titanium-coated PEEK cage may perform differently than a titanium cage) and the duration since ACDF (12 months in the trial that found no difference versus 24 months and 99.7 months in the other two trials) (SOE: Insufficient).

One RCT (N=53) reported that after 24 months, there were no neurovascular injuries and no revision surgeries with either the titanium cage or the PEEK cage, but that one patient, who received the titanium cage, experienced a hematoma that was removed the day after surgery.134 One RCT (N=60) reported that at a mean of 99.7 months two patients treated with a titanium cage experienced cage dislocation but were asymptomatic.133

3.10.3. Autograft, Allograft, and Other Osteogenic Materials

3.10.3.1. Key Findings

  • There was inadequate evidence to determine comparative benefits (fusion, pain reduction, improved function, improved quality of life) for any osteogenic material versus any other osteogenic material (SOE: Insufficient).
  • There was low-strength evidence that the use of bone morphogenetic protein 2 (BMP-2) in the cervical spine was associated with increased complications compared to no BMP-2 (SOE: Low); evidence was inadequate to determine the comparative harms of other osteogenic materials (SOE: Insufficient).

3.10.3.2. Description of Included Studies

Six RCTs (N=637) compared autologous bone graft, allograft, and/or other materials to support fusion in ACDF (Appendix C).136–141 The average mean followup duration was 17 months (range 6 months to 24 months). Two trials were conducted in the United States, two in China, and one each in South Korea and India.

The average study sample size was 106 (range 32 to 319); the average study mean age was 49 years (range 43 years to 55 years). One trial did not report age of participants.139 The mean proportion of females enrolled was 52 percent (range 30% to 66%). The average proportion of patients with radiculopathy was 61 percent (range 28% to 100%), the average proportion of patients with myelopathy was 21 percent (range 0% to 38%), and the average proportion of patients with myeloradiculopathy was 18 percent (range 0% to 34%). One trial reported that all study participants had radiculopathy, myelopathy or both.140 All participants enrolled had 1-level degenerative disease,137,141 1- to 2-level disease136,138,140 or 1- to 3-level disease.139

Additionally, two NRSI (N=944) assessed heterotopic ossification and complications due to neck swelling with the use of BMP-2 compared to anterior cervical fusion without BMP-2.142,143 The mean age in one NRSI was 51 years with 51 percent female and 24 percent of study participants having myelopathy and 1 or more levels fused.143 The other nonrandomized study, which took data from multiple investigational device exemption trials, did not report aggregate baseline patient characteristics but used propensity scoring on 28 predefined demographic and preoperative variables.142

One RCT was rated high risk of bias139 and the remaining RCTs were rated moderate risk of bias (Appendix D). Methodological limitations included unclear randomization methods, unclear blinding, and unclear attrition. Both NRSIs were also rated moderate risk of bias and were downgraded due to baseline differences between study groups on prognostic variables and unclear blinding of outcome assessor. Two trials each reported industry funding, nonprofit funding, and grant funding; one trial did not address funding. One NRSI used data from three Investigational Device Exemption (IDE) trials,142 while the other reported no funds or support from industry.143 Evidence comparing allograft, autograft, and other osteogenic materials on likelihood of fusion, pain improvement, function, and overall harms (with the exception of BMP-2 use) was rated insufficient due to limited evidence for each comparison (Appendix G).

3.10.3.3. Detailed Analysis

3.10.3.3.1. Fusion

There was inadequate evidence to determine the comparative benefits and harms of autograft, allograft, or other osteogenic material versus any other osteogenic material on fusion (SOE: Insufficient).

Six RCTs (N=534) assessed ACDF with autograft, allograft, or other materials (e.g., hydroxyapatite, calcium sulphate) and found no differences between materials in achievement of spinal fusion (Table 3). Fusion rates for all materials were high for all trials but only one randomized study was available for each comparison.

Table 3. Fusion with ACDF using various osteogenic materials.

3.10.3.3.2. Pain

There was inadequate evidence to determine the comparative benefits and harms of autograft, allograft, or other osteogenic material versus any other osteogenic material on neck or arm pain (SOE: Insufficient).

Five RCTs (N=440) assessed neck and arm pain using a VAS or a numerical (pain) rating scale (Tables 4 and 5). One small trial (N=27) reported a moderately greater decrease in neck pain 12 months after ACDF with a local graft and titanium cage than with allograft and titanium cage (MD −6.15 vs. −5.09, p<0.05).136 Another trial (N=20) found a moderate, though not statistically significant, improvement in neck pain with BMP-2 and allograft ring versus iliac crest bone graft and an allograft ring on a 20-point numerical rating scale (MD 13.0 vs. MD 9.0, p>0.05).140

One trial (N=27) also found a substantially greater decrease in arm pain with local graft and a titanium cage compared with allograft and the same cage (MD −7.24 vs. MD −4.55, p<0.05)136 (Table 5). However, these results should be interpreted with caution due to the trial’s small sample size. One RCT (N=26) reported a substantially greater reduction in arm pain at 24 months with BMP-2 and allograft ring compared with iliac crest bone graft and allograft ring on a 20-point numerical rating scale (−14 vs. −8.5, p<0.03).140 However, as above, these results should be interpreted with caution due to the small sample size. One RCT (N=244) found that ACDF with i-Factor (bone graft made of a peptide bound to an inorganic bone mineral) and an allograft ring was associated with improved VAS arm pain scores at 24 months (1.56 vs. 1.95, p=0.0306) compared with local graft and an allograft ring.137 However, this small difference in scores is below the threshold for a small effect and may not be clinically meaningful. One RCT (N=77) found a small, although not statistically significant, improvement in arm pain at 12 months with hydroxyapatite, demineralized bone matrix and a PEEK cage compared with β-tricalcium phosphate, hydroxyapatite and a PEEK cage (VAS: MD −4.2 vs. MD −3.6, p=0.27).141

There were no differences in neck or arm pain with other comparisons.

Table 4. Neck pain with ACDF using various osteogenic materials.

Table 5. Arm pain with ACDF using various osteogenic materials.

3.10.3.3.3. Function
3.10.3.3.3.1. Neurologic Function

There was inadequate evidence to determine the comparative benefits and harms of autograft, allograft, or other osteogenic material versus any other osteogenic material on neurologic function (SOE: Insufficient).

Four RCTs (N=436) reported changes in neurological status after ACDF (Table 6). One trial (N=100) found no differences between use of biphasic calcium phosphate ceramic plus a PEEK cage compared with iliac crest bone graft plus a PEEK cage on JOA score or JOA recovery rate at 6 months post-ACDF.139 One trial (N=66) reported no difference between calcium sulphate plus demineralized bone matrix plus a PEEK cage versus autogenous iliac cancellous bone plus a PEEK cage in JOA scores at 24 months.138 One trial (N=26) reported neurologic success (i.e., maintenance or improvement in sensory and motor function) in all remaining participants at 24 months,140 while another trial (N=244) reported that almost all participants (94.87% vs. 93.70%) experienced neurologic success, also at 24 months.137

Table 6. Neurologic function with ACDF using various osteogenic materials.

3.10.3.3.3.2. General Function

There was inadequate evidence to determine the comparative benefits and harms of autograft, allograft, or other osteogenic material versus any other osteogenic material on general function (SOE: Insufficient).

Four RCTs (N=374) assessed post-ACDF neck disability with the NDI (Table 7). One RCT (N=244) found that treatment with i-Factor plus an allograft ring in ACDF resulted in slight, though not statistically significant, improvement on NDI endpoint scores at 24 months compared with local graft and allograft ring (22.33 vs. 25.66, p=0.5607).137 One small trial (N=26) reported moderately greater improvement on the NDI after 24 months with BMP-2 and allograft ring compared with iliac crest bone graft and allograft ring (52.7 vs. 36.9, p<0.03).140 Another small trial (N=27) reported moderately greater improvement on NDI scores after 12 months with local graft plus a titanium cage versus allograft plus titanium cage (MD 56.5 vs. MD 41.4, p<0.05).136 There was no difference in improvement in NDI scores with hydroxyapatite/demineralized bone matrix plus PEEK cage versus β-tricalcium phosphate/hydroxyapatite plus PEEK cage at 12 months.141

Three RCTs (N=357) assessed general function using the SF-36 or the 2-item SF-12 (Table 7). Two trials found no difference in function on the SF-36 after ACDF using an allograft ring with either i-Factor or local graft137 or using an allograft with either BMP-2 or an iliac crest bone graft.140 One small trial (N=27) reported moderately better function at 12 months using the 2-item SF-12 with local graft plus a titanium cage compared with the same cage and allograft infused with the participant’s blood (MD 48.7 vs. 65.9, p<0.05).136 However, care should be used in interpreting these results due to the small study sample size.

Table 7. General function with ACDF using various osteogenic materials.

3.10.3.3.4. Harms

There was low-strength evidence that the use of BMP-2 in cervical spine fusion is associated with increased complications compared to the use of no BMP-2 (SOE: Low), while evidence was inadequate to determine the comparative harms of other osteogenic materials (SOE: Insufficient).

Four RCTs (N=520) and 2 NRSI studies (N=944) reported harms with ACDF using various graft materials (Table 8). There were few differences between treatments reported in the RCTs in the likelihood of various harms. One trial (N=319) reported a moderately greater likelihood of experiencing a new radiculopathy with an allograft ring with local graft than with i-Factor (13.66% vs. 25.00%, p=0.0142) but there were no differences in new intractable neck pain or progression of neuropathy.137 One trial (N=100) reported a shorter hospital stay with a biphasic calcium phosphate ceramic combined with a PEEK cage compared with a PEEK cage with iliac crest bone graft.139 Reasons for the difference in hospital stay were not provided.

Two retrospective NRSI of BMP-2 compared with no BMP-2 in ACDF (N=944) reported a greater likelihood of heterotopic ossification (78.6% vs. 59.2%, p<0.001)142 and complications associated with neck swelling143 with the use of BMP-2 (Table 8). In one NRSI, participants were 10 times more likely to have a neck swelling complication if BMP-2 was used in anterior cervical fusion, even after controlling for potential confounding variables (e.g., age, gender, presence of myelopathy, levels fused, smoking).143

Table 8. Adverse events with ACDF using various graft materials.

3.11. Key Question 10: In patients with pseudarthrosis after prior anterior cervical fusion surgery, what are the comparative effectiveness and harms of posterior approaches compared to revision anterior arthrodesis?

No studies met eligibility criteria for Key Question 10.

3.12. Key Question 11: In patients with cervical spondylotic myelopathy, what is the prognostic utility of preoperative magnetic resonance imaging (MRI) findings for neurologic recovery after surgery?

3.12.1. Key Findings

  • There was low-strength evidence that multisegmental T2-weighted increased signal intensity (ISI) and sharp T2-weighted ISI on preoperative MRI were associated with poorer outcomes (SOE: Low).
  • There was low-strength evidence that increased signal intensity ratio (SIR) was associated with poorer neurologic recovery (SOE: Low).
  • Evidence for other MRI findings was inadequate (SOE: Insufficient).

3.12.2. Description of Included Studies

MRI of the cervical spine is a common imaging procedure performed prior to cervical spine surgery. To identify whether MRI findings can predict neurologic recovery after surgery, we identified one relevant systematic review144 (that included 22 studies) and 17 additional studies145–163 that were not included in the systematic review or were published subsequent to the review’s search dates that provided evidence for this question (Appendix C). Studies were conducted in the United States, China, Taiwan, United Kingdom, Spain, Italy, Greece, India, Korea, and Japan. Most studies were small, with sample sizes ranging from 19 to 861 (mean 162) participants. Mean age of participants ranged from 47 to 70 years (overall mean: 53.9 years), and the proportion of females ranged from 7 to 50 percent (mean 38%). The systematic review and 14 of the 17 primary studies were rated moderate risk of bias, with 3 studies rated high risk of bias (Appendix D). Evidence was insufficient for MRI findings other than ISI and SIR due to limited available data for other outcomes (Appendix G).

3.12.3. Detailed Analysis

3.12.3.1. Fusion

No studies reported fusion outcomes.

3.12.3.2. Pain

No studies reported pain outcomes.

3.12.3.3. Function

3.12.3.3.1. Systematic Review Evidence

A 2013 systematic review that assessed the prognostic utility of preoperative MRI for neurologic recovery after surgery included 22 studies (N=1,508).144 The included studies evaluated preoperative MRI in patients undergoing cervical disc surgery using a posterior approach (k=7), ACDF (k=5), mixed approaches (k=9), or an unspecified procedure (k=1) over followup ranging from 1.5 to 60.6 months (mean 27.8; standard deviation 4.6 months). The majority of patients in the included studies were male (mean proportion of females: 27.1%), and the mean age (from 20 studies reporting age) was 57.4 (standard deviation, 1.0) years. Heterogeneity of study designs, methods, and outcomes (JOA in 17 studies, Nurick grade in 5 studies, NDI in one study, and Neurosurgical Cervical Spine Score in one study) of the included studies precluded pooling of study findings, and the mixed results were reported narratively. Presence of multisegmental T2-weighted increased signal intensity (ISI) was associated with worse functional outcomes in five studies and not associated with outcomes in four other studies, and lack of T2-weighted ISI was associated with better outcomes in three studies; qualitative classification of T2-weighted ISI was associated with poorer functional status in six studies and not associated with functional outcomes in one study, and lack of T2-weighted ISI was associated with better outcomes in one study. Snake-eye appearance on axial T2-weighted MRI, ISI in gray and white matter, and increased SIR were associated with poorer surgical outcomes in one study each.

3.12.3.3.2. Primary Study Evidence

We identified four relevant studies (N=326) that were not included in the systematic review,156,157,159,160 as well as 13 studies (in 15 publications) that were published subsequent to the review search dates.145–155,158,161–163 Of these studies, two assessed presence of segmental abnormalities (endplate abnormalities, Modic changes, and Cobb angle/loss of lordosis),146,147,152 six assessed qualitative differences in ISI intensity,145,149–151,154,157 three assessed SIR,148,153,155 one evaluated presence or absence of signal changes,159 two evaluated diffusion tensor tractography grading,158,162 one (in 2 publications) evaluated diffusion-based spectrum imaging,161,162 one evaluated a radiomic-based extra tree model,163 one evaluated the size of the transverse area at the compression site,160 and one evaluated size, extent, and qualitative intensity.151 The study (N=55) that assessed the size of the transverse area reported significant associations with postoperative JOA scores (r=0.298) and with JOA recovery (r=0.295) (both p<0.05).160 The study (N=56) that evaluated size, extent, and intensity of ISI reported no association of size or extent of ISI with functional outcomes;151 one other study of qualitative imaging signal intensity also reported no association of intensity changes with recovery (mJOA score ≥16, RR 1.71; 95% CI 0.90 to 3.24),145 while four studies (N=714) did find qualitative intensity associated with reduced recovery ratio, lower likelihood of optimal surgical outcome, or change in JOA or NDI scores.149–151,154 One study (N=52) reported improved JOA recovery rate (54.3% vs. 27.3%) in patients without ISI compared to those with ISI.156 Another study (N=146) that assessed presence or absence of imaging signal changes reported that patients without imaging signal changes were more likely to have improvement in Nurick grade (OR 5.1; 95% CI, 1.87 to 25.1); however, there was no difference between patients without imaging signal changes and those with only T2-weighted signal changes.159 Another study (N=73) found that the combination of T1-weighted hypointensity and T2-weighted hyperintensity was associated with poorer JOA recovery than T2-weighted hyperintensity alone or no ISI changes (JOA recovery 48% vs. 19% vs. 60.7%; T1- and T2-weighted ISI changes vs. T2-weighted ISI change only, p=0.0259).157 Two studies of SIR (N=220) reported increased T2-weighted SIR associated with JOA recovery;148,155 one study (N=148)153 reported no association between T2-weighted SIR and outcomes, while lower T1-weighted SIR was associated with poorer neurological outcomes assessed with the JOA. One study (N=129)158 found that diffusion tensor tractography grading using MRI images was associated with JOA score changes (r=−0.813, p<0.001) and JOA recovery (r=−0.429, p<0.001), while conventional MRI ISI grading was associated with JOA score changes (r=−0.674, p<0.001) but not with JOA recovery (r=−0.197, p=0.058). However, another study (N=42) comparing diffusion-based spectrum imaging to diffusion tensor grading reported that no diffusion tensor metrics were associated with neurological (mJOA) or general function (SF-36, NDI, and Myelopathy Disability Index) outcomes.162 The study found that preoperative diffusion-based spectrum imaging intra-axonal axial diffusivity and anisotropic fraction correlated with improved mJOA scores (r=0.37, p=0.02 and r=0.34, p=0.03, respectively).162 Another analysis of most of these same patients (N=50)161 compared diffusion-based spectrum imaging to clinical features and found greater prognostic utility with diffusion-based spectrum imaging (area under the curve [AUC] 75.3%) and the combination of diffusion-based spectrum imaging with clinical features (AUC 98.0%) than with assessment of clinical features alone (AUC 59.4%) for mJOA scores. The study reported similar findings for the prognostic utility of diffusion-based spectrum imaging (AUC 54.6%) or the combination of diffusion-based spectrum imaging and clinical features (AUC 65.3%) versus clinical features alone (AUC 48.8%) for NDI.

One study (N=151) evaluated a novel radiomic-based extra tree model of MRI data for predicting neurological outcomes following surgery for CSM.163 The study reported that their radiomic-based model (AUC 75%) and the combination of their radiomic-based model with clinical assessment (AUC 71%) were superior to radiological assessment (AUC 43%) and the combination of radiological and clinical assessment (AUC 40%) for predicting neurologic recovery assessed using mJOA.

One study (N=121) reported a novel classification system for loss of cervical lordosis following laminoplasty, in which loss of lordosis was predicted by an interplay of preoperative Cobb angle, T1 slope, and dynamic extension reserve.152 One study (N=861) reported Modic changes, defined as “subchondral vertebral bone marrow lesions of the endplate,” on preoperative MRI and found that while Modic changes were associated with greater postoperative disability, Modic changes were also associated with older age, greater number of levels fused, and a longer duration of symptoms.146

Comparing findings across studies was difficult due to the various study methods used (e.g., different types and bases of classification of T2-weighted ISI [single segment, multisegment, L2 classification, Q3 classification, SIR], different outcomes assessed [JOA, NDI, Nurick grade], and different methods to analyze the data [correlation, linear regression, multivariable regression, Student’s t test]). Preoperative MRI also preceded different types of surgery (e.g., ACDF, laminoplasty, posterior-anterior decompression), which reduces the generalizability of findings.

3.12.3.3.3. Synthesis of Systematic Review and Primary Study Findings

There was low-strength evidence that multisegmental T2-weighted increased signal intensity and sharp T2-weighted increased signal intensity on preoperative MRI were associated with poorer neurologic outcomes (SOE: Low); there was also low-strength evidence that increased SIR on preoperative MRI was associated with poorer neurologic recovery (SOE: Low).

In total, presence of ISI was associated with poorer neurologic outcomes (e.g., JOA recovery, Nurick grade, NDI) in 7 studies, absence of ISI was associated with better neurologic outcomes (e.g., JOA, Nurick grade) in 4 studies, and ISI was not associated with changes in neurologic outcomes in 5 studies. Qualitative grading (increased intensity) of ISI was associated with worse neurologic outcomes (e.g., JOA, NDI) in 11 studies, absence of T2-weighted intensity was associated with a better neurologic outcome (Nurick grade) in 1 study, and qualitative grading was not associated with neurologic outcomes in 3 studies. Higher SIR was associated with poorer recovery in 3 studies (AUCs ranged from 78.6% to 87.3% in the two studies that reported accuracy results); one study reported that lower T1-weighted SIR was associated with poorer neurological outcomes (e.g., JOA), while T2-weighted SIR was not associated with outcomes. One study reported that diffusion tensor tractography grading was more closely associated with neurological outcomes and recovery (e.g., JOA) than conventional ISI grading; however, another study found no association of diffusion tensor grading with neurological outcomes. One study of diffusion-based spectrum imaging found the imaging modality superior to diffusion tensor grading and assessment using clinical features. One study found a novel radiomic-based extra tree model to be superior to both radiological and clinical assessment (SOE for ISI and SIR: Low).

3.12.3.4. Quality of Life

No studies reported quality of life outcomes.

3.12.3.5. Harms

No studies reported harms or adverse events.

3.13. Key Question 12: What are the sensitivity and specificity of imaging assessment for identifying symptomatic pseudarthrosis after prior cervical fusion surgery?

3.13.1. Key Findings

  • There is low-strength evidence that postoperative ACDF dynamic radiographs can predict pseudarthrosis in a largely asymptomatic population (SOE: Low) and a largely symptomatic population (SOE: Low).
  • Evidence was inadequate for use of an angular method measurement in postoperative ACDF dynamic radiographs in predicting pseudarthrosis in an undefined population (SOE: Insufficient).

3.13.2. Description of Included Studies

Three nonrandomized studies (N=758)164–166 assessed diagnostic accuracy of radiographs in predicting pseudarthrosis after prior cervical fusion surgery (Appendix C). All studies were conducted in the United States. The mean age of participants was 52 years; the proportion of females ranged from 42 to 62 percent. No studies reported race or ethnicity. In all studies, enrolled patients had undergone ACDF as the index surgery, and revision surgery included anterior or posterior approaches.

Two studies were rated moderate risk of bias164,165 and one high risk of bias166 (Appendix D). Methodological limitations included lack of clarity on the number and characteristics of patients missing imaging studies, high attrition, and lack of clarity on reference standard accuracy and assessor blinding. No studies reported receiving funding. Evidence for a novel measurement method in predicting pseudarthrosis was rated insufficient due to the small sample size, study quality, and reference standard (Appendix G).

3.13.3. Detailed Analysis

There is low-strength evidence that postoperative ACDF dynamic radiographs can predict pseudarthrosis in a largely asymptomatic and a largely symptomatic population (SOE: Low), while evidence was inadequate to determine the comparative accuracy of using angular versus linear measurement methods in postoperative dynamic radiographs for predicting pseudarthrosis (SOE: Insufficient).

One study (N=125) reported diagnostic accuracy of dynamic radiographs and computed tomography (CT) scans for identifying pseudarthrosis in patients who had undergone revision surgery for pseudarthrosis or adjacent segment pathology, using surgical exploration of fusion as the reference standard.165 Medical records were retrospectively reviewed for patients operated on from January 2004 through December 2011. There were 262 levels evaluated (109 fused and 153 with pseudarthrosis). Most patients (84%) had revision surgery due to suspected pseudarthrosis, although it is unclear if patients were symptomatic. In dynamic radiographs magnified 150 percent, the optimal cutoff in interspinous motion to predict pseudarthrosis was 0.9 mm (AUC 0.899). Using cutoff criteria of interspinous motion ≥1 mm and superadjacent interspinous motion ≥4 mm resulted in similar values for diagnostic accuracy in dynamic radiographs versus a CT scan: sensitivity (86.3% vs. 87.2%), specificity (96.1% vs. 97.4%), positive predictive value (96.9% vs. 97.9%) and negative predictive value (83.4% vs. 84.4%) (SOE: Low).

One study (N=597, levels=1,203) assessed diagnostic accuracy of dynamic radiographs for predicting symptomatic pseudarthrosis in patients who were largely asymptomatic but required revision surgery.164 Medical records from 2010 to 2019 were reviewed for eligible patients. The reference standard was intraoperative documentation of pseudarthrosis (36% of the patient sample); only 4.9 percent of patients required pseudarthrosis revision.164 Pseudarthrosis rates increased as the number of operative levels increased from 22.2 percent with 1-level to 75 percent with 4-level surgery. In radiographs taken 1 year post-primary surgery, using an optimal cutoff of 1 mm interspinous motion (AUC 0.868) had high negative predictive value (99.6%) and sensitivity (89.7%); moderate specificity (81%); and low positive predictive value (13.7%) in identifying patients requiring revision surgery due to pseudarthrosis. Adding superadjacent interspinous motion ≥4 mm to 1 mm interspinous motion to the model, versus 1 mm alone,165 reduced the number of patients and levels included in the authors’ analysis but resulted in similar AUC. The positive predictive value was also decreased without improving the negative predictive value (SOE: Low).
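The contrast in positive predictive value between the two dynamic radiograph studies can be illustrated with Bayes' rule, as a minimal sketch rather than a reanalysis. The function name `predictive_values` is hypothetical. For the largely symptomatic cohort, prevalence is taken from the 153 of 262 levels with pseudarthrosis. For the largely asymptomatic cohort, the 4.9 percent of patients requiring pseudarthrosis revision is assumed as the prevalence; because the study's unit of analysis is not fully specified here, the second calculation approximates the reported values but is not expected to reproduce them exactly.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Largely symptomatic cohort: 153 of 262 evaluated levels had pseudarthrosis.
ppv_high, npv_high = predictive_values(0.863, 0.961, 153 / 262)

# Largely asymptomatic cohort.
# ASSUMPTION: prevalence = 4.9% (patients requiring pseudarthrosis revision).
ppv_low, npv_low = predictive_values(0.897, 0.81, 0.049)

print(f"High prevalence: PPV {ppv_high:.1%}, NPV {npv_high:.1%}")
print(f"Low prevalence:  PPV {ppv_low:.1%}, NPV {npv_low:.1%}")
```

Under these assumptions, the high-prevalence calculation lands close to the reported positive and negative predictive values (96.9% and 83.4%), while the low-prevalence calculation gives a positive predictive value near 20 percent and a negative predictive value above 99 percent. Sensitivity is similar in both cohorts and specificity is somewhat lower in the asymptomatic cohort, but most of the drop in positive predictive value comes from the lower prevalence.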

One study rated high risk of bias (N=143 enrolled; 36 analyzed) validated an angular measurement method for predicting pseudarthrosis in patients with 10 months’ minimum postoperative radiographic followup.166 Medical records were retrospectively reviewed for eligible patients (years not reported); 1-year postoperative CTs (n=36) were used as the reference standard. Authors did not report whether patients were symptomatic or asymptomatic at the time of imaging. In dynamic radiographs at 150 percent magnification, the angle measurement method was calculated as the difference in angles between lines from specific landmarks in the spinous processes, while the standard linear method calculated differences in interspinous process distance between flexion and extension radiographs. Using 1 mm linear measurement cutoffs as reported in prior studies, suspected pseudarthrosis rates were lower using angular versus linear methods (N=143; 18.5% [45/242 levels] vs. 28% [68/242 levels], p=not reported).166 In 1-year validation CTs (n=36; 66 levels), pseudarthrosis was identified in 13 patients (13 levels), of whom 5 underwent revision surgery; use of the angle method resulted in similar sensitivity (85%) but higher specificity (96%) versus the linear method (85% and 87%, respectively) (SOE: Insufficient).166
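To show how few levels underlie the validation estimates, the sketch below back-calculates one set of counts consistent with the reported percentages. Only the totals (66 levels, 13 with pseudarthrosis on CT) come from the text; the individual true-positive, false-negative, true-negative, and false-positive counts are assumptions chosen to match the rounded percentages and are not reported in the source. The function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


TOTAL_LEVELS = 66            # from the text
PSEUDARTHROSIS_LEVELS = 13   # from the text (CT reference standard)
FUSED_LEVELS = TOTAL_LEVELS - PSEUDARTHROSIS_LEVELS  # 53

# ASSUMED counts, back-calculated from the rounded percentages.
angle_sens, angle_spec = sensitivity_specificity(tp=11, fn=2, tn=51, fp=2)
linear_sens, linear_spec = sensitivity_specificity(tp=11, fn=2, tn=46, fp=7)

print(f"Angle method:  sensitivity {angle_sens:.0%}, specificity {angle_spec:.0%}")
print(f"Linear method: sensitivity {linear_sens:.0%}, specificity {linear_spec:.0%}")
```

With only 13 pseudarthrosis levels, each misclassified level shifts sensitivity by about 8 percentage points, which is one way to see why the small sample size contributed to the insufficient rating.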

3.14. Key Question 13: In patients with cervical spondylotic myelopathy, what are the comparative effectiveness and harms of intraoperative neuromonitoring (e.g., with somatosensory or motor evoked potential measurements) versus no neuromonitoring on clinical outcomes in patients undergoing surgery?

3.14.1. Key Findings

  • There was low-strength evidence of a similar likelihood of neurological complications with or without the use of intraoperative neuromonitoring (IONM) in ACDF for cervical myelopathy and radiculopathy (SOE: Low). This evidence only applies to patients undergoing ACDF and only one study reported the proportion of patients with myelopathy.

3.14.2. Description of Included Studies

Two retrospective NRSIs used large US claims databases, the National Inpatient Sample (NIS) of the Healthcare Cost and Utilization Project from 2009 to 2013 (N=141,007)167 and PearlDiver from 2007 to 2014 (N=15,395),168 to examine the effects of IONM versus no IONM in patients undergoing ACDF.

In the NIS study, 1:1 propensity score matching was used (N=18,760), controlling for age, sex, indication, number of levels fused, Charlson Comorbidity Index (CCI), and admission type (elective, nonelective).167 There was no adjustment for confounders in the PearlDiver study.168 The NIS data included inpatient data with no outpatient followup; the PearlDiver data included followup out to 30 days postoperatively. All data were collected from claims in the United States.

The mean age of participants was 54 years in the NIS study and reported by categories in the PearlDiver study (<45 years, 45-54, 55-64, 65-74, and >75; with the largest number of patients in the 45-54 age category). The average proportion of females was 51 and 52 percent, respectively. The NIS study enrolled a majority of White participants (80%), while the PearlDiver study did not report race/ethnicity (Appendix C).

Of patients with degenerative disease in the entire NIS, 42 percent of participants had radiculopathy alone and 31 percent had myelopathy (these proportions were not reported in the propensity score-matched NIS). Additionally, 66 percent of participants in the NIS study had a CCI of 0 (3.4% with a CCI of 3 or higher) and 84 percent had 1-2 level fusion, whereas the PearlDiver study did not report proportions with baseline radiculopathy, myelopathy, comorbidities, or levels fused.

The NIS study was rated moderate risk of bias due to study design.167 The PearlDiver study was rated high risk of bias due to study design and lack of adjustment for potential confounders168 (Appendix D). Concerns with these studies include the use of International Classification of Diseases codes to determine utilization, reliance on data from paid or adjusted claims rather than all claims, and changes in medical coverage policies.

3.14.3. Detailed Analysis

3.14.3.1. Outcomes

No studies reported fusion outcomes, pain, function, or quality of life.

3.14.3.2. Harms

There was low-strength evidence of a similar likelihood of neurological complications with or without the use of intraoperative neuromonitoring in ACDF (SOE: Low).

The NIS study included 18,760 patients who underwent ACDF in the propensity score-matched analyses from 2009 to 2013 and found no differences between IONM and no IONM in the rate of neurological complications (0.22% vs. 0.17%, p=0.41) or in the proportion of patients who required a hospital stay greater than 2 days (17.8% vs. 18.6%, p=0.15).167
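As a rough plausibility check on the neurological complication comparison, the sketch below runs a pooled two-proportion z-test. The source does not state which test the study used, and the event counts are not reported. The per-arm size (9,380, half of the 1:1 matched sample) and the event counts (21 and 16) are assumptions back-calculated from the reported percentages, and `two_proportion_z_test` is a hypothetical helper.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: 1:1 matching gives equal arms of 18,760 / 2 patients.
N_PER_ARM = 18760 // 2
# ASSUMPTION: event counts inferred from 0.22% and 0.17% of 9,380.
z, p_value = two_proportion_z_test(21, N_PER_ARM, 16, N_PER_ARM)

print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

Under these assumed counts the test gives a p-value of roughly 0.41, in line with the reported value. The absolute difference amounts to about five events among more than 18,000 patients.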

The PearlDiver database study included 15,395 patients who underwent ACDF from 2007 to 2014 for degenerative radiculopathy or myelopathy (IONM was used for 17.1% of patients, N=2627).168 Although there was no propensity score matching or adjustments made for confounding variables, the results were similar to the NIS study. There was no difference in rate of neurologic complication within 30 days of the index procedure between IONM and no IONM (0.23% vs. 0.27%, p=0.84). However, younger patients were more likely to receive IONM (20.3% in patients less than 45 years of age compared to 13.6% in patients >75 years).

3.15. Contextual Question 1: What is the prevalence of cervical degenerative disease with spinal cord compression in asymptomatic patients?

Not all individuals with CDD that includes spinal cord compression experience pain, radiculopathy, myelopathy, or other symptoms. A 2021 systematic review and meta-analysis, rated moderate risk of bias, included 11 studies (N=3,686) that reported cervical MRI results in healthy individuals.169 In pooled analysis, the prevalence of asymptomatic spinal cord compression was 24.2 percent (range 5.3% to 59%; 95% CI 12.4% to 36%, I2=88%).

To help explain the high statistical heterogeneity in pooled analysis, studies of asymptomatic participants were stratified based on mean age (less than or equal to 60 years versus greater than 60 years). The prevalence of spinal cord compression was lower in the younger subgroup (7 studies, N=1841, prevalence 7.4%, 95% CI 2.8% to 12%, I2=40%) than in the older subgroup (4 studies, N=1845, prevalence 35.3%, 95% CI 14.1% to 56.5%, I2=94%). Studies were also stratified based on study location: America/Europe (6 studies, N=390, prevalence of spinal cord compression 39.7%, 95% CI 21.0% to 58.3%, I2=64%) versus Asia (5 studies, N=3296, prevalence of spinal cord compression 11.1%, 95% CI 1.6% to 20.5%, I2=83%). The study with the largest number of participants (N=1211) was conducted in Japan, enrolled younger participants (mean age 50 years), and reported the lowest prevalence of spinal cord compression (5.3%).170 In this study, spinal cord compression was considered present when “the AP (anteroposterior) diameter of the spinal canal at its narrowest was less than or equal to the AP diameter of the spinal cord at the C5 vertebral level.”170 In contrast, the study with the highest prevalence of spinal cord compression (59%, N=183) enrolled older participants (mean 66 years) and was conducted in the Czech Republic.171 The definition of spinal cord compression in this study was more liberal: compression was diagnosed on the basis of “a change in spinal cord contour at the level of an intervertebral disc on axial or sagittal MRI compared with that at the midpoint level of neighboring vertebrae.”171 In both studies, as expected, the prevalence of spinal cord compression increased with age.
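As a reading aid, the short Python sketch below checks that the two stratifications described above partition the same pooled sample. It uses only the study and participant counts reported in the text; the variable and function names are illustrative and are not part of the review or of the cited meta-analysis.

```python
# Totals reported for the pooled analysis (11 studies, N=3,686).
TOTAL_STUDIES, TOTAL_N = 11, 3686

# (number of studies, number of participants) per subgroup, as reported.
strata = {
    "mean age": {"<=60 years": (7, 1841), ">60 years": (4, 1845)},
    "location": {"America/Europe": (6, 390), "Asia": (5, 3296)},
}


def stratum_totals(groups):
    """Return (total studies, total participants) across the subgroups."""
    studies = sum(k for k, _ in groups.values())
    participants = sum(n for _, n in groups.values())
    return studies, participants


for name, groups in strata.items():
    studies, participants = stratum_totals(groups)
    consistent = (studies, participants) == (TOTAL_STUDIES, TOTAL_N)
    print(f"{name}: {studies} studies, N={participants}, consistent={consistent}")
```

Both stratifications account for all 11 studies and all 3,686 participants, so the subgroup estimates describe the same pooled sample divided in two different ways.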

3.16. Contextual Question 2: What is the natural history of untreated spinal cord compression in patients with cervical degenerative disease?

The natural history of degeneration of the cervical spine progressing to nonmyelopathic spinal cord compression (NMSCC) and ultimately CSM is a continuum of disease that remains poorly understood. Untreated spinal cord compression is most studied in the context of CSM. There is a subset of patients with spinal cord compression found on imaging who are asymptomatic. A recent systematic review by Nouri et al (2022)172 found the prevalence of asymptomatic spinal cord compression in healthy volunteers to be 24.2 percent (range 5.3% to 59%). A small series by Martin et al (2018)173 looking at 20 asymptomatic patients with MRI evidence of spinal cord compression revealed that 2 (10%) developed symptoms of myelopathy at a median followup of 21 months. The largest prospective study evaluating the transition from NMSCC to CSM by Bednarik et al (2008) revealed that among 199 patients enrolled with NMSCC, 8 percent developed CSM at 1-year followup and 22.6 percent of patients developed CSM at median followup of 44 months (range 1-12 years).174 Factors found to independently predict the development of myelopathy in a multivariate analysis included presence of radiculopathy, spinal cord cross-sectional area and compression ratio.175

CSM is the leading cause of spinal cord dysfunction among adults worldwide.176 The pathogenesis of CSM is due to both mechanical and neuropathic changes to the spinal cord and blood-spinal cord barrier generated by compression on the spinal cord.177-180 The compressed cervical spinal cord is subjected to chronic hypoxic conditions due to dysfunction of endothelial cells as well as flattening and consequent loss of surrounding vessels.178

While the natural history of CSM in patients varies greatly, it is generally thought of as a progressive disorder. This was confirmed in a recent systematic review181 that found moderate evidence from small prospective and retrospective studies that the proportion of patients who deteriorate by at least 1 point on the JOA scale ranged from 20 to 60 percent. It is important to point out that these studies did not consider the minimal detectable difference to define deterioration, which is >1 point based on reliability studies.182,183 The overall lack of large, well-designed, controlled studies evaluating the natural history of untreated spinal cord compression in patients with CDDs impairs clinicians’ ability to counsel patients. A recent clinical practice guideline provided by AO Spine suggested that either surgery or clinical observation is a reasonable initial treatment option in mild CSM (e.g., mJOA score greater than or equal to 15).184,185

Shimomura et al186 evaluated prognostic factors for deterioration of patients with CSM treated nonoperatively. Their prospective study included 56 patients with mild CSM, of whom 11 (20%) had clinical deterioration over a mean followup period of 35.6 months. Age, gender, followup period, developmental or dynamic canal factors (e.g., canal size of <12 mm) of the cervical spine on plain lateral radiographs, presence of high intensity of the cord on T2-weighted MRI, and circumferential spinal cord compression on axial MRI were all evaluated as possible predictors for progression of myelopathy. However, the only predictive factor they found was the presence of circumferential spinal cord compression on axial MRI (adjusted OR 26.6, 95% CI 1.7 to 421.5).186 More studies are needed to better define the natural history of untreated spinal cord compression in the setting of degenerative changes, along with predictors of progression.
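The very wide confidence interval around the adjusted odds ratio is easier to appreciate on the log scale. The sketch below is an illustration only: it assumes a conventional Wald-type 95% interval that is symmetric on the log-odds scale (the source does not state how the interval was derived), and the function name is hypothetical. It backs out the implied midpoint and the standard error of the log odds ratio from the reported limits.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Back-calculate the log-scale midpoint and standard error from CI limits.

    Assumes the interval is exp(log(OR) +/- z * SE), i.e. symmetric on the
    log scale. This is an assumption, not something reported by the study.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    midpoint_or = math.exp((log_lower + log_upper) / 2)
    se_log_or = (log_upper - log_lower) / (2 * z)
    return midpoint_or, se_log_or


# Reported limits for circumferential compression: 95% CI 1.7 to 421.5.
midpoint, se = log_scale_summary(1.7, 421.5)
print(f"geometric midpoint of the CI: {midpoint:.1f}")
print(f"implied SE of log(OR): {se:.2f}")
print(f"upper/lower limit ratio: {421.5 / 1.7:.0f}")
```

Under that assumption the geometric midpoint falls close to the reported point estimate of 26.6, and the upper limit is roughly 250 times the lower limit. This is consistent with an estimate based on only 11 patients who deteriorated, so the size of the association is very imprecise even though the interval excludes 1.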
