Validation of extracorporeal membrane oxygenation mortality prediction and severity of illness scores in an international COVID-19 cohort.
Shah N., Xue B., Xu Z., Yang H., Marwali E., Dalton H., Payne PPR., Lu C., Said AS., ISARIC Clinical Characterisation Group.
BACKGROUND: Veno-venous extracorporeal membrane oxygenation (V-V ECMO) is a lifesaving support modality for severe respiratory failure, but its resource-intensive nature led to significant controversy surrounding its use during the COVID-19 pandemic. We report the performance of several ECMO mortality prediction and severity of illness scores at discriminating survival in a large COVID-19 V-V ECMO cohort. METHODS: We validated the ECMOnet, PRESET (PREdiction of Survival on ECMO Therapy-Score), Roch, SOFA (Sequential Organ Failure Assessment), APACHE II (Acute Physiology and Chronic Health Evaluation), 4C (Coronavirus Clinical Characterisation Consortium), and CURB-65 (Confusion, Urea nitrogen, Respiratory Rate, Blood Pressure, age >65 years) scores on the ISARIC (International Severe Acute Respiratory and emerging Infection Consortium) database. We report discrimination via the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC), and calibration via the Brier score. RESULTS: We included 1147 patients, and each score was calculated on the patients with sufficient variables. ECMO mortality scores had AUROC (0.58-0.62), AUPRC (0.62-0.74), and Brier score (0.286-0.303). The Roch score had the highest accuracy (AUROC 0.62) and precision (AUPRC 0.74), yet the worst calibration (Brier score of 0.3), despite being calculated on the fewest patients (144). Severity of illness scores had AUROC (0.52-0.57), AUPRC (0.59-0.64), and Brier score (0.265-0.471). APACHE II had the highest accuracy (AUROC 0.58), precision (AUPRC 0.64), and best calibration (Brier score 0.26). CONCLUSION: Within a large international multicenter COVID-19 cohort, the evaluated ECMO mortality prediction and severity of illness scores demonstrated inconsistent discrimination and calibration, highlighting the need for better clinically applicable decision support tools.
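
The three metrics named in METHODS can be illustrated with a minimal sketch in plain Python. This is not the authors' analysis code. The function names, the toy labels, and the toy probabilities are all hypothetical, and the sketch assumes each score has already been mapped to a predicted probability of the outcome coded as 1, a step the abstract does not describe. AUPRC is approximated here by step-wise average precision, which is one common estimator among several.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_score):
    """Probability that a random positive case outranks a random negative case.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision)."""
    pairs = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    n_pos = sum(y_true)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        # Consume every case that shares the current threshold (handles ties).
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical toy data for illustration only; NOT taken from the study.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

print(round(auroc(y_true, y_prob), 3))              # 0.778
print(round(average_precision(y_true, y_prob), 3))  # 0.806
print(round(brier_score(y_true, y_prob), 3))        # 0.2
```

As a point of reference when reading Brier scores, a model that always predicts a probability of 0.5 scores exactly 0.25 regardless of the outcomes, and lower values indicate better calibrated, more accurate probabilities.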