2022;25;593-602
A Machine Learning Approach to Identify Predictors of Severe COVID-19 Outcome in Patients With Rheumatoid Arthritis
Sara M. Burns, MS, Tiffany S. Woodworth, MPH, Zeynep Icten, PhD, Trenton Honda, PhD, and Justin Manjourides, PhD.
BACKGROUND: Rheumatoid arthritis (RA) patients have a lowered immune response to infection, potentially due to the use of corticosteroids and immunosuppressive drugs. Predictors of severe COVID-19 outcomes within the RA population have not yet been explored in a real-world setting.
OBJECTIVES: To identify the most influential predictors of severe COVID-19 within the RA population.
STUDY DESIGN: Retrospective cohort study.
SETTING: Research was conducted using Optum’s de-identified Clinformatics® Data Mart Database (2000-2021Q1), a US commercial claims database.
METHODS: We identified adult patients with index COVID-19 (ICD-10-CM diagnosis code U07.1) between March 1, 2020, and December 31, 2020. Patients were required to have continuous enrollment and have evidence of one inpatient or 2 outpatient diagnoses of RA in the 365 days prior to index. RA patients with COVID-19 were stratified by outcome (mild vs severe), with severe cases defined as having one of the following within 60 days of COVID-19 diagnosis: death, treatment in the intensive care unit (ICU), or mechanical ventilation. Baseline demographics and clinical characteristics were extracted during the 365 days prior to index COVID-19 diagnosis. To control for improving treatment options, the month of index date was included as a potential independent variable in all models. Data were partitioned (80% train and 20% test), and a variety of machine learning algorithms (logistic regression, random forest, support vector machine [SVM], and XGBoost) were constructed to predict severe COVID-19, with model covariates ranked according to importance.
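The outcome definition and data partition described in the methods can be illustrated with a minimal sketch. It is not the study's code: the toy cohort, the names `is_severe` and `split_80_20`, the inclusive 60-day boundary, and the simple random (unstratified) shuffle are all assumptions made for illustration.

```python
from datetime import date, timedelta
import random

# "within 60 days of COVID-19 diagnosis"; treating day 60 as inclusive is an assumption
SEVERITY_WINDOW = timedelta(days=60)


def is_severe(index_date, event_dates):
    """True if any death/ICU/ventilation date falls within 60 days of the index date."""
    return any(index_date <= d <= index_date + SEVERITY_WINDOW for d in event_dates)


def split_80_20(records, test_fraction=0.2, seed=0):
    """Shuffle records reproducibly and return (train, test); unstratified by assumption."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy cohort: (patient_id, index_date, dates of qualifying events)
cohort = [
    ("p1", date(2020, 3, 15), [date(2020, 4, 1)]),    # event 17 days after index
    ("p2", date(2020, 6, 10), []),                     # no qualifying event
    ("p3", date(2020, 9, 5), [date(2020, 12, 20)]),   # event 106 days after index
    ("p4", date(2020, 11, 2), [date(2020, 11, 2)]),   # event on the index date
    ("p5", date(2020, 12, 30), [date(2021, 2, 28)]),  # event exactly 60 days after index
]

labels = {pid: is_severe(idx, events) for pid, idx, events in cohort}
train, test = split_80_20(cohort)
print(labels)
print(len(train), "train /", len(test), "test")
```

In this sketch, patients p1, p4, and p5 are labeled severe and p2 and p3 mild; the abstract does not say whether the study's partition was stratified by outcome.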
RESULTS: Of 4,295 RA patients with COVID-19 included in the study, 990 (23.1%) were classified as severe. RA patients with severe COVID-19 had a higher mean age (mean [SD] = 71.6 [10.3] vs 63.4 [13.7] years, P < 0.001) and Charlson Comorbidity Index (CCI) (3.8 [2.4] vs 2.4 [1.8], P < 0.001) than those with mild cases. Males were more likely to be a severe case than mild (29.1% vs 18.5%, P < 0.001). The top 15 predictors from the best-performing model (XGBoost, AUC = 75.64) were identified. Female gender, commercial insurance, and physical therapy were inversely associated with severe COVID-19 outcomes. Top predictors of severe outcomes included a March index date, older age, more inpatient visits at baseline, baseline use of corticosteroids or gamma-aminobutyric acid (GABA) analogs, and the need for durable medical equipment (i.e., wheelchairs), as well as comorbidities such as congestive heart failure, hypertension, fluid and electrolyte disorders, lower respiratory disease, chronic pulmonary disease, and diabetes with complication.
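For readers less familiar with the reported metrics, the sketch below shows how an area under the ROC curve (AUC) can be computed from predicted risk scores, and checks the reported share of severe cases. The `auc` helper and the toy labels and scores are hypothetical, and reading "AUC = 75.64" as a percentage (about 0.76 on the usual 0 to 1 scale) is an assumption, not something the abstract states.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, with ties counted as one half (the rank-based definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = severe, 0 = mild, with model-predicted risk scores
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.3, 0.4, 0.5, 0.2, 0.7]
toy_auc = auc(y_true, y_score)
print(f"toy AUC = {toy_auc:.4f} ({100 * toy_auc:.2f} on a percentage scale)")

# Share of severe cases, using the counts reported in the results
severe_pct = round(100 * 990 / 4295, 1)
print(f"severe share = {severe_pct}%")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of severe from mild cases.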
LIMITATIONS: The cohort meeting our eligibility criteria is a relatively small sample in the context of machine learning. Additionally, diagnosis definitions rely solely on ICD-10-CM codes, and there may be unmeasured variables (such as labs and vitals) due to the nature of the data. These limitations were carefully considered when interpreting the results.
CONCLUSIONS: Predictive baseline comorbidities and risk factors can be leveraged for early detection of RA patients at risk of severe COVID-19 outcomes. Further research should be conducted on modifiable factors in the RA population, such as physical therapy.
KEY WORDS: COVID-19, rheumatoid arthritis, machine learning, real-world data, predictive modeling, RA, SARS-CoV-2, real-world evidence, physical therapy, corticosteroid use