Session Information
Date: Monday, November 9, 2020
Title: RA – Diagnosis, Manifestations, & Outcomes Poster IV: Lifespan of a Disease
Session Type: Poster Session D
Session Time: 9:00AM-11:00AM
Background/Purpose: Flare, defined as relapse from the treat-to-target state (T2T, DAS28 ≤ 3.2), is hard to predict. We sought to make it predictable by applying machine learning to a database from the Smart System of Disease Management (SSDM), an interactive mobile disease management app. The aim of this study is to develop and validate machine learning algorithms for flare prediction in RA.
Methods: Patients were trained to use SSDM and entered their own data, including demographics, comorbidities (COMBs), lab tests, medications, and monthly self-assessments: DAS28, HAQ, SF-36, and the Hospital Anxiety and Depression Scale (HADS). The data were uploaded to the cloud and synchronized to the mobile devices of authorized rheumatologists. COMBs were coded by ICD-9, and medications were categorized as cDMARDs, Bio (BioDMARDs), NSAIDs, Steroid, FS (food supplements), MC (medicine for COMBs), TCM (Traditional Chinese Medicine), and combinations of these.
Results: From January 2015 to January 2020, 8811 RA patients (85% female, 15% male) had reached T2T. Of these, 4556 remained flare-free and 4255 experienced at least one flare. An average of 160 attributes were extracted per patient: for flare-free patients at the time of reaching T2T, and for flare patients at 3 months before the flare. Patients were randomly assigned to a model setup (training) group (70%) or a validation (testing) group (30%).
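The random 70/30 assignment can be illustrated with a minimal Python sketch. The abstract does not describe how the split was actually performed, so the function name `split_patients`, the seed, and the use of integer IDs as stand-ins for patient records are all assumptions made for illustration.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split IDs into training and testing groups.

    Hypothetical helper: the abstract states only that patients were
    randomly assigned 70%/30%, not how.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Integer IDs stand in for the 8811 patients in the cohort.
train_ids, test_ids = split_patients(range(8811))
print(len(train_ids), len(test_ids))  # 6168 2643
```

Under this sketch the testing group holds 2643 patients, which equals the sum of the four confusion-matrix counts reported for the validation group.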
For training, data were processed using Python, with statistical analyses in R. Random forests were implemented in R, and logistic regression was fitted via glm in base R. The random forest comprises a set of decision trees; “splits” in the decision trees reflect binary (i.e., yes/no) decisions with respect to the attributes. Bootstrapping was used to assess, quantify, and adjust for model optimism. Model performance was evaluated using AUC, precision, and recall. Brier scores, which measure the accuracy of probabilistic predictions, range from 0 to 1 (0 indicates perfectly accurate predictions).
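As a reminder of what the Brier score measures, the following Python sketch computes it as the mean squared difference between predicted probabilities and observed binary outcomes. The function name `brier_score` and the toy data are illustrative assumptions; they are not taken from the study, whose analyses were run in R.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Toy example (made-up values): 1 = flare, 0 = flare-free.
outcomes = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.4]
toy_score = brier_score(outcomes, predicted)

# A model that always predicts 0.5 scores 0.25 regardless of the outcomes.
uninformative_score = brier_score(outcomes, [0.5] * len(outcomes))
```

Lower is better: a score of 0 means every predicted probability matched the outcome exactly, while always predicting 0.5 gives 0.25.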
In the testing group, model performance for the prediction window was 0.78 for AUC (95% CI), 0.71 for recall (sensitivity), 0.68 for precision, and 0.195 for Brier score (true positives 893, false positives 417, false negatives 367, true negatives 966).
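The reported precision and recall can be checked against the confusion-matrix counts with a short Python sketch. The counts come from the abstract; the helper function names are illustrative.

```python
# Confusion-matrix counts reported for the validation (testing) group.
TP, FP, FN, TN = 893, 417, 367, 966

def precision(tp, fp):
    """Fraction of predicted flares that were true flares."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of true flares that the model predicted (sensitivity)."""
    return tp / (tp + fn)

total_patients = TP + FP + FN + TN
print(total_patients)                # 2643
print(round(precision(TP, FP), 2))   # 0.68
print(round(recall(TP, FN), 2))      # 0.71
```

Both values agree with the figures reported in the abstract after rounding to two decimals.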
Based on feature weighting in the random forest, the top 10 pro-flare attributes were CRP, swollen joint count (SJC), tender joint count (TJC), HAQ, DAS28, morning stiffness, gout, MCTD, OA, and duration; the top 10 anti-flare attributes were cDMARDs+Bio, cDMARDs+steroid+NSAIDs, stable HAQ, stable morning stiffness, stable SJC, medicine for COMBs, cDMARDs+TCM, stable TJC, stable ESR, and income of 100-200k (Fig. 1). The top-weighted pro-flare COMBs were gout (0.81), MRD (0.75), OA (0.56), and AS (0.48). Monotherapy with Bio, NSAIDs, steroid, or TCM was pro-flare, while monotherapy with cDMARDs was anti-flare (-0.21).
Conclusion: The attempt to develop a machine learning algorithm for RA flare prediction was successful, with acceptable discrimination. Both pro-flare and anti-flare attributes were identified, which may inform proactive intervention.
Figure 1: Coefficient of features influencing RA relapse.
To cite this abstract in AMA style:
Zhao Y, Mu R, Li X, Sun H, Mi C, Wang G, Xu S, Xu M, Chen H, Huang Q, Lei L, Shen H, Xiao H, Jia Y, Wu B, Chen X, Jia S, Xiao F. RA Flare Prediction via Machine Learning and Algorithm Based on SSDM Big Data [abstract]. Arthritis Rheumatol. 2020; 72 (suppl 10). https://acrabstracts.org/abstract/ra-flare-prediction-via-machine-learning-and-algorithm-based-on-ssdm-big-data/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/ra-flare-prediction-via-machine-learning-and-algorithm-based-on-ssdm-big-data/