Session Information
Date: Sunday, October 26, 2025
Title: Abstracts: Systemic Sclerosis & Related Disorders – Clinical I (0843–0848)
Session Type: Abstract Session
Session Time: 3:15PM-3:30PM
Background/Purpose: As treatment options for diffuse cutaneous systemic sclerosis (dcSSc) expand, the need for data-driven, efficient approaches to therapeutic switching is becoming more urgent. Additionally, clinical trials may choose to add early exit strategies for patients highly unlikely to improve so that other treatment options can be explored. The purpose of this study is to develop a model for predicting 52-week cutaneous improvement in dcSSc using clinical data collected at baseline, 14, and 26 weeks.
Methods: Clinical data from 361 individuals enrolled in RESOLVE-1 (Lenabasum Phase III trial) were analyzed. Cutaneous improvement was defined as a ≥5-point decrease in modified Rodnan skin score (mRSS) from baseline to 52 weeks. Characteristics were compared between improvers and non-improvers using Fisher’s exact or Wilcoxon rank-sum tests, as appropriate. A supervised gradient boosting machine learning model was developed to predict improvement. Forty-three candidate predictors were selected by expert opinion and included baseline, week-14, and week-26 data. Model discrimination for classifying improvement status was assessed using area under the receiver operating characteristic curve (AUROC) with five-fold cross-validation. To enhance interpretability, the Shapley Additive exPlanations (SHAP) method was used to visualize the magnitude and direction of each feature’s impact on predictions. A reliability plot was generated to compare predictions to observed outcomes. Observed 52-week improvement was plotted against mRSS change from baseline to week 26.
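The outcome definition and the discrimination metric described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names `improved` and `auroc` and the patient values are hypothetical, and no model is fitted here. In the study, AUROC was computed on gradient boosting predictions under five-fold cross-validation, whereas this sketch only shows how a ≥5-point mRSS decrease is labeled and how AUROC can be computed from any set of predicted probabilities.

```python
def improved(baseline_mrss, week52_mrss, min_drop=5):
    """True if mRSS fell by at least min_drop points from baseline to week 52."""
    return (baseline_mrss - week52_mrss) >= min_drop


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen improver receives a
    higher score than a randomly chosen non-improver (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one improver and one non-improver")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (baseline mRSS, week-52 mRSS, predicted probability) triples;
# these values are invented for illustration and are not trial data.
patients = [(30, 22, 0.90), (24, 23, 0.20), (28, 20, 0.70),
            (18, 18, 0.40), (26, 21, 0.35)]
labels = [improved(b, w) for b, w, _ in patients]
scores = [p for _, _, p in patients]
print(labels, round(auroc(labels, scores), 3))
```

The pairwise formulation is quadratic in the number of patients, which is acceptable for a cohort of a few hundred.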
Results: Of 361 patients with dcSSc, 235 (65%) improved. Immunosuppression (any) and mycophenolate mofetil use were more common among improvers vs. non-improvers (Table 1). By week 26, 52-week improvers had lower physician global (p<0.001), 5D-Itch (p=0.012), and Scleroderma Skin Patient Reported Outcome (total score) (p=0.028) vs. non-improvers. At both week 14 and week 26, 52-week improvers had significantly lower mRSS vs. non-improvers (17 vs. 18, p=0.025; 15 vs. 18, p<0.001, respectively). The model accurately predicted 52-week cutaneous improvement with a cross-validated AUROC of 0.78 (Fig. 1A) and good reliability (Fig. 1D). The top 10 predictive features by mean SHAP value are shown in Fig. 1B, with mRSS at week 26 and baseline as the two most important features. Higher week-26 mRSS contributed to lower improvement prediction, whereas higher baseline mRSS contributed to higher improvement prediction (Fig. 1C). Predicted probabilities for improvers and non-improvers are shown in Fig. 1E. Applying a probability threshold of 0.25 identified improvers with a sensitivity of 97% (i.e., only 3% of improvers fell below the 0.25 threshold at 26 weeks) and a negative predictive value of 0.86 (i.e., of those predicted to be non-improvers, 86% did not improve). Improvement at 52 weeks was uncommon among those who had not experienced any mRSS decline by week 26 (Fig. 1F).
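The sensitivity and negative predictive value reported at the 0.25 threshold follow from a standard confusion-matrix calculation, sketched below. The function name `threshold_metrics` and the example labels and probabilities are hypothetical and do not reproduce the trial results. Treating a probability at or above the threshold as a predicted improver is an assumption, since the abstract only describes improvers falling "below" the threshold.

```python
def threshold_metrics(labels, probs, threshold=0.25):
    """Return (sensitivity, NPV) when probabilities >= threshold are
    classified as predicted improvers (an assumed convention)."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_improver = p >= threshold
        if y and predicted_improver:
            tp += 1  # improver correctly kept above the threshold
        elif y:
            fn += 1  # improver missed (fell below the threshold)
        elif predicted_improver:
            fp += 1  # non-improver predicted to improve
        else:
            tn += 1  # non-improver correctly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return sensitivity, npv


# Invented example: 1 = improved at week 52, 0 = did not improve.
example_labels = [1, 1, 1, 1, 0, 0, 0]
example_probs = [0.90, 0.60, 0.30, 0.20, 0.10, 0.15, 0.50]
print(threshold_metrics(example_labels, example_probs))
```

A low threshold such as 0.25 trades specificity for sensitivity, which suits an early-exit rule meant to flag only patients who are highly unlikely to improve.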
Conclusion: Improvement in mRSS at 52 weeks can be reliably predicted using baseline and early follow-up data, identifying those unlikely to improve on current therapy. This model may inform personalized clinical decision making as well as adaptive trial designs with early exit strategies for those highly unlikely to improve.
Figure 1. Baseline and early response data to predict 52-week clinical improvement among individuals with diffuse cutaneous systemic sclerosis: Performance and interpretability of gradient boosting model. A. Receiver Operating Characteristic Curve for machine learning model to predict 52-week clinical improvement (at least 5-point modified Rodnan skin score (mRSS) improvement at week 52) with cross-validated area under the curve (AUC) of 0.78. B. Top 10 features ranked by mean SHAP importance. C. SHAP summary plot illustrating individual feature effects on model predictions. Each dot represents a single patient; red represents high values; blue represents low values. Positive SHAP values contribute to higher probability of clinical improvement, while negative SHAP values contribute to lower probability. D. Reliability plot indicating good agreement between predicted and observed outcomes. E. Histogram of predicted probabilities for 52-week improvers and non-improvers with threshold of 0.25 shown. F. Frequency of 52-week improvement according to baseline to 26-week mRSS change.
To cite this abstract in AMA style:
Lakin K, Spivack J, Gordon J, Orange D, Spiera R. Machine Learning Model Incorporating Baseline and Early Follow-up Clinical Data Predicts 52-Week Cutaneous Outcomes in Systemic Sclerosis [abstract]. Arthritis Rheumatol. 2025; 77 (suppl 9). https://acrabstracts.org/abstract/machine-learning-model-incorporating-baseline-and-early-follow-up-clinical-data-predicts-52-week-cutaneous-outcomes-in-systemic-sclerosis/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/machine-learning-model-incorporating-baseline-and-early-follow-up-clinical-data-predicts-52-week-cutaneous-outcomes-in-systemic-sclerosis/