Session Type: Poster Session C
Session Time: 9:00AM-11:00AM
Background/Purpose: Renal flares in patients with SLE result in significant nephron loss. Identifying reliable early signals of impending renal flares is therefore expected to improve the prognosis for these patients. Machine learning (ML) is increasingly used to handle large, heterogeneous datasets and non-linear relationships. In this study, we implemented two different ML approaches to identify baseline clinical and laboratory determinants of renal flare occurrence in a large cohort of patients with SLE.
Methods: We analysed data from five phase III trials (BLISS-52, BLISS-76, BLISS-NEA, BLISS-SC, EMBRACE) after excluding patients with a renal BILAG A or B score at baseline (N=3169). Renal flares were defined as a change from C, D, or E to A or B in the renal domain of the classic BILAG index within a 52-week follow-up. Following construction of panels of variables using either (i) clinical knowledge or (ii) feature selection methods, we developed ML classifiers including extreme gradient boosting (XGBoost), least absolute shrinkage and selection operator (LASSO), random forest (RF), and multivariable logistic regression. A stratified split was applied to partition the study population into a training set (90%; N=2853) and a test set (10%; N=316). The training set was used for model development, with internal validation performed by 10-times repeated 10-fold cross-validation. The test set was used for external validation of the built models, and model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy with a 95% confidence interval (CI), sensitivity, and specificity. Both approaches yielded final models that used the minimal number of features while maintaining optimal model performance.
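The partitioning described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say which software was used, whether the cross-validation folds were stratified, or how rounding was handled. The function names, the seed, and the synthetic labels are all hypothetical, and only the cohort counts (3169 patients, 899 with a flare) come from the abstract.

```python
import random

def stratified_split(labels, denominator=10, seed=0):
    """Hold out 1/denominator of each class (rounded down) as a test set."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = len(idx) // denominator
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def repeated_stratified_kfold(labels, indices, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, fit_idx, val_idx) for repeated k-fold CV.

    Stratifying the folds is an assumption of this sketch.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        by_class = {}
        for i in indices:
            by_class.setdefault(labels[i], []).append(i)
        folds = [[] for _ in range(k)]
        for idx in by_class.values():
            rng.shuffle(idx)
            # Deal each class round-robin so every fold keeps the class ratio.
            for pos, i in enumerate(idx):
                folds[pos % k].append(i)
        for f in range(k):
            fit_idx = [i for g in range(k) if g != f for i in folds[g]]
            yield r, f, fit_idx, folds[f]

# Synthetic labels: 1 = renal flare, 0 = no flare (counts from the abstract).
labels = [1] * 899 + [0] * (3169 - 899)
train_idx, test_idx = stratified_split(labels)
splits = list(repeated_stratified_kfold(labels, train_idx))
print(len(train_idx), len(test_idx), len(splits))
```

With these assumptions the split reproduces the reported set sizes (2853 and 316), and the cross-validation generator yields 100 fit/validation partitions of the training set, one per fold per repeat.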
Results: Of 3169 patients, 899 (28.3%) developed a renal flare during follow-up. XGBoost yielded the greatest accuracy in both the hypothesis-driven (0.97) and the data-driven approach (0.88), as well as the highest performance metrics (AUC: 0.97 and 0.91; sensitivity: 1.00 and 0.82; specificity: 0.94 and 0.94, respectively) and an adequate calibration on the test dataset. LASSO (accuracy: 0.95 and 0.86; AUC: 0.97 and 0.96; sensitivity: 1.00 and 0.86; specificity: 0.91 and 0.86, respectively) demonstrated similar performance. The final model successfully reduced the number of features to five parameters: renal BILAG C or D score, urine protein-to-creatinine ratio, serum albumin, blood urea nitrogen, and C3 levels. These models exhibited encouraging performance, with AUC values of 0.88, 0.88, 0.88, and 0.87 for XGBoost, LASSO, logistic regression, and RF, respectively.
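For readers less familiar with the metrics reported above, the following Python sketch shows how accuracy (with a 95% CI), sensitivity, specificity, and AUC can be computed. The confusion-matrix counts and scores are hypothetical toy values, not study data. The Wilson score interval is an assumption of this sketch, since the abstract does not state which CI method was used.

```python
from statistics import NormalDist

def classification_metrics(tp, fn, tn, fp, level=0.95):
    """Sensitivity, specificity, accuracy, and a Wilson CI for accuracy."""
    n = tp + fn + tn + fp
    acc = (tp + tn) / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    denom = 1 + z * z / n
    centre = (acc + z * z / (2 * n)) / denom
    half = z * ((acc * (1 - acc) / n + z * z / (4 * n * n)) ** 0.5) / denom
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "accuracy": acc,
        "accuracy_ci": (centre - half, centre + half),
    }

def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical toy values for illustration only.
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(metrics, toy_auc)
```

The pairwise AUC above is quadratic in the number of samples, which is fine for illustration; a rank-based computation is preferable for large test sets.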
Conclusion: The knowledge-driven approach based on clinical expertise outperformed the unsupervised data-driven approach, which relied solely on feature selection methods. Using five routine clinical parameters, we developed a robust and highly accurate prediction tool for forecasting renal flares in patients with SLE. Our ML-based model holds substantial value in guiding clinical decision-making to personalise patient management and has potential for practical application in clinical settings.
To cite this abstract in AMA style: Cetrez N, Lindblom J, Da Mutten R, Nikolopoulos D, Parodis I. Machine Learning Approaches for Prediction of Renal Flares in Systemic Lupus Erythematosus: Knowledge-Driven Models Outperformed Data-Driven Models [abstract]. Arthritis Rheumatol. 2023; 75 (suppl 9). https://acrabstracts.org/abstract/machine-learning-approaches-for-prediction-of-renal-flares-in-systemic-lupus-erythematosus-knowledge-driven-models-outperformed-data-driven-models/. Accessed .
« Back to ACR Convergence 2023
ACR Meeting Abstracts - https://acrabstracts.org/abstract/machine-learning-approaches-for-prediction-of-renal-flares-in-systemic-lupus-erythematosus-knowledge-driven-models-outperformed-data-driven-models/