Date: Monday, November 9, 2020
Session Type: Poster Session D
Session Time: 9:00AM-11:00AM
Background/Purpose: In the ACTION (NCT02109666) study, multivariable Cox proportional hazards regression models identified patient (pt) global assessment of pain, country, reason for stopping last biologic, number of prior biologic treatments (txs), abatacept (ABA) monotherapy, RF/anti-CCP status, previous neoplasms, psychiatric disorders and cardiac disorders as predictors of 1-year retention of ABA tx.1 The objective of our study was to use machine learning as an innovative and complementary approach to identify pts with 1-year ABA retention.
Methods: Supervised learning was used for classification. The outcome was coded as a binary variable (retention=1, no retention=0). Retention was defined as tx for >365 days, or tx for ≤365 days in pts who achieved remission or major clinical response. The number of variables identified and used in the models is shown in Figure 1. A subset of features was selected to prevent under-fitting or over-fitting. Label and OneHot encoding were applied to categorical variables, and MinMax scaling was applied to bring all continuous variables onto the same scale. The models tested for predictive performance were: logistic regression, support vector machine, naïve Bayes, decision tree, random forest, gradient boosting and multi-layer perceptron. For each model, recursive feature elimination with cross-validation was applied. For the best-performing model (gradient boosting), the database was divided into two sets: a training/validation set (n=2021) and a test set (n=329). Accuracy was defined as the number of correct predictions divided by the total number of predictions. Precision, recall and F1-score were estimated for predictions of both retention and no retention and were then weight-averaged to obtain overall performance scores. The importance score of each variable was estimated.
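The performance metrics described above can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names and the toy labels are hypothetical, and the weights are assumed to be each class's share of the true labels, as in the common "weighted" averaging convention (for example, scikit-learn's `average="weighted"` option).

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Number of correct predictions divided by the total number of predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1-score for one class (e.g. retention=1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_scores(y_true, y_pred):
    """Average per-class scores, weighting each class by its share of true labels.

    Assumption: this support-based weighting is what "weight-averaged" means here.
    """
    support = Counter(y_true)
    n = len(y_true)
    totals = [0.0, 0.0, 0.0]
    for label, count in support.items():
        scores = per_class_scores(y_true, y_pred, label)
        for i, score in enumerate(scores):
            totals[i] += score * count / n
    return tuple(totals)


# Hypothetical toy labels (1 = retention, 0 = no retention); not study data.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)
w_precision, w_recall, w_f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={w_precision:.3f} "
      f"recall={w_recall:.3f} f1={w_f1:.3f}")
```

Under this weighting, the weighted recall always equals the overall accuracy, so the weighted precision and F1-score are the figures that add information beyond accuracy.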
Results: In total, 2350 pts included from May 2008 to December 2013 had a mean retention rate of 59.3% at 1 year. The gradient boosting classifier model had the highest prediction accuracy on the test set (67%) and was the most interpretable model for feature importance (Table 1). The importance score for each predictor did not indicate a linear correlation with retention or the direction of the association; rather, the higher the score, the more important the variable for retention prediction. Predictive information was shared between all 51 variables, 8 of which overlapped with those identified in the Cox regression model.1 The five most influential variables were the duration of previous biologic DMARDs, pt global pain assessment, RA duration, physician global disease activity assessment and HAQ-DI.
Conclusion: The gradient-boosting model identified predictors of retention in addition to those identified by multivariable Cox regression models in ACTION.1 The models and predictors identified could be further improved by including other RA datasets. Machine learning offers a complementary approach to biostatistics and may lead to better prediction of tx retention in pts with RA, thereby supporting personalized clinical decision-making in a real-world setting.
1. Alten R, et al. RMD Open 2017;3:e000538.
Medical writing: Claire Line (Caudex)
To cite this abstract in AMA style: Alten R, Behar C, Boileau C, Merckaert P, Afari E, Vannier-Moreau V, Connolly S, Elbez Y, Juge P, Lozenski K. A Novel Method for Predicting 1-Year Retention of Abatacept Using Machine Learning Techniques [abstract]. Arthritis Rheumatol. 2020; 72 (suppl 10). https://acrabstracts.org/abstract/a-novel-method-for-predicting-1-year-retention-of-abatacept-using-machine-learning-techniques/. Accessed January 18, 2022.
ACR Meeting Abstracts - https://acrabstracts.org/abstract/a-novel-method-for-predicting-1-year-retention-of-abatacept-using-machine-learning-techniques/