Session Information
Session Type: ACR Poster Session C
Session Time: 9:00AM-11:00AM
Background/Purpose:
Selecting the best model in an epidemiologic analysis is challenging, yet important, because the chosen model must address problems such as confounding to yield unbiased estimates. Stepwise selection is the most commonly used method but also the most criticized, as it relies on an arbitrary threshold, the probability of removal, to decide which variables are included in the model. Modern shrinkage techniques such as the Least Absolute Shrinkage and Selection Operator (LASSO) may address this issue. Using a rheumatology patient registry, we compared the LASSO method with traditional model selection techniques, including stepwise selection and change-in-estimate.
Methods:
LASSO can be used with several statistical models, including generalized linear models and proportional hazards models. In the latter, it maximizes the partial likelihood of the regression coefficients subject to a constraint on the sum of the absolute values of all regression coefficients. It excludes variables without formal statistical testing by correcting the extremes in the distribution and shrinking unstable estimates to zero. The constraint can be estimated via cross-validation. We illustrate this technique in a sample of patients enrolled in the National Data Bank for Rheumatic Diseases from 2001 to 2016. We applied survival methods to assess the risk of serious infections (SI) in patients with rheumatoid arthritis (RA) compared to non-inflammatory rheumatic disease (NIRD) controls. Variables included demographics, clinical status, disease severity and prednisone use.
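The constrained maximization described above can be written out as follows. This is a sketch in the standard notation for the LASSO in the proportional hazards model, not a formula taken from the abstract: the symbols $x_i$ (covariate vector), $\delta_i$ (event indicator), $R(t_i)$ (risk set at event time $t_i$), $s$ (constraint bound) and $\lambda$ (penalty weight) are all assumptions introduced here for illustration.

```latex
% Cox log partial likelihood (no tied event times assumed)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[ x_i^{\top}\beta - \log \sum_{k \in R(t_i)} \exp\bigl(x_k^{\top}\beta\bigr) \Bigr]

% LASSO estimate: maximize \ell subject to an L1 constraint
\hat{\beta} = \arg\max_{\beta}\ \ell(\beta)
  \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le s

% Equivalent penalized (Lagrangian) form
\hat{\beta} = \arg\max_{\beta}\ \Bigl\{ \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \Bigr\}
```

A smaller $s$ (equivalently, a larger $\lambda$) forces more coefficients to exactly zero, which is how variables drop out of the model; cross-validation is the usual way to pick the value.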
Results:
A total of 20,361 RA and 6,176 NIRD patients contributed 81,499 and 20,665 patient-years of exposure and had 1,600 (7.9%) and 276 (4.5%) SI, respectively (incidence rate ratio: 1.5 [1.3-1.7]). Baseline characteristics by disease (RA:NIRD) were age (58:63 yrs), female sex (79:80%) and HAQ (1.11:1.10). The LASSO HR of SI comparing RA vs NIRD was 1.15 (0.99-1.32); prednisone, vaccination, and prior infections were the variables with the highest impact (Figure). Similar results were obtained with stepwise selection, which selected almost the same covariates. However, different thresholds yielded slightly different models (Table).
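The crude incidence rate ratio reported above can be checked from the event counts and patient-years. The sketch below assumes a Wald-type 95% interval on the log scale with Poisson-distributed counts; the abstract does not state how its interval was computed, and the function name `incidence_rate_ratio` is hypothetical.

```python
import math


def incidence_rate_ratio(events_a, py_a, events_b, py_b, z=1.96):
    """Crude incidence rate ratio (group A vs group B) with a Wald CI.

    Assumes Poisson event counts, so the standard error of log(IRR)
    is sqrt(1/events_a + 1/events_b). This is an illustrative choice,
    not necessarily the method used in the abstract.
    """
    rate_a = events_a / py_a  # events per patient-year, group A
    rate_b = events_b / py_b  # events per patient-year, group B
    irr = rate_a / rate_b
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Counts from the Results: RA = 1600 SI over 81,499 patient-years,
# NIRD = 276 SI over 20,665 patient-years.
irr, lower, upper = incidence_rate_ratio(1600, 81499, 276, 20665)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Rounded to one decimal place, the result agrees with the reported 1.5 (1.3-1.7). Note that this is the unadjusted comparison; the adjusted hazard ratios from the Cox models are closer to 1.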
Conclusion:
Similar estimates were obtained across methods, and LASSO did not require imposing an arbitrary threshold. Although LASSO has many positive attributes, such as remaining usable when the number of variables exceeds the number of observations, it is still rarely applied in epidemiologic studies. It is a valid alternative to the popular stepwise approach, as it is less variable and yields interpretable models.
Table. Best Cox models selected for serious infections according to several selection techniques.
| Method | HR (RA vs NIRD) | Other variables selected |
|---|---|---|
| LASSO | 1.15 (0.99-1.32) | Ethnicity (white vs. other), sex, smoking status, age, disease duration, residency (urban vs. rural), prior infections, RD comorbidity index, HAQ, pain, education, prednisone, vaccination, diabetes |
| Change-in-estimate (10% change) | 1.13 (0.98-1.30) | Prednisone |
| Change-in-estimate (2% change) | 1.17 (1.01-1.34) | Prednisone, pain, age, education |
| Stepwise (20% removal probability) | 1.14 (1.0-1.31) | Ethnicity (white vs. other), sex, smoking status, age, disease duration, residency (urban vs. rural), prior infections, RD comorbidity index, HAQ, pain, education, prednisone, vaccination, diabetes |
| Stepwise (10% removal probability) | 1.15 (1.0-1.32) | Ethnicity (white vs. other), sex, smoking status, age, disease duration, prior infections, RD comorbidity index, HAQ, pain, education, prednisone, vaccination, diabetes |
Figure. LASSO coefficient estimates as a function of the regularization term.
To cite this abstract in AMA style:
Pedro S, Mehta B, Ozen G, Michaud K. The Lasso Selection Model in Rheumatology Epidemiologic Studies [abstract]. Arthritis Rheumatol. 2017; 69 (suppl 10). https://acrabstracts.org/abstract/the-lasso-selection-model-in-rheumatology-epidemiologic-studies/. Accessed .
Meeting: 2017 ACR/ARHP Annual Meeting
ACR Meeting Abstracts - https://acrabstracts.org/abstract/the-lasso-selection-model-in-rheumatology-epidemiologic-studies/