ACR Meeting Abstracts

Abstract Number: 0492

A Machine Learning Model for the Early Identification of Rheumatoid Arthritis: Development and Validation

Michael Dreyfuss1, Yonatan Jenudi2, Dan Riesel3, Or Ramni3, Daniel Underberger4, Benjamin Getz5, Shlomit Steinberg-Koch3, Douglas White6 and Elena Myasoedova7, 1Predicta Med, Ramat Gan, Israel, 2[email protected], Ramat Gan, Israel, 3Predicta Med Analytics Ltd., Ramat Gan, Israel, 4Predicta Med Analytics Ltd., Bridgeport, CT, 5CTO, Ramat Gan, Israel, 6Gundersen Health System, Onalaska, WI, 7Mayo Clinic, Rochester, MN

Meeting: ACR Convergence 2024

Keywords: autoimmune diseases, Bioinformatics, rheumatoid arthritis, risk assessment, Statistical methods

Session Information

Date: Saturday, November 16, 2024

Title: RA – Diagnosis, Manifestations, & Outcomes Poster I

Session Type: Poster Session A

Session Time: 10:30AM-12:30PM

Background/Purpose: Patients with rheumatoid arthritis (RA) often experience clinically significant delays in diagnosis (Sørensen et al., 2015; Raza et al., 2011). RA can present similarly to other types of inflammatory arthritis and can therefore be challenging for a primary care physician (PCP) to recognize (Saraiva et al., 2023). Indeed, the longest delay falls between initial presentation to the PCP and evaluation by rheumatology (Barhamian et al., 2017; Stack et al., 2019). Digital tools such as machine learning algorithms have the potential to help physicians identify patients with undiagnosed RA earlier in the course of disease. In this study we describe the development and validation of a novel machine learning model for identifying patients in the community who may be at risk of having undiagnosed RA.

Methods: Patients from the community population (n=395,918) at Mayo Clinic between 2012 and 2022 were split into training and validation sets. Cases with RA and controls with no evidence of RA were identified in both sets. Prediction dates for model training and evaluation were set at six-month intervals, on the 1st of January and the 1st of July of each year. Each case was assigned to the prediction date directly preceding the autoantibody testing that came before the first diagnosis of RA. This design was chosen to expose the model to information preceding clinical suspicion for disease. Controls were randomly assigned to prediction dates based on data eligibility. A gradient-boosted trees algorithm was trained using electronic medical record (EMR) data documented during the two years prior to each patient’s prediction date. Input features included information from the structured data (age, sex, diagnosis codes, medication prescriptions and laboratory results), as well as symptoms and signs that were extracted from clinical notes by natural language processing (NLP). The model was then evaluated on the validation set, and area under the curve (AUC) was used to assess the model’s ability to discriminate between new cases of RA and controls.
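The prediction-date scheme described in the Methods can be illustrated with a minimal sketch. This is not the authors' code. The function names `preceding_prediction_date` and `lookback_window` are hypothetical, and the handling of a test that falls exactly on a prediction date (it is mapped to the earlier date) is an assumption the abstract does not specify.

```python
from datetime import date


def preceding_prediction_date(event: date) -> date:
    """Return the latest 1 January or 1 July strictly before `event`.

    Assumption: an event falling exactly on a prediction date is assigned
    to the previous prediction date, so the model never sees that day.
    """
    january = date(event.year, 1, 1)
    july = date(event.year, 7, 1)
    if event > july:
        return july
    if event > january:
        return january
    # Event is on 1 January: fall back to 1 July of the previous year.
    return date(event.year - 1, 7, 1)


def lookback_window(prediction_date: date, years: int = 2) -> tuple[date, date]:
    """Return (start, end) of the feature window; `end` is exclusive.

    Prediction dates are always the 1st of a month, so shifting the year
    cannot produce an invalid calendar date.
    """
    start = prediction_date.replace(year=prediction_date.year - years)
    return start, prediction_date


# Hypothetical case: autoantibody test ordered on 15 March 2019.
test_date = date(2019, 3, 15)
prediction = preceding_prediction_date(test_date)
window = lookback_window(prediction)
```

In this hypothetical example the case is assigned to 1 January 2019, and only EMR data from 1 January 2017 up to that date would feed the model, which keeps information recorded after clinical suspicion arose out of the feature window.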

Results: The validation set included 145 patients with RA (108 females; mean age 55.7, standard deviation [SD] 16.3) and 17,702 control patients (9,758 females; mean age 49.3, SD 17.0). The AUC on the validation set was 77.9% (Fig. 1). Symptoms and signs documented in clinical notes and diagnosis codes were important predictive features, including arthritis, pain and swelling in various joints, enthesopathies and synovitis (Fig. 2). Additional contributing features included elevated inflammatory markers and glucocorticoid and NSAID use.
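For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Compute AUC as the fraction of (case, control) pairs ranked correctly.

    `labels` holds 1 for cases and 0 for controls; `scores` holds the model
    output for each patient. Tied pairs contribute one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")

    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Toy data (hypothetical): two cases and three controls.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a validation set of this size (145 cases against 17,702 controls is about 2.6 million comparisons); a rank-based formulation would scale better on larger cohorts.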

Conclusion: The model displayed good performance (validation AUC 77.9%) in its ability to discriminate between cases of RA and controls. Implementation of the model may help PCPs identify undiagnosed RA in the primary care population using existing data from the EMR. Shortening the time to diagnosis could help patients receive treatment earlier and reduce downstream sequelae from untreated disease. Features from both structured and unstructured data contributed to model performance. The important contribution of features extracted by NLP from clinical documents suggests that further improvements in model performance may come from refined NLP techniques.

Supporting image 1

Receiver operating characteristic (ROC) curve displaying the discriminative ability of the model over all thresholds.

Supporting image 2

SHAP plot displaying the top 25 contributing features to the model. Features describing clinical signs and symptoms are extracted from clinical texts unless explicitly indicated that they are from diagnosis codes (‘dx’). Female is coded as 1 for sex.


Disclosures: M. Dreyfuss: Predicta Med Analytics Ltd., 3, 8; Y. Jenudi: Predicta Med Analytics Ltd., 3, 8; D. Riesel: Predicta Med Analytics Ltd., 1, 3, 8, 8; O. Ramni: Predicta Med Analytics Ltd., 3, 8; D. Underberger: Predicta Med Analytics Ltd., 3, 8; B. Getz: Predicta Med Analytics Ltd., 3, 8; S. Steinberg-Koch: Predicta Med Analytics Ltd., 3, 8; D. White: Predicta Med Analytics Ltd., 1, 8; E. Myasoedova: None.

To cite this abstract in AMA style:

Dreyfuss M, Jenudi Y, Riesel D, Ramni O, Underberger D, Getz B, Steinberg-Koch S, White D, Myasoedova E. A Machine Learning Model for the Early Identification of Rheumatoid Arthritis: Development and Validation [abstract]. Arthritis Rheumatol. 2024; 76 (suppl 9). https://acrabstracts.org/abstract/a-machine-learning-model-for-the-early-identification-of-rheumatoid-arthritis-development-and-validation/. Accessed .

© Copyright 2025 American College of Rheumatology