ACR Meeting Abstracts

Abstract Number: 2910

Improving the Efficiency of Clinical Trial Recruitment Using Electronic Health Record Data, Natural Language Processing, and Machine Learning

Tianrun Cai¹, Fiona Cai², Kumar Dahal³, Chuan Hong⁴ and Katherine Liao¹, ¹Brigham and Women's Hospital, Boston, ²Stuyvesant High School, New York, ³Brigham and Women's Hospital, Boston, MA, ⁴Harvard Medical School, Boston, MA

Meeting: 2019 ACR/ARP Annual Meeting

Keywords: clinical trials, Electronic Health Record, recruitment and rheumatoid arthritis (RA)

Session Information

Date: Wednesday, November 13, 2019

Title: 6W021: RA – Treatments V: Switching & Tapering RA Medications (2906–2911)

Session Type: ACR Abstract Session

Session Time: 11:00AM-12:30PM

Background/Purpose: Efficiently identifying eligible patients is an important component of a successful clinical trial. Billing codes from electronic health record (EHR) data are commonly used to first screen for potential patients, followed by labor-intensive chart review to identify the patients who meet the trial criteria. The objective of this study was to test whether a machine learning screening algorithm (ML-screen) incorporating ICD codes and data extracted from notes using natural language processing (NLP) could improve the efficiency of identifying eligible patients for an ongoing clinical trial.

Methods: We studied EHR data used to recruit for a clinical study of rheumatoid arthritis (RA) and cardiovascular disease at a tertiary care center (TCC) and a community hospital (CH). The target population was RA patients aged >35 years who were about to initiate a tumor necrosis factor inhibitor and were not on a statin. Prior to this study, all patients with ≥1 RA ICD code (RAICD) and age >35 years were selected for chart review. Both data sets were manually reviewed to provide gold-standard labels: 642 patients at CH and 2387 at TCC. All notes were processed with NLP to obtain the number of mentions of the concepts of RA and inflammatory arthritis. Three groups of features were considered for the ML-screen (Table 1): (1) inclusion criteria features, e.g. RAICD; (2) exclusion criteria features, e.g. the number of electronic prescriptions for a statin; (3) the total number of ICD codes as a proxy for healthcare utilization. For the ML-screen we considered features within the 2 years prior to the chart review as well as over all prior years. The ML-screen combined two ML methods, random forest (RF) and penalized logistic regression. The goal of the ML-screen was to reduce the number of patients requiring chart review without excluding potentially eligible patients. The ML-screen was compared with alternative approaches using RAICD ≥1, RAICD ≥2, and RAICD ≥1+exclusion criteria features. To test whether the ML-screen can be successfully ported to other institutions, we trained it at TCC and applied it at CH, and vice versa.
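The rule-based comparators and the two quantities they are judged on (sensitivity against the gold-standard labels, and the reduction in charts to review relative to RAICD ≥1) can be illustrated with a minimal sketch. Everything below is hypothetical: the `Patient` fields, the toy cohort, and the choice to treat "exclusion criteria features" as "any statin prescription on record" are assumptions for illustration, not the authors' implementation, and the ML-screen itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    ra_icd_count: int     # number of RA ICD codes (RAICD); hypothetical field
    statin_rx_count: int  # electronic statin prescriptions; hypothetical field
    eligible: bool        # gold-standard label from manual chart review


def screen_raicd(min_codes):
    """Return a rule that keeps patients with at least `min_codes` RA ICD codes."""
    return lambda p: p.ra_icd_count >= min_codes


def screen_raicd_plus_exclusion(p):
    """RAICD >= 1 and no statin prescription (assumed form of the exclusion rule)."""
    return p.ra_icd_count >= 1 and p.statin_rx_count == 0


def evaluate(patients, keep):
    """Score a screening rule against the chart-review labels."""
    kept = [p for p in patients if keep(p)]
    n_eligible = sum(p.eligible for p in patients)
    n_eligible_kept = sum(p.eligible for p in kept)
    return {
        "reviewed": len(kept),
        # share of truly eligible patients that survive the screen
        "sensitivity": n_eligible_kept / n_eligible,
        # share of charts that no longer need manual review
        "reduction": 1 - len(kept) / len(patients),
    }


# Toy cohort (made up): every patient already has RAICD >= 1, as in the baseline.
cohort = [
    Patient(1, 0, False),
    Patient(3, 0, True),
    Patient(2, 1, True),   # eligible despite a statin prescription on record
    Patient(1, 2, False),
    Patient(5, 0, True),
    Patient(1, 0, True),   # eligible with only a single RA code
]

baseline = evaluate(cohort, screen_raicd(1))
two_codes = evaluate(cohort, screen_raicd(2))
with_exclusion = evaluate(cohort, screen_raicd_plus_exclusion)

for name, result in [("RAICD>=1", baseline),
                     ("RAICD>=2", two_codes),
                     ("RAICD>=1+exclusion", with_exclusion)]:
    print(name, result)
```

In this toy cohort both stricter rules cut the review workload but lose an eligible patient, which is the trade-off the ML-screen was designed to avoid.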

Results: The current method, reviewing all charts with RAICD ≥1, yielded 346 (14.5%) eligible patients out of 2387 at TCC and 74 (16.0%) out of 642 at CH. Applying the ML-screen would result in reviewing 33% fewer patients at TCC and 44% fewer at CH, compared with RAICD ≥1, without screening out potentially eligible patients (Table 2). In contrast, RAICD ≥2 had high sensitivity (0.93-0.98) but reduced the number of patients for chart review by only 2.7-11.3%. RAICD ≥1+exclusion yielded a larger reduction in patients for review (63-65%) but excluded approximately 22-27% of eligible patients. The ML-screen had similar performance when trained at one institution and tested at the other (Table 3).
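As a rough worked example, the reported reductions can be translated into chart counts. The sketch below takes the reported patient and eligible counts at face value, assumes (as the abstract states) that no eligible patient is screened out, and treats the 33% and 44% figures as exact even though they are rounded, so the outputs are approximations. The function name is hypothetical.

```python
def after_screen(n_baseline, n_eligible, reduction):
    """Approximate charts left to review, and the share of them that are eligible.

    Assumes every eligible patient is retained by the screen.
    """
    n_reviewed = round(n_baseline * (1 - reduction))
    return n_reviewed, n_eligible / n_reviewed


# Reported baseline counts (RAICD >= 1) and ML-screen reductions.
tcc_reviewed, tcc_yield = after_screen(2387, 346, 0.33)
ch_reviewed, ch_yield = after_screen(642, 74, 0.44)

print(f"TCC: {tcc_reviewed} charts to review, yield {tcc_yield:.1%}")
print(f"CH:  {ch_reviewed} charts to review, yield {ch_yield:.1%}")
```

Under these assumptions, roughly 1599 of the 2387 TCC charts and 360 of the 642 CH charts would still need review, and about one in five reviewed charts would belong to an eligible patient.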

Conclusion: The ML-screen incorporating EHR and NLP data can increase the efficiency of clinical trial recruitment by reducing the number of patients requiring chart review; importantly, this approach did not screen out eligible patients.  Moreover, the ML-screen can be trained at one institution and applied at another for multi-center clinical trials.


Table 1. Features used in the ML-screen for clinical trial recruitment.


Table 2. Comparison of performance between a screen developed using machine learning vs ICD only screens


Table 3. Comparison of performance for MLS algorithm across institutions


Disclosure: T. Cai, None; F. Cai, None; K. Dahal, None; C. Hong, None; K. Liao, None.

To cite this abstract in AMA style:

Cai T, Cai F, Dahal K, Hong C, Liao K. Improving the Efficiency of Clinical Trial Recruitment Using Electronic Health Record Data, Natural Language Processing, and Machine Learning [abstract]. Arthritis Rheumatol. 2019; 71 (suppl 10). https://acrabstracts.org/abstract/improving-the-efficiency-of-clinical-trial-recruitment-using-electronic-health-record-data-natural-language-processing-and-machine-learning/. Accessed .