ACR Meeting Abstracts

Abstract Number: 2910

Improving the Efficiency of Clinical Trial Recruitment Using Electronic Health Record Data, Natural Language Processing, and Machine Learning

Tianrun Cai¹, Fiona Cai², Kumar Dahal³, Chuan Hong⁴ and Katherine Liao¹; ¹Brigham and Women's Hospital, Boston; ²Stuyvesant High School, New York; ³Brigham and Women's Hospital, Boston, MA; ⁴Harvard Medical School, Boston, MA

Meeting: 2019 ACR/ARP Annual Meeting

Keywords: clinical trials, Electronic Health Record, recruitment and rheumatoid arthritis (RA)

Session Information

Date: Wednesday, November 13, 2019

Session Title: 6W021: RA – Treatments V: Switching & Tapering RA Medications (2906–2911)

Session Type: ACR Abstract Session

Session Time: 11:00AM-12:30PM

Background/Purpose: Efficiently identifying eligible patients is an important component of a successful clinical trial. Billing codes from electronic health record (EHR) data are commonly used to first screen for potential patients, followed by labor-intensive chart review to identify the patients who meet the trial criteria. The objective of this study was to test whether a machine learning screening algorithm (ML-screen) incorporating ICD codes and data extracted from notes using natural language processing (NLP) could improve the efficiency of identifying eligible patients for an ongoing clinical trial.

Methods: We studied EHR data used for a clinical recruitment study of rheumatoid arthritis (RA) and cardiovascular disease recruiting from a tertiary care center (TCC) and a community hospital (CH). The target population was RA patients aged >35 years who were about to initiate a tumor necrosis factor inhibitor and were not on a statin. Prior to this study, all patients with ≥1 RA ICD code (RAICD) and age >35 years were selected for chart review. The CH and TCC data sets, comprising 642 and 2387 patients, respectively, were both manually reviewed to provide gold standard labels. All notes were processed with NLP to obtain the number of mentions of the concepts of RA and inflammatory arthritis. Three groups of features were considered for the ML-screen (Table 1): (1) inclusion criteria features, e.g. RAICD; (2) exclusion criteria features, e.g. the number of electronic prescriptions for a statin; (3) the total number of ICD codes as a proxy for healthcare utilization. For the ML-screen we considered features within a 2-year timeframe prior to the chart review as well as across all prior years. The ML-screen combined two ML methods, random forest (RF) and penalized logistic regression. The goal for the ML-screen was to reduce the number of patients requiring chart review without excluding potentially eligible patients. The ML-screen was compared to alternative approaches using RAICD ≥1, RAICD ≥2, and RAICD ≥1 plus exclusion criteria features. To test whether the ML-screen can be successfully ported to other institutions, we trained it at TCC and applied it at CH, and vice versa.
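
The screening goal described in the Methods (fewer charts to review, no eligible patient excluded) can be illustrated with a minimal sketch. The abstract does not state how the random forest and penalized logistic regression outputs were combined or how a cutoff was chosen, so the averaging rule, the threshold rule, the function names, and the toy scores below are all assumptions made for illustration, not the authors' implementation.

```python
from math import ceil


def combine_scores(rf_probs, lr_probs):
    """Average two models' predicted probabilities.

    Assumption: simple averaging; the abstract does not specify the rule.
    """
    return [(a + b) / 2 for a, b in zip(rf_probs, lr_probs)]


def pick_threshold(scores, labels, target_sensitivity=1.0):
    """Highest cutoff that still keeps the target share of eligible patients."""
    eligible = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not eligible:
        raise ValueError("no eligible patients in the training labels")
    keep = max(1, ceil(target_sensitivity * len(eligible)))
    return eligible[keep - 1]


def evaluate(scores, labels, threshold):
    """Return (sensitivity, fraction of charts that no longer need review)."""
    flagged = [s >= threshold for s in scores]
    n_eligible = sum(labels)
    kept = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    sensitivity = kept / n_eligible if n_eligible else float("nan")
    reduction = 1 - sum(flagged) / len(scores)
    return sensitivity, reduction


# Hypothetical "training site" scores and gold-standard labels (1 = eligible).
site_a_scores = combine_scores([0.9, 0.8, 0.2, 0.1, 0.6, 0.3],
                               [0.7, 0.6, 0.4, 0.1, 0.4, 0.1])
site_a_labels = [1, 1, 0, 0, 1, 0]
threshold = pick_threshold(site_a_scores, site_a_labels)

# Apply the cutoff learned at site A to a hypothetical second site.
site_b_scores = combine_scores([0.7, 0.2, 0.9, 0.4], [0.5, 0.2, 0.7, 0.2])
site_b_labels = [1, 0, 1, 0]

print(evaluate(site_a_scores, site_a_labels, threshold))
print(evaluate(site_b_scores, site_b_labels, threshold))
```

Fixing the cutoff at the training site and reusing it unchanged elsewhere mirrors the train-at-TCC, apply-at-CH design only in spirit; in practice the model scores would come from fitted classifiers rather than hand-written lists.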

Results: The current method of reviewing all charts with RAICD ≥1 yielded 346 (14.5%) eligible patients out of 2387 at TCC, and 74 (16.0%) out of 642 at CH. Applying the ML-screen would result in reviewing 33% fewer patients at TCC and 44% fewer at CH, compared to RAICD ≥1, without screening out potentially eligible patients (Table 2). In contrast, RAICD ≥2 had high sensitivity (0.93-0.98) but reduced the number of patients for chart review by only 2.7-11.3%. RAICD ≥1 plus exclusion criteria yielded a larger reduction of patients for review (63-65%) but excluded approximately 22-27% of eligible patients. The ML-screen had similar performance when trained on one institution and tested on the other (Table 3).
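
To make the efficiency gain concrete, the TCC figures above can be turned into a rough yield calculation. This is a back-of-the-envelope sketch, assuming the reported 33% reduction applies to the 2387 baseline charts and that all 346 eligible patients are retained, as the abstract states; the 33% figure is rounded, so the derived counts are approximate, and the function names are hypothetical.

```python
def screening_yield(n_eligible, n_reviewed):
    """Share of reviewed charts that turn out to be eligible."""
    return n_eligible / n_reviewed


def charts_after_reduction(n_baseline, fractional_reduction):
    """Charts left to review after removing a given fraction of the baseline."""
    return n_baseline - round(n_baseline * fractional_reduction)


# Figures reported for the tertiary care center (TCC).
tcc_baseline_charts = 2387
tcc_eligible = 346
tcc_reduction = 0.33  # rounded value from the abstract

baseline_yield = screening_yield(tcc_eligible, tcc_baseline_charts)
charts_with_screen = charts_after_reduction(tcc_baseline_charts, tcc_reduction)
# Assumes no eligible patient is screened out (sensitivity of 1.0).
yield_with_screen = screening_yield(tcc_eligible, charts_with_screen)

print(f"baseline yield: {baseline_yield:.1%}")
print(f"charts to review with screen: {charts_with_screen}")
print(f"approximate yield with screen: {yield_with_screen:.1%}")
```

Under these assumptions, roughly one in five reviewed charts would be eligible instead of about one in seven.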

Conclusion: The ML-screen incorporating EHR and NLP data can increase the efficiency of clinical trial recruitment by reducing the number of patients requiring chart review; importantly, this approach did not screen out eligible patients.  Moreover, the ML-screen can be trained at one institution and applied at another for multi-center clinical trials.


Table 1. Features used in the ML-screen for clinical trial recruitment.

Table 2. Comparison of performance between a screen developed using machine learning vs ICD only screens

Table 3. Comparison of performance for MLS algorithm across institutions


Disclosure: T. Cai, None; F. Cai, None; K. Dahal, None; C. Hong, None; K. Liao, None.

To cite this abstract in AMA style:

Cai T, Cai F, Dahal K, Hong C, Liao K. Improving the Efficiency of Clinical Trial Recruitment Using Electronic Health Record Data, Natural Language Processing, and Machine Learning [abstract]. Arthritis Rheumatol. 2019; 71 (suppl 10). https://acrabstracts.org/abstract/improving-the-efficiency-of-clinical-trial-recruitment-using-electronic-health-record-data-natural-language-processing-and-machine-learning/. Accessed February 6, 2023.
© COPYRIGHT 2023 AMERICAN COLLEGE OF RHEUMATOLOGY
