ACR Meeting Abstracts

Abstract Number: 1195

NLP-Based Clustering Methods Can Efficiently Categorize Scientific Abstracts for Medical Conferences

Jeffrey Curtis1, Yujie Su2, Fenglong Xie2, Cassie Clinton2, Janet Pope3, Vivian Bykerk4, Kenneth Saag2, Josef Smolen5, Dan Furst6 and Lauren Davis7, 1Division of Clinical Immunology and Rheumatology, Department of Medicine, Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, 2University of Alabama at Birmingham, Birmingham, AL, 3University of Western Ontario, London, ON, Canada, 4Division of Rheumatology, Hospital for Special Surgery, New York City, NY, 5Medical University of Vienna, Vienna, Austria, 6University of California Los Angeles, Los Angeles, CA, 7American College of Rheumatology, Atlanta, GA

Meeting: ACR Convergence 2021

Keywords: Education

Session Information

Date: Monday, November 8, 2021

Title: Professional Education Poster (1170–1195)

Session Type: Poster Session C

Session Time: 8:30AM-10:30AM

Background/Purpose: Expanding scientific discovery has made it increasingly challenging to organize and categorize medical knowledge. Dozens or hundreds of abstracts are submitted each year to more than 40 American College of Rheumatology (ACR) Convergence Annual Meeting Abstract Submission Categories. Numerous ACR abstract review teams are then charged with grouping similar types of abstracts into the many poster and oral sessions (i.e., subcategories) to be presented across Convergence meeting days. This manual grouping process relies on subject matter expertise to identify similar content across abstracts and is exceedingly time-consuming. We developed and implemented an automated approach to sub-categorize similar abstracts within each ACR Abstract Submission Category to increase the efficiency of the grouping process.

Methods: The corpus of all accepted abstracts in seven of the largest 2020 ACR Convergence Abstract Categories was parsed and processed using natural language processing (NLP) tools from the National Library of Medicine. After filtering stop words, parsing the data, and applying n-grams for tokenization, a bag-of-words approach was used to identify all terms and multi-word concepts in both the title and body of all abstracts. We counted term and concept frequencies, weighted by the inverse of their frequency across all abstracts. Concepts were also tagged with their semantic type using the Unified Medical Language System (UMLS). K-means clustering was used to derive abstract category subclusters, optimizing the cluster convergence criterion (CCC) metric to identify the optimal number of subclusters (i.e., abstract session subcategories). An iterative, automated approach was subsequently applied that required the clustering algorithm to select the next number of clusters (also based on the CCC) if the first solution did not meet constraints defined by varying parameters on the size of the subcategories (i.e., the min/max number of clusters and the min/max number of abstracts per cluster).
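
The term-weighting step described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline (the abstract does not give implementation details, and the NLM tools and UMLS tagging are not reproduced here). The stop-word list, the function names `tokenize` and `tfidf`, the sample texts, and the specific weight tf × log(N / df) are all assumptions chosen for illustration; the abstract states only that frequencies were weighted by the inverse of their frequency across all abstracts.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the list used in the study).
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "with"}

def tokenize(text, max_n=2):
    """Lowercase, drop stop words, and emit all 1..max_n word n-grams.

    N-grams are built after stop-word removal, so a bigram may span a
    removed stop word.
    """
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower())
             if w not in STOP_WORDS]
    terms = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            terms.append(" ".join(words[i:i + n]))
    return terms

def tfidf(documents, max_n=2):
    """Return one {term: weight} dict per document.

    weight = term count in the document * log(N / number of documents
    containing the term). This is one common TF-IDF variant, assumed here.
    """
    counts = [Counter(tokenize(doc, max_n)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each document counts once per term
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]

# Hypothetical abstract titles, for illustration only.
abstracts = [
    "Treatment of rheumatoid arthritis with biologics",
    "Gout flares and urate lowering treatment",
    "Imaging in rheumatoid arthritis",
]
weights = tfidf(abstracts)
```

Terms that appear in few documents (here, "gout") receive higher weights than terms shared across documents (here, "rheumatoid arthritis"). The resulting weighted vectors are the kind of input a k-means step would cluster.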

Results: A total of 840 abstracts distributed across 7 ACR 2020 Convergence categories were analyzed and yielded 156,778 unique concepts derived from 24,990 unique terms. For each of the 7 Abstract Submission Categories and with no constraints applied, the method yielded 6–14 subcategory clusters. The minimum and maximum number of abstracts per subcategory cluster ranged from 6 to 18 (Table). Applying additional constraints on both the number of clusters and the min/max number of abstracts per cluster yielded convergence within 1–4 iterations.
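
The constrained, iterative selection of the number of clusters can be sketched as follows. This is only an illustration under stated assumptions: the function name `pick_solution`, the candidate scores, and the cluster sizes are hypothetical, the criterion values are taken as precomputed (the CCC itself is not implemented), and a higher score is assumed to be better.

```python
def pick_solution(candidates, min_k, max_k, min_size, max_size):
    """Pick the best-scoring cluster count that satisfies all constraints.

    candidates maps k -> (score, list of cluster sizes). Solutions are tried
    from best to worst score (higher is assumed better). Returns
    (k, iterations tried), or (None, iterations tried) if none qualifies.
    """
    ranked = sorted(candidates, key=lambda k: candidates[k][0], reverse=True)
    for iteration, k in enumerate(ranked, start=1):
        _, sizes = candidates[k]
        k_ok = min_k <= k <= max_k
        sizes_ok = all(min_size <= s <= max_size for s in sizes)
        if k_ok and sizes_ok:
            return k, iteration
    return None, len(ranked)

# Hypothetical solutions for one category of 120 abstracts.
candidates = {
    6: (4.1, [30, 25, 20, 20, 15, 10]),          # best score, clusters too large
    8: (3.7, [18, 17, 16, 15, 15, 14, 13, 12]),  # meets the size limits
    10: (3.2, [12] * 10),
}
result = pick_solution(candidates, min_k=6, max_k=14, min_size=6, max_size=18)
```

In this made-up example the top-scoring solution violates the maximum cluster size, so the procedure moves to the next-best number of clusters and stops on the second iteration.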

Conclusion: Clustering methods combined with NLP tools can greatly reduce the time spent by ACR Convergence meeting review teams and are applicable to other scientific meetings, either to automatically subgroup abstracts into sessions or to pre-categorize them as a basis for further manual refinement.


Disclosures: J. Curtis, AbbVie, 2, Amgen, 2, 5, Bristol-Myers Squibb, 2, Janssen, 2, Eli Lilly, 2, Myriad, 2, Pfizer Inc, 2, 5, Roche/Genentech, 2, UCB, 2, CorEvitas, 2, 5, Crescendo Bio, 5; Y. Su, None; F. Xie, None; C. Clinton, None; J. Pope, AbbVie, 2, Amgen, 2, Bayer, 2, Bristol-Myers Squibb, 2, 5, Eli Lilly, 2, Merck, 2, Novartis, 2, Pfizer Inc, 2, Roche, 2, 5, Sanofi, 2, Seattle Genetics, 5, UCB, 2, 5, Actelion, 2, Sandoz, 2; V. Bykerk, National Institutes of Health, 1, 5, Canadian Institutes of Health Research, 5, Amgen, 2, 5, BMS, Celgene, 2, 6, Gilead, 2, Sanofi, 2, 6, Regeneron, 2, Eli Lilly and Company, 6, Pfizer, 6, UCB, 6; K. Saag, Arthrosi, 2, Atom Bioscience, 2, Horizon Therapeutics, 2, 5, LG Pharma, 2, Mallinckrodt, 2, SOBI, 2, 5, Takeda, 2, Shanton, 5; J. Smolen, AbbVie, 2, 5, BMS, 2, 5, Celgene, 2, 5, Chugai, 2, 5, Gilead, 2, 5, Janssen, 2, 5, Eli Lilly, 2, 5, MSD, 2, 5, Novartis-Sandoz, 2, 5, Pfizer, 2, 5, Roche, 2, 5, Samsung, 2, 5, Sanofi, 2, 5, UCB, 2, 5; D. Furst, Actelion, 2, 5, Amgen, 2, 5, BMS, 2, 5, Corbus, 2, 6, Galapagos, 2, 5, GSK, 6, Sanofi, 2, 5, 6, Roche/Genentech, 5, National Institutes of Health, 5, Novartis, 2, 5, Pfizer, 2, 5; L. Davis, None.

To cite this abstract in AMA style:

Curtis J, Su Y, Xie F, Clinton C, Pope J, Bykerk V, Saag K, Smolen J, Furst D, Davis L. NLP-Based Clustering Methods Can Efficiently Categorize Scientific Abstracts for Medical Conferences [abstract]. Arthritis Rheumatol. 2021; 73 (suppl 9). https://acrabstracts.org/abstract/nlp-based-clustering-methods-can-efficiently-categorize-scientific-abstracts-for-medical-conferences/. Accessed .