ACR Meeting Abstracts


Abstract Number: 1914

Performance Comparison of Artificial Intelligence tools ChatGPT, Bing AI, and Google Bard for Clinical Rheumatology Decision Support: When AI Talks Rheumatology

Aakanksha Pitliya¹, Hema Latha Anam², Richard Oletsky², Alexandra Georgiana Boc², Dipabali Chaudhuri² and Rajesh Thirumaran²; ¹Mercy Catholic Medical Center, Darby, PA, Clifton Heights, PA; ²Mercy Catholic Medical Center, Darby, PA

Meeting: ACR Convergence 2025

Keywords: Access to care, comparative effectiveness

Session Information

Date: Tuesday, October 28, 2025

Title: (1914–1935) Health Services Research Poster III

Session Type: Poster Session C

Session Time: 10:30 AM–12:30 PM

Background/Purpose: Artificial intelligence (AI) has shown promise as a tool to assist clinical decision-making. Given the complexity of autoimmune disease and the critical need for guideline-based precision in its management, the relative performance of AI tools in answering specific rheumatology questions remains undetermined. This study evaluated three leading AI language models, ChatGPT (OpenAI), Bing AI (Microsoft), and Google Bard (Google), on clinical questions in rheumatology, focusing on the key domains of accuracy, relevance, response quality, and timeliness.

Methods: Physician reviewers assembled a panel of 50 rheumatology-focused clinical questions. Each question was submitted independently, in plain text, to Google Bard, Bing AI, and ChatGPT. Responses were anonymized and rated by blinded Internal Medicine residents, to avoid bias, on a 5-point Likert scale for quality (clarity, completeness, and educational value), accuracy (clinical correctness and factual precision), and relevance (appropriateness and applicability to the question asked). Response latency (in seconds) was recorded for each model by stopwatch, from prompt submission to completion of the response. Descriptive statistics were calculated, and one-way ANOVA was performed to detect significant differences across the models. Principal Component Analysis (PCA) was used to explore underlying patterns and potential clustering of model responses across the rated dimensions. All statistical analyses were conducted using R (v4.3.2). A p-value < 0.05 was considered statistically significant.
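
As a rough sketch of the workflow just described (not the authors' actual script), the one-way ANOVA step might look like the following in R. The data frame, column names, and scores are all hypothetical; the ChatGPT mean and SD are taken from the results below, while the Bing AI and Bard values are not reported in this abstract and are invented purely for illustration.

  # Hedged illustration of the one-way ANOVA step described above; not the study's code.
  # All data are simulated: 50 Likert accuracy ratings per model, clamped to the 1-5 scale.
  set.seed(42)
  sim_likert <- function(m, s) pmin(5, pmax(1, rnorm(50, mean = m, sd = s)))
  ratings <- data.frame(
    model    = rep(c("ChatGPT", "Bing AI", "Bard"), each = 50),
    accuracy = c(sim_likert(4.36, 0.36),  # ChatGPT mean/SD from the results below
                 sim_likert(3.9, 0.5),    # assumed value, for illustration only
                 sim_likert(3.7, 0.5))    # assumed value, for illustration only
  )
  fit <- aov(accuracy ~ model, data = ratings)  # one-way ANOVA across the three models
  summary(fit)  # F statistic and p-value; p < 0.05 was the significance threshold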

Results: ChatGPT scored highest in accuracy (mean ± SD: 4.36 ± 0.36) and relevance (4.64 ± 0.17), with statistically significant differences from Bing and Bard (ANOVA p < 0.001 for both). ChatGPT also led in response quality (4.10 ± 1.10), but this difference was not statistically significant (p = 0.064). Timeliness varied significantly among models, with ChatGPT responding fastest (19.4 ± 5.19 s), followed by Bing (23.5 ± 5.54 s) and Bard (28.0 ± 6.50 s). In the PCA, ChatGPT's responses clustered independently of the other two models, reflecting a distinct multidimensional performance profile.
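
A minimal sketch of the PCA clustering step follows, under the same caveats: per-question scores are simulated, ChatGPT means and SDs are seeded from the results above, and the Bing AI and Bard Likert values are assumed for illustration.

  # Hedged illustration of the PCA step; data simulated, not the study's.
  # Each row is one model's scores on one question across the four rated dimensions.
  set.seed(1)
  sim_likert <- function(m, s) pmin(5, pmax(1, rnorm(50, mean = m, sd = s)))
  scores <- data.frame(
    model     = rep(c("ChatGPT", "Bing AI", "Bard"), each = 50),
    accuracy  = c(sim_likert(4.36, 0.36), sim_likert(3.9, 0.5), sim_likert(3.7, 0.5)),
    relevance = c(sim_likert(4.64, 0.17), sim_likert(4.1, 0.3), sim_likert(4.0, 0.3)),
    quality   = c(sim_likert(4.10, 1.10), sim_likert(3.8, 1.0), sim_likert(3.6, 1.0)),
    latency   = c(rnorm(50, 19.4, 5.19), rnorm(50, 23.5, 5.54), rnorm(50, 28.0, 6.50))
  )
  pca <- prcomp(scores[, -1], center = TRUE, scale. = TRUE)  # standardize mixed units
  plot(pca$x[, 1:2], col = factor(scores$model), pch = 19,
       xlab = "PC1", ylab = "PC2")  # responses plotted in PC space, colored by model
  legend("topright", legend = levels(factor(scores$model)),
         col = seq_along(levels(factor(scores$model))), pch = 19)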

Conclusion: Among the three AI systems evaluated, ChatGPT answered clinical rheumatology questions most proficiently, with significantly better accuracy, relevance, and timeliness. These findings suggest that generative AI models may serve as a supplement to clinical judgment in rheumatology, particularly when rapid and accurate synthesis of complex data is essential.

Supporting image 1: Comparison of AI Model Performance on Clinical Rheumatology Questions Using Likert Scale Metrics.

Supporting image 2: Principal Component Analysis visualization of the AI models based on their multidimensional performance across accuracy, relevance, response quality, and timeliness.

Supporting image 3: Likert scale metrics are scored on a 1–5 scale.


Disclosures: A. Pitliya: None; H. Anam: None; R. Oletsky: None; A. Boc: None; D. Chaudhuri: None; R. Thirumaran: None.

To cite this abstract in AMA style:

Pitliya A, Anam H, Oletsky R, Boc A, Chaudhuri D, Thirumaran R. Performance Comparison of Artificial Intelligence tools ChatGPT, Bing AI, and Google Bard for Clinical Rheumatology Decision Support: When AI Talks Rheumatology [abstract]. Arthritis Rheumatol. 2025; 77 (suppl 9). https://acrabstracts.org/abstract/performance-comparison-of-artificial-intelligence-tools-chatgpt-bing-ai-and-google-bard-for-clinical-rheumatology-decision-support-when-ai-talks-rheumatology/.
