Session Information
Session Type: Poster Session A
Session Time: 10:30AM-12:30PM
Background/Purpose: Antiphospholipid syndrome (APS) is a complex autoimmune prothrombotic disorder that can present with venous or arterial thromboses, often masquerading as unprovoked pulmonary embolism (PE). The use of artificial intelligence (AI) to assist in diagnostic reasoning is expanding, but its precision in recognizing rare or overlapping disorders like APS remains underexplored.
Methods: We reviewed 958 PubMed entries related to pulmonary embolism and included only English-language case reports with full-text access, adequate clinical detail, and documented APS-related diagnostic testing. Fifty cases met the inclusion criteria. Each case was blinded to the final diagnosis and analyzed using GPT-4 (ChatGPT), which was instructed to list its top three differential diagnoses, assess the likelihood of APS, and indicate whether APS antibody testing was warranted. Final diagnoses were recorded separately for performance assessment.
Results: Among the 50 included cases, 7 had a confirmed diagnosis of APS. ChatGPT correctly identified APS in 4 of 7 cases (sensitivity 57%) and incorrectly suggested APS in 21 of 43 non-APS cases (specificity 51%). Overall diagnostic accuracy was 52%, with a positive predictive value (PPV) of 16%, a negative predictive value (NPV) of 88%, and an F1 score of 0.25. ChatGPT also correctly recommended APS antibody testing in all 7 APS cases and withheld testing in 19 of 43 non-APS cases, yielding a testing recommendation sensitivity of 100% and specificity of 44.2%. Additionally, in 73% of all cases, ChatGPT correctly included the final diagnosis (APS or otherwise) within its top three differential diagnoses, demonstrating consistent clinical reasoning across a wide range of thrombotic etiologies including hereditary thrombophilia, malignancy, infection, and anatomical anomalies. Performance by diagnosis category is summarized in Figure 1, and APS classification performance is illustrated in the confusion matrix (Figure 2).
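The reported metrics can be rechecked from the counts given in the Results. The following is a minimal sketch, not the authors' analysis code: the class and variable names are hypothetical, and the cell counts not stated directly (3 false negatives, 22 true negatives, 24 unnecessary testing recommendations) are derived by subtraction from the 7 APS and 43 non-APS cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 confusion matrix for a binary call (hypothetical helper)."""
    tp: int  # APS cases flagged
    fn: int  # APS cases missed
    fp: int  # non-APS cases flagged
    tn: int  # non-APS cases not flagged

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def f1(self) -> float:
        # Harmonic mean of PPV (precision) and sensitivity (recall).
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# APS suggested as a diagnosis: 4 of 7 APS cases, 21 of 43 non-APS cases.
aps_call = ConfusionMatrix(tp=4, fn=7 - 4, fp=21, tn=43 - 21)

# APS antibody testing recommended: all 7 APS cases; withheld in 19 of 43 non-APS cases.
testing_call = ConfusionMatrix(tp=7, fn=0, fp=43 - 19, tn=19)

if __name__ == "__main__":
    for name in ("sensitivity", "specificity", "accuracy", "ppv", "npv"):
        print(f"APS call {name}: {getattr(aps_call, name):.1%}")
    print(f"APS call F1: {aps_call.f1:.2f}")
    print(f"Testing sensitivity: {testing_call.sensitivity:.1%}")
    print(f"Testing specificity: {testing_call.specificity:.1%}")
```

Under these assumptions the computed values agree with the abstract after rounding: 57% sensitivity, 51% specificity, 52% accuracy, 16% PPV, 88% NPV, and an F1 of 0.25 for the APS call, with 100% sensitivity and 44.2% specificity for the testing recommendation.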
Conclusion: ChatGPT (GPT-4) demonstrates potential in supporting differential diagnosis generation in complex PE cases. Although it overcalled APS in roughly half of the non-APS cases (21 of 43), it consistently identified relevant thrombotic mechanisms and flagged every confirmed APS case for antibody testing. These findings support the growing role of artificial intelligence in complex diagnostic workflows and highlight the importance of structured prompt design in enhancing model accuracy and clinical relevance.
Figure 1. Bar chart showing the percentage of cases in which ChatGPT correctly included the final diagnosis category within its top three differentials. Categories include APS, malignancy, hereditary thrombophilia, anatomical anomalies, COVID-related thrombosis, autoimmune (non-APS), and other causes.
Figure 2. Heatmap illustrating true positives, false positives, true negatives, and false negatives based on whether ChatGPT suggested APS in each case. The matrix is based on 50 PE case reports, 7 of which were confirmed to be APS.
To cite this abstract in AMA style:
Rabah S, Kang X. Evaluating Artificial Intelligence for Diagnosing Antiphospholipid Syndrome in Pulmonary Embolism Case Reports: A Prompt-Based Analysis [abstract]. Arthritis Rheumatol. 2025; 77 (suppl 9). https://acrabstracts.org/abstract/evaluating-artificial-intelligence-for-diagnosing-antiphospholipid-syndrome-in-pulmonary-embolism-case-reports-a-prompt-based-analysis/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/evaluating-artificial-intelligence-for-diagnosing-antiphospholipid-syndrome-in-pulmonary-embolism-case-reports-a-prompt-based-analysis/