ACR Meeting Abstracts

Abstract Number: 1973

Leveraging Publicly Available Gene Expression Data and Applying Machine Learning to Identify Novel Biomarkers for Rheumatoid Arthritis

Dmitry Rychkov1, Marina Sirota2 and Cindy Lin3, 1Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, 2Pediatrics, Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, 3Stanford, Stanford, CA

Meeting: 2018 ACR/ARHP Annual Meeting

Keywords: Biomarkers, data analysis and synovium, Gene Expression

Session Information

Date: Tuesday, October 23, 2018

Title: Genetics, Genomics and Proteomics Poster

Session Type: ACR Poster Session C

Session Time: 9:00AM-11:00AM

Background/Purpose: Diagnosing rheumatoid arthritis (RA) and monitoring its progression are challenging and require a combination of imaging techniques and blood tests. There is currently no biochemical test for detecting early-stage disease. In this study, we aimed to define a Rheumatoid Arthritis meta-profile and identify biomarkers by leveraging publicly available gene expression data with machine learning approaches.

Methods: We carried out a comprehensive search of the NCBI GEO database for publicly available microarray data from whole blood and synovial tissue in Rheumatoid Arthritis and healthy controls. For the synovium, we collected 13 datasets with 312 biopsy samples: 276 RA samples and 36 healthy tissue biopsies. For whole blood, we collected 11 datasets with 2,153 samples: 1,394 RA and 759 healthy controls. We computed differential expression using the Significance Analysis of Microarrays (SAM) approach. We applied cutoffs of false discovery rate (FDR) < 0.05 and absolute fold change, abs(FC), > 1.2 to identify significantly differentially expressed genes. For pathway analysis, we used the gene list enrichment analysis tool ToppGene.
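The significance cutoff described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `GeneResult` record, the function names, and the toy values are hypothetical, and the code assumes that FC is reported as a signed fold change (positive for up-regulation, negative for down-regulation), which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneResult:
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    fold_change: float  # assumed signed: +1.5 means up, -1.5 means down
    fdr: float          # false discovery rate for this gene


def significant_genes(results, fdr_cutoff=0.05, fc_cutoff=1.2):
    """Keep genes with FDR < fdr_cutoff and abs(FC) > fc_cutoff."""
    return [
        r for r in results
        if r.fdr < fdr_cutoff and abs(r.fold_change) > fc_cutoff
    ]


def split_by_direction(hits):
    """Split significant genes into up- and down-regulated gene names."""
    up = [r.gene for r in hits if r.fold_change > 0]
    down = [r.gene for r in hits if r.fold_change < 0]
    return up, down


# Invented toy values, for illustration only.
toy = [
    GeneResult("GENE_A", 1.8, 0.01),    # passes both cutoffs, up
    GeneResult("GENE_B", -1.5, 0.03),   # passes both cutoffs, down
    GeneResult("GENE_C", 1.1, 0.001),   # fails the fold-change cutoff
    GeneResult("GENE_D", 2.0, 0.20),    # fails the FDR cutoff
]

hits = significant_genes(toy)
up, down = split_by_direction(hits)
print("up:", up, "down:", down)
```

Both conditions must hold at once, so a gene with a very small FDR but a negligible fold change (GENE_C above) is excluded.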

Results: We identified 882 genes that were significantly differentially expressed in the synovium between RA patients and healthy controls: 502 up-regulated and 380 down-regulated. We confirmed that regulation of immune system process and response, and cell activation and aggregation in both innate and adaptive immune system pathways, are involved in RA. In whole blood, we identified 339 significantly differentially expressed genes: 166 up-regulated and 173 down-regulated. To determine RA biomarkers, we applied a machine learning feature selection procedure to the sets of significant genes for both tissues. First, we filtered out genes that cumulatively contributed less than 5% of the biological variance. Then we applied Variable Selection Using Random Forests (VSURF) to the remaining genes. Next, we performed a hypergeometric test and found 12 common genes (p = 0.001), among them 3 common up-regulated genes, Antigen Peptide Transporter 1 (TAP1), Matrix Metallopeptidase 9 (MMP9), and DNA Damage Regulated Autophagy Modulator 1 (DRAM1), and 2 common down-regulated genes, DDX3Y and MYC. Finally, we built a Random Forest classification model on the synovium data with these 5 genes. We used 5-fold cross-validation with 10 repeats and Cohen’s Kappa statistic as the performance metric. We obtained a Kappa of 0.61 with sensitivity 0.86 and specificity 0.9 on the testing set. In the final step, we validated the prediction model on the whole blood data, resulting in a Kappa of 0.57 with sensitivity 0.54 and specificity 0.98.
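Two of the statistics reported in the Results can be sketched in plain Python under stated assumptions. The first function computes a one-sided hypergeometric tail probability for the overlap between two gene lists; the abstract does not give the gene universe size or the list sizes used, so the arguments here are placeholders. The second computes sensitivity, specificity, and Cohen's Kappa from binary confusion-matrix counts. The function names and all toy numbers are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn from `universe` genes,
    of which size_a belong to the first list (one-sided tail)."""
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, i) * comb(universe - size_a, size_b - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def binary_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and Cohen's Kappa from confusion counts."""
    total = tp + fp + fn + tn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Invented toy inputs, for illustration only.
p_value = hypergeom_overlap_pvalue(universe=10, size_a=5, size_b=5, overlap=5)
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(p_value, metrics)
```

Kappa corrects raw accuracy for agreement expected by chance, which matters when classes are imbalanced, as with the 276 RA and 36 healthy synovium samples described in the Methods.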

Conclusion: Our computational analysis of public data allowed us to perform a comprehensive in silico search for biomarkers in Rheumatoid Arthritis. We found three protein-coding genes that have the strongest association with RA. Identifying extensive protein secretion in blood could allow precision phenotyping even at early stages of the disease, which could have a positive impact on monitoring disease progression and patient treatment.


Disclosure: D. Rychkov, None; M. Sirota, None; C. Lin, None.

To cite this abstract in AMA style:

Rychkov D, Sirota M, Lin C. Leveraging Publicly Available Gene Expression Data and Applying Machine Learning to Identify Novel Biomarkers for Rheumatoid Arthritis [abstract]. Arthritis Rheumatol. 2018; 70 (suppl 9). https://acrabstracts.org/abstract/leveraging-publicly-available-gene-expression-data-and-applying-machine-learning-to-identify-novel-biomarkers-for-rheumatoid-arthritis/. Accessed .


© Copyright 2025 American College of Rheumatology