Session Type: Poster Session (Monday)
Session Time: 9:00AM-11:00AM
Background/Purpose: Electronic health records (EHRs) from large health care systems provide rich, comprehensive patient-specific information drawn from many sources and comprising heterogeneous data types. The distinctive features and challenges of EHR data, including missing information and non-linear interactions, require novel statistical approaches to analysis. Bayesian Networks (BNs) are a relatively new method of representing uncertain relationships among variables. Here we explore their ability to learn the structure of musculoskeletal (MSK) symptoms related to the development of psoriatic arthritis (PsA) in people with psoriasis.
Methods: Incident cases of psoriasis were identified between 1998 and 2015 from the UK Clinical Practice Research Datalink (CPRD). MSK symptoms occurring during the study period were identified from medcodes and reviewed by a physician to eliminate redundant MSK symptoms. Baseline data on gender, age, body mass index (BMI), psoriasis severity, alcohol use and smoking status were also extracted. The BN structure was built using a combination of expert knowledge and data-oriented modelling, and several methods were compared to obtain the structure that best described the relationships between the variables. Bayesian inference was used to compute the posterior distribution of the network weights, which quantify the strength of these relationships. The BN model was evaluated using well-established performance metrics.
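As an illustration of what computing a posterior for a single network weight can look like, the following minimal Python sketch updates one binary conditional probability (for example, the probability of an outcome given one configuration of its parent nodes) from counts. It assumes a conjugate Beta prior, which the abstract does not specify. The function names and the counts are hypothetical and are not taken from the study.

```python
from fractions import Fraction


def beta_posterior(successes, failures, prior_a=1, prior_b=1):
    """Return the (a, b) parameters of the Beta posterior for one binary
    conditional probability, given observed counts and a Beta(prior_a, prior_b)
    prior. Beta(1, 1) is a uniform prior (an assumption, not the study's choice).
    """
    return prior_a + successes, prior_b + failures


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution, returned as an exact fraction."""
    return Fraction(a, a + b)


# Hypothetical counts, for illustration only (not from the study):
# of 40 patients sharing one parent configuration, 10 had the outcome.
a, b = beta_posterior(successes=10, failures=30)
print(a, b)                  # 11 31
print(posterior_mean(a, b))  # 11/42
```

With a uniform prior the posterior mean (11/42) sits close to the raw proportion (10/40), and the prior matters less as the counts grow.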
Results: Over one million MSK symptoms were extracted for the 90,189 incident cases of psoriasis identified, of whom 1,409 developed PsA. The symptoms comprised 379 unique medcodes, each of which was grouped into one of six categories (pain, deformity, inflammation, stiffness, swelling or fatigue). The graphical representation of the BN structure in Figure 1 shows widespread probabilistic associations between the 12 variables included in the modelling. Nine were identified as direct predecessors of PsA. The remaining three variables did not influence PsA directly, but did influence it through their respective child nodes. For example, age and fatigue influence PsA through their common child node, swelling. Our BN was 81% accurate in predicting the development of PsA in a validation set. The AUC for this predictive model was 0.85 (95% confidence interval (CI): 0.83-0.87), translating into 81% sensitivity and 89% specificity.
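The idea of a variable influencing PsA only through a child node can be sketched with a simplified three-node chain, fatigue -> swelling -> PsA. The Python sketch below computes P(PsA | fatigue) by summing over the unobserved swelling node. It leaves out age and all other variables, and every probability in it is a hypothetical placeholder chosen for easy arithmetic, not an estimate from the study.

```python
from fractions import Fraction

# Hypothetical conditional probability tables (illustrative values only).
# P(swelling = yes | fatigue)
P_SWELLING_GIVEN_FATIGUE = {True: Fraction(1, 2), False: Fraction(1, 4)}
# P(PsA = yes | swelling)
P_PSA_GIVEN_SWELLING = {True: Fraction(1, 10), False: Fraction(1, 50)}


def p_psa_given_fatigue(fatigue):
    """P(PsA = yes | fatigue), marginalising over the unobserved swelling node.

    Fatigue has no direct edge to PsA in this toy chain; its effect is carried
    entirely by how it shifts the probability of swelling.
    """
    p_swelling = P_SWELLING_GIVEN_FATIGUE[fatigue]
    return (p_swelling * P_PSA_GIVEN_SWELLING[True]
            + (1 - p_swelling) * P_PSA_GIVEN_SWELLING[False])


print(p_psa_given_fatigue(True))   # 3/50
print(p_psa_given_fatigue(False))  # 1/25
```

In this toy chain, once swelling is observed the PsA probability is read directly from the second table and fatigue adds nothing further. The same summation is also how a BN can still produce a prediction when a variable such as swelling is missing from a patient's record.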
Conclusion: The presented BN model considers the demographics and MSK symptoms of people with psoriasis and can be used to predict the development of PsA with reasonable accuracy. It provides useful information to clinicians, such as the probabilistic relationships among the variables of interest associated with an increased risk of developing PsA. In addition to offering both modelling flexibility and statistical validity, our technique handles missing data and offers the opportunity to combine findings from the medical literature with clinical judgement to shape the model. Important future developments of the BN model would be to a) broaden the MSK symptom categories and b) extend the variable set by including additional EHR data such as tests, prescriptions and referrals.
To cite this abstract in AMA style: Green A, Smith T, McHugh N. Learning the Relationships Between Psoriatic Arthritis and a Patient’s History of Musculoskeletal Symptoms from Electronic Health Records Using Bayesian Networks [abstract]. Arthritis Rheumatol. 2019; 71 (suppl 10). https://acrabstracts.org/abstract/learning-the-relationships-between-psoriatic-arthritis-and-a-patients-history-of-musculoskeletal-symptoms-from-electronic-health-records-using-bayesian-networks/. Accessed February 3, 2023.
« Back to 2019 ACR/ARP Annual Meeting