Learning and Summarization of Complex and Large Datasets with Graphical Models: An Application in Healthcare Data Analytics
Abstract
Over the past few decades, the rise of multiple chronic conditions (MCC) has become a major concern for clinicians. More than a quarter of all Americans, and two-thirds of older Americans, are estimated to have MCC. The emergence of MCC complicates the design, delivery, and cost of patient care, and it has been identified as one of the major healthcare challenges of recent times. Yet it is still not known precisely how MCC emerge in patients. Most of the literature on chronic conditions focuses on individual diseases such as diabetes, traumatic brain injury, substance abuse, cardiovascular disease, and chronic kidney conditions, even though the same literature makes it evident that most patients suffering from one chronic condition will eventually develop others (i.e., MCC). Additionally, modifiable lifestyle behaviors such as diet, exercise, and alcohol use, and socio-demographic factors such as age group, gender, and education level, also contribute to the development and progression of MCC. What is not precisely known is the dynamic effect of an individual's prior conditions, risk factors, and lifestyle changes on the trajectory of MCC emergence. With the advancement of medical technology, the amount of data collected from different electronic medical record systems is increasing. As a result, the graphs that represent interactions among diseases are becoming larger and more complex. An efficient way to analyze and manage MCC, one that supports real-time decision making in a big data setting, is therefore needed. In this work, we propose several algorithms that together form an end-to-end approach to (i) capture the patterns of MCC development and progression; (ii) analyze the impact of socio-demographic and modifiable risk factors on MCC; (iii) detect early the new chronic conditions that arise from these risk factors; and (iv) suggest probable intervention strategies to control, and possibly prevent, the emergence of new conditions.
We validated the proposed models on several deidentified patient datasets. These data were collected from Veterans Affairs, the Center on Smart and Connected Health Technologies at UT Health San Antonio, and the Cameron County Hispanic Cohort (CCHC).