"A Longitudinal Multi-omics Examination of β-Amyloid Deposition and Cognitive Decline in the Years Prior to Alzheimer’s Disease" (BD); "High-Throughput Machine Learning from Electronic Health Records" (RK)

Tuesday, January 31, 2017, 4:00pm to 5:00pm
Room 1360 Biotechnology Center, 425 Henry Mall

Speaker Name: 

Burcu Darst and Ross Kleiman (Population Health Sciences [BD]; Computer Sciences [RK])

Speaker Institution: 

University of Wisconsin-Madison




Abstract (B. Darst):
Although Alzheimer’s disease (AD) is highly heritable, few genetic or environmental factors have been identified that are associated with the disease, or with β-amyloid (Aβ) deposition or cognitive decline. The latter two characteristics show notable changes in the years prior to AD diagnosis, and understanding the underlying biological mechanisms that lead to these changes is critical to better preventing, diagnosing, and treating the disease. A longitudinal multi-omics examination integrating genomic and metabolomic data, the latter of which is influenced by both biological and environmental factors, could allow for more thorough and comprehensive modeling of Aβ deposition and cognitive decline. This talk will focus on the integration of genomic and metabolomic data in a substudy of the Wisconsin Registry for Alzheimer’s Prevention, as well as methods proposed to perform these integrated analyses.

Abstract (R. Kleiman):
The use of Electronic Health Record (EHR) systems has increased dramatically in recent years. This vast digitization of medical data allows for new ways to predict diseases that were not possible with paper charts. While prior work has focused on predicting individual diseases, our research builds thousands of models to predict nearly every diagnosis (ICD-9 code) a patient could receive. This high-throughput machine learning approach yields inference on the health landscape of both individual patients and patient populations. Integral to our approach is the use of a dynamic control matching scheme that, for each diagnosis, automatically selects appropriate case and control patients using minimal hand tuning. Across the nearly 4,000 models, we observe a mean AUC of 0.803 predicting 1 month prior to diagnosis, and a mean AUC of 0.758 predicting 6 months prior to diagnosis. We additionally explore constructing models to predict diseases 2, 5, 10, 15, and 20 years in advance. Furthermore, we break down our results across 15 major disease categories, including pregnancy complications and diseases of the circulatory system. This work opens a potential pathway to pan-diagnostic decision support. Instead of only targeting a small number of well-understood diseases, this research shows machine learning techniques can be used to help predict the broad spectrum of diagnoses a patient may receive.
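
For readers unfamiliar with the summary statistic quoted in this abstract, the following is a minimal sketch of how a mean AUC across many per-diagnosis models could be computed. It is an illustration only, not the speakers' implementation: the function names (`auc`, `mean_auc`), the placeholder diagnosis keys, and the toy labels and scores are all assumptions made for this example, and the control matching and model training steps are not shown.

```python
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: the fraction of (case, control) pairs in which the
    case receives the higher score. Tied pairs count as half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def mean_auc(per_diagnosis):
    """Average the AUC over all diagnoses.

    per_diagnosis maps a diagnosis identifier to a (labels, scores) pair,
    where labels are 1 for case patients and 0 for control patients.
    """
    return mean(auc(y, s) for y, s in per_diagnosis.values())


# Hypothetical toy data: two placeholder diagnoses, four patients each.
toy = {
    "code_A": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "code_B": ([1, 0, 0, 1], [0.8, 0.3, 0.2, 0.7]),
}

if __name__ == "__main__":
    for code, (y, s) in toy.items():
        print(code, auc(y, s))
    print("mean AUC:", mean_auc(toy))
```

The pairwise loop is quadratic in the number of patients and is used here only for readability; at the scale of thousands of models, a sort-based computation would be the more practical choice.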