PhD Proposal: Genomic Data Analysis Using Secure Multi-Party Computation

Talk
Justin Wagner
Time: 11.10.2015 15:00 to 16:30
Location: CBCB 3118

Statistical models that operate on patient biomarker data are a centerpiece of medical research. Currently deployed methods predict or identify the underlying causes of disease using a patient's physical attributes, family history, and genomic data. The power of these models depends on the quality and quantity of the data used for training. As demonstrated by recent work, many types of genomic data pose privacy concerns for patients. Designing and implementing systems to support secure sharing of biomarker data for model development and refinement is critical to realizing the goals of precision medicine.
To maximize the utility of genomic data while addressing privacy concerns, I propose using secure multi-party computation to evaluate functions over shared datasets. I will use garbled circuits to implement specific functionality, including microbiome comparative analysis, normalization and clustering of sequence data, and management of patient consent together with researcher auditing. I will benchmark these implementations on datasets consisting of human DNA sequences and human microbiome sequence data. Finally, I will release these analysis tools as open-source code for use by the research community.
Examining Committee:
Committee Chair: Dr. Hector Corrada Bravo
Dept's Representative: Dr. Amitabh Varshney
Committee Members: Dr. Jonathan Katz, Dr. Mihai Pop