PhD Proposal: Multi-omics Approaches to Characterize Heterogeneity in Gene Regulation across Tissues and Complex Diseases

Talk
Mahfuza Sharmin
Time: 12.04.2015 14:00 to 15:30
Location: CBCB 3118

Within multicellular organisms, hundreds of morphologically distinct cell types are highly organized to perform specific functions. Tissue-specific morphology and functions are largely mediated by tissue-specific transcriptional programs. The spatio-temporal regulation of gene transcription, in turn, is a highly complex and organized process, mediated mainly by transcription factor proteins (whose context-specific activity is itself regulated), as well as by the epigenomic context, such as DNA methylation and various chemical modifications of the histones that make up the nucleosome particles around which the DNA is wrapped in the nucleus.
A mechanistic understanding of transcriptional programs is foundational to our understanding of, for instance, what makes the heart a heart and what makes the brain a brain during development. Such understanding is also of tremendous importance for comprehending various disease mechanisms, e.g. uncontrolled cell division and loss of apoptosis in cancer, or abnormal immune cell responses leading to autoimmune diseases.
Recent advances in sequencing technology have yielded, and continue to yield, unprecedented amounts of genomic, transcriptomic, and epigenomic data. Beyond their sheer size, the challenges in fully harnessing these data include the variety of data sources, experimental conditions, and experimental techniques. For example, the NIH Roadmap Epigenomics Mapping Consortium provides measurements of various epigenomic modifications across different primary tissues; the Encyclopedia of DNA Elements (ENCODE) includes measurements, in cell lines, of various functional elements, including regulatory elements and elements acting at the protein and RNA levels; and The Cancer Genome Atlas (TCGA) catalogues estimates of genetic changes such as somatic mutations and copy number variation, as well as epigenomic changes such as methylation, across various cancer types and matched normal cells. Many of these data sets are the same measurement taken under multiple conditions (tissue or time in a cell's history), and can be thought of as 'snapshots' of the same cellular process from different angles. Integrating these 'snapshots' from multiple contexts is likely to provide clues about the underlying mechanism and its plausibly heterogeneous nature.
In my thesis, I propose to decipher the heterogeneous nature of regulatory patterns in the context of both multiple tissues and complex diseases. To this end, I introduce a novel framework that detects heterogeneity by mapping the data into multiple classification tasks and clustering the multidimensional data generated by those tasks. By applying this framework, we have been able, for the first time, to identify both tissue-specific and tissue-independent models governing DNA-protein interactions. Based on the initial success of this method in biological validation, we propose to extend the framework to the disease context, specifically to study how heterogeneity in DNA-protein interaction/regulation and in CpG island (CGI) methylation influences different types of cancer, and how gene expression heterogeneity across tissues shapes the nature of a complex disease.
Examining Committee:
Committee Chair: Dr. Sridhar Hannenhalli
Co-Chair: Dr. Hector Corrada Bravo
Dept's Rep.: Dr. Hal Daume III