PhD Proposal: Algorithms for efficient pan-genome representation of a large number of whole genomes for population genomics

Kiran Javkar
01.15.2021 13:00 to 15:00


Rapid improvements in sequencing technologies and their decreasing costs have revolutionized the field of genomics and furthered our understanding of the microbiome. The ability to compare the entire genomes of a large number of isolates gave rise to the field of population genomics, where the emphasis is on large-scale genomic analyses. Such analyses have been instrumental in discovering the genomic drivers of important microbial characteristics, including antimicrobial resistance and pathogenicity. As the availability of whole genome sequence data continues to increase, the bottleneck has shifted to its analysis, which requires the development of better tools. In my work, I concentrate on the challenges of population genomic analyses of whole genomes and on solutions to them. One of the most common approaches in population genomics involves a ‘pan-genome’ representation, which describes the union of the sequence entities of the given genomes. The existing alternatives for this pan-genome representation do not cater well to the needs of population genomic analyses, either because of scalability issues or because of the magnitude of the genomic features to be evaluated. I present my tool, PRAWNS, which generates an efficient pan-genome of the given closely related genomes by deploying a novel approach for extracting a concise list of genomic features, or sequence entities of interest, taking into account their utility for biological applications. I demonstrate the scalability of PRAWNS on a large number of genomes with limited computing resources and the usability of its output for downstream assessments. Lastly, I describe the future directions for my work, which include extensions to my current work as well as potential solutions to support population genomics in metagenomics.
With the increasing availability of sequenced genomes and improvements in sequencing and assembly approaches, these algorithms and tools would boost biological discoveries.

Examining Committee:

Chair: Dr. Mihai Pop
Dept rep: Dr. John Dickerson
Members: Dr. Hugh Rand, Dr. Rob Patro