Algorithms for genome and metagenome assembly using long reads

Mikhail Kolmogorov
Talk Series: 
02.03.2022 14:00 to 15:00

IRB 4105

Also on Zoom

Long-read sequencing technologies have substantially improved our ability to study large and complex genomes. However, de novo assembly of complex genomic and metagenomic datasets remains difficult. In this talk, I will give an algorithmic overview of the genome assembly problem. I will also highlight our Flye assembler, which uses repeat graphs to generate accurate and complete assemblies. Finally, I will present our new metagenomic assembler, metaFlye, which addresses important challenges in long-read metagenomic assembly, such as uneven bacterial composition and intra-species heterogeneity. Using metaFlye, we were able to recover complete or nearly complete bacterial genomes from complex environmental samples, such as the human gut and cow rumen. We also showed that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.