Instructor: Dr. Zia Khan, (my first name)(at symbol)cs.umd.edu
Day and Time:
Tuesday 3:30pm-4:45pm
Thursday 3:30pm-4:45pm
Location: Computer Science Instructional Center (CSIC) Room 3120
Instructor Office Hours: By appointment.
Teaching Assistant: Josh Bradley, jgbrad1(at)cs.umd.edu
TA Office Hours: TBD
CMSC702 Piazza Page
Major advances in technology for genomic studies are bringing the prospect of personalized and individualized medicine closer to reality. Many of these advances are predicated on the ability to generate DNA sequencing data at an unprecedented rate, posing a significant need for computational data analysis that is clinically and biologically useful and robust. This course will concentrate on the fundamental computational and statistical methods required to meet this need. It will cover topics in functional genomics, population genetics and epigenetics. Machine learning methods will be a core component of this class. No prior knowledge of biology is required. Image source: Nature 518, 337–343 (19 February 2015).
The grade distribution is as follows:
10% class participation and attendance
35% reading quizzes
35% final project
20% take-home final exam
Grades will be available on grades.cs.
Please read the UMDCP attendance policy. Please do email me if you find yourself falling behind in class. I’m here to help.
Maintaining your reputation is critical in science. Please read the UMDCP academic integrity policy.
I encourage you to provide me with input and feedback during the semester. At the end of the semester, you'll be asked to provide formal evaluations at www.courseevalum.umd.edu. Your evaluations will be used to improve the class in following years.
The schedule will be updated frequently over the semester. Please check back regularly. TBD designates "to be determined." Note that some of the readings are quite challenging, and the lectures are designed to help. My expectation is that you've taken a first shot at understanding the computational concepts underlying the reading material.
Date | Lecture Topic + Slides | Associated Reference Material | Quiz |
---|---|---|---|
Week 1 | | | |
1/26 | CAMPUS CLOSED DUE TO SNOW | | |
1/28 | L1: Intro to DNA Slides/Notes; Illumina Sequencing Video; Single Cell Sequencing Video; Flow Cell Video; PacBio SMRT Sequencing Video; Nanopore Sequencing Video | Wikipedia: DNA; Cell Snapshot: HTSeq Applications; Illumina Intro | L1, Survey |
Week 2 | | | |
2/2 | L2: The Genomic Data Deluge Slides/Notes | doi:10.1371/journal.pbio.1002195 | L2, Big Data Qs |
2/4 | L3: Genome Assembly Using Minimap and miniasm Slides/Notes | arXiv:1512.01801 | L3 Quiz |
Week 3 | | | |
2/9 | L4: BWA Alignment + SAM Format Slides/Notes | doi:10.1093/bioinformatics/btp324; SAMv1 Spec | L4 Quiz |
2/11 | L5: CRAM data compression Slides/Notes; DEADLINE TO FORM GROUPS OF 1-3 | doi:10.1101/gr.114819.110; CRAMv3 Spec | L5 Quiz |
Week 4 | | | |
2/16 | L6: Bayesian Variant Detection + VCF Format Slides/Notes | FreeBayes Draft Paper; VCFv4.2 Spec | L6 Quiz |
2/18 | L7: Short Tandem Repeats Slides/Notes | doi:10.1101/gr.135780.111 | L7 Quiz |
Week 5 | | | |
2/23 | NO CLASS: INSTRUCTOR AT MEETING | | |
2/25 | L8: Structural Variation Slides/Notes | doi:10.1093/bioinformatics/bts378; doi:10.1101/gr.114876.110 | L8 Quiz |
Week 6 | | | |
3/1 | L9: Genotype Query Slides/Notes | doi:10.1038/nmeth.3654 | L9 Quiz |
3/3 | L10: PCA and Population Structure Slides/Notes; DEADLINE TO SELECT PROJECT | PCA Tutorial: arXiv:1404.1100; FastPCA: bioRxiv:10.1101/018143 | L10 Quiz |
Week 7 | | | |
3/8 | L11: Statistical Phasing Slides/Notes | doi:10.1038/nmeth.2307 | L11 Quiz |
3/10 | L12: GWAS + Linear Mixed Models Slides/Notes | doi:10.1038/nrg1916; doi:10.1371/journal.pcbi.1002822; kbroman: LMMs and R/lmmlite | L12 Quiz |
Week 8 | | | |
3/15 | NO CLASS: SPRING BREAK | | |
3/17 | NO CLASS: SPRING BREAK | | |
Week 9 | | | |
3/22 | L13: Spliced RNA-seq Alignment Slides/Notes | doi:10.1038/nmeth.3317; doi:10.1093/bioinformatics/bts635 | L13 Quiz |
3/24 | L14: RNA-seq analysis using k-mer counting and EM Slides/Notes | doi:10.1038/nbt.2862 | L14 Quiz |
Week 10 | | | |
3/29 | L15: de novo Transcript Assembly Slides/Notes | doi:10.1038/nbt.1883; bioRxiv:10.1101/021626 | L15 Quiz |
3/31 | L16: RNA-seq batch effects Slides/Notes | doi:10.1371/journal.pgen.0030161; doi:10.1093/nar/gku864 | L16 Quiz |
Week 11 | | | |
4/5 | L17: Predicting Alternative Splicing using DNNs Slides/Notes | doi:10.1093/bioinformatics/btu277 | L17 Quiz |
4/7 | L18: DeepBind Slides/Notes | doi:10.1038/nbt.3300 | L18 Quiz |
Week 12 | | | |
4/12 | L19: DNAse and Deep Learning Slides/Notes | bioRxiv:10.1101/028399 | L19 Quiz |
4/14 | L20: Non-coding Variants + Deep Learning Slides/Notes | doi:10.1038/nmeth.3547 | L20 Quiz |
Week 13 | | | |
4/19 | L21: Chromatin Segmentation Slides/Notes | doi:10.1038/nmeth.1937 | L21 Quiz |
4/21 | L22: Chromatin Imputation Slides/Notes | doi:10.1038/nbt.3157 | L22 Quiz |
Week 14 | | | |
4/26 | L23: Enhancer Promoter Interactions Slides/Notes | doi:10.1038/ng.3539 | L23 Quiz |
4/28 | L24: Single Cell Sequencing Clustering and Pseudo-Time Slides/Notes | t-SNE; doi:10.1038/nbt.2859 | L24 Quiz |
Week 15 | | | |
5/3 | L25: Course Summary Slides/Notes | | |
5/5 | NO CLASS: ZIA OUT OF TOWN | | |
Week 16 | | | |
5/10 | L27: Project Presentation Day | | |
5/15 | LAST DAY TO TURN IN FINAL PROJECT REPORT/CODE | | |
5/18 | LAST DAY TO TURN IN TAKE-HOME FINAL | | |