CMSC498Y meets on Mondays and Wednesdays.
The course schedule will be updated here, along with links to materials and assignments.
Slides will be posted to ELMS before the lecture and then linked here after the lecture.
If a lecture is recorded, the video will typically be available under the Zoom tab on ELMS. The overall schedule will be similar to spring 2024, with some improvements to pacing and content. There will also be two in-class midterms this year instead of one.
Readings come from one of three textbooks:
- BSA - Biological Sequence Analysis textbook by Durbin, Eddy, Krogh, and Mitchison.
- DL - Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
- ISB - Introduction to Structural Bioinformatics edited by K. Anton Feenstra and Sanne Abeln
Week | Day | Date | Module | Topic | Materials | Assigned Reading |
---|---|---|---|---|---|---|
1 | Mon | Jan 27 | Welcome! |
Course Overview, Policies, and Background Biology
[1_course_overview_and_biology.pdf] |
None | |
Wed | Jan 29 | A | Random Sequence Model |
Random Sequence Model and Statistics Review (e.g., models, data, likelihood, maximum likelihood estimation, KL divergence)
[2_random_sequence_model_and_stats_review.pdf] Assignment #1 released. |
BSA Sections 1.3, 11.1 (through multinomial), 11.2 (relative entropy only), 11.3 (ML only), 11.5 (ML only), [bonus_probability_review.pdf] (optional) | |
2 | Mon | Feb 3 | A | Markov Models |
Bayes Classifier, Markov Models, Hidden Markov Models (definition only)
[3_markov_models.pdf] |
BSA Sections 3.1, 3.2 (through formal definition) |
Wed | Feb 5 | A | No class |
Free period to work on Assignment #1
Add/drop deadline in two days (Fri Feb 7) |
None | |
3 | Mon | Feb 10 | A | Decoding HMMs |
Viterbi algorithm, Posterior decoding (Forward algorithm, Backward algorithm)
[4_decoding_hmms.pdf] |
BSA Section 3.2 |
Wed | Feb 12 | B | MSAs |
Multiple Sequence Alignment (MSA) definition, Sum-of-Pairs Error, Edit distance, Sum-of-Pairs alignment, Star Alignment Heuristic
[5_msa.pdf] |
None | |
4 | Mon | Feb 17 | B | Profile HMMs |
Unadjusted Sequence Profile, Profile HMMs, Supervised Training given MSA, Decoding with Viterbi algorithm
[6_profile_hmm.pdf] Assignment #2 released. |
BSA Chapter 5 (through 5.4) |
Wed | Feb 19 | B | Training HMMs |
Supervised training, Pseudocounts, Unsupervised training, Viterbi training, Baum-Welch (Expectation-Maximization)
[7_hmm_training.pdf] |
BSA Section 3.3 | |
5 | Mon | Feb 24 | B | Lab |
Profile HMM Lab - Adversarial example of failure when using Viterbi to align query sequence to MSA through profile HMM
[8_profile_hmm_lab.pdf] [profile-hmm-lab.zip] |
Lab Manual PDF |
Wed | Feb 26 | A/B | Midterm #1 Review |
Discussion of course material on midterm exam #1
IMPORTANT: No Zoom recording available [midterm1_study_guide.pdf] [midterm1_equation_sheet.pdf] |
Study Guide PDF | |
6 | Mon | Mar 3 | A/B | Midterm Exam #1 | Midterm exam in-class covering modules A and B | |
Wed | Mar 5 | C | RNA Secondary Structure |
RNA secondary structure definition, evolutionary constraints, information degeneracy, pseudoknots, input/output to prediction problem, accuracy calculations
[9_rna_secondary_structure.pdf] |
BSA Chapter 10 (through 10.2) | |
7 | Mon | Mar 10 | C | Grammars |
Grammars, Moore vs. Mealy Machines, Context-Free Grammars (CFGs), Stochastic CFGs
[10_scfgs.pdf] |
BSA Chapter 9 (skip Section 9.4) |
Wed | Mar 12 | C | Optimization |
Maximum Base Pairs (Nussinov's Algorithm) and Minimum Energy
[11_rna_opt.pdf] |
BSA Section 10.2 through first sub-section on Energy minimization (skip SCFG sub-section) | |
8 | Mon | Mar 17 | No class | Spring break | ||
Wed | Mar 19 | No class | Spring break | |||
9 | Mon | Mar 24 | C | UFold |
UFold Input / Output, Feature Construction, Output Postprocessing
[12_ufold.pdf] Assignment #3 released. |
(Optional) UFold paper, CDPfold paper, E2Efold paper |
Wed | Mar 26 | C | Lab |
UFold Lab, part 1: Software installation and basic usage
[ufold-lab] |
Lab Manual PDF | |
10 | Mon | Mar 31 | C | Neural Networks |
Feedforward neural networks, Cost functions (e.g. cross-entropy), Output units and activation functions (sigmoid, softmax, linear, ReLU), Optimization methods
[13_feedforward_neural_networks.pdf] |
DL Chapter 6 |
Wed | Apr 2 | C | UNet |
UNet Encoder / Contraction Path (e.g., Convolution and Max Pool), UNet Decoder / Expansion Path (e.g., Convolution and Upsampling)
[14_unet.pdf] |
DL Chapter 9 | |
11 | Mon | Apr 7 | C | Lab |
UFold Lab, part 2: Data curation, training, and ablation studies
[15_ufold_lab.pdf] |
Lab Manual PDF |
Wed | Apr 9 | C | Midterm #2 Review |
Discussion of course material on midterm exam #2
IMPORTANT: No Zoom recording available [midterm2_study_guide.pdf] Drop with a W deadline in two days (Fri Apr 11) |
Study Guide PDF | |
12 | Mon | Apr 14 | D | Intro to Protein Structures |
Amino acids, backbone, side chain, alpha-helix, beta-strand, beta-sheet, phi/psi angles, Ramachandran Principle
[16_protein_structure.pdf] |
ISB Chapter 1 |
Wed | Apr 16 | C | Midterm Exam #2 | Midterm exam in-class covering module C | ||
13 | Mon | Apr 21 | D | Protein Secondary Structure Prediction |
Multi-class classification, micro-average, macro-average, class imbalance, segment of overlap (SOV) score, PHD method (profile + neural network)
[17_protein_secondary_structure_prediction.pdf] |
|
Wed | Apr 23 | D | Protein Language Models |
Pretraining, Masked Language Models, Self-Attention, Multi-head Attention
[18_protein_language_models.pdf] |
(Optional) ESM paper, Attention Is All You Need paper (also see related Wikipedia page, blog posts, etc.) | |
14 | Mon | Apr 28 | D | pLMs, cont. |
Contact Prediction and Self-Attention, Experimental Evaluation of pLMs, Categorical Jacobian, Evaluation metrics for tertiary structure prediction
[19_misc.pdf] Assignment #4 and assignment #5 released. |
(Optional) pLMs learn paper |
Wed | Apr 30 | D | AlphaFold2 Overview |
AlphaFold2 Overview, Inputs, and Featurization
[20_alphafold2_overview.pdf] |
(Optional) AlphaFold2 paper | |
15 | Mon | May 5 | D | AlphaFold2 Evoformer |
AlphaFold2 Initialization of MSA and pair representation, Evoformer, Self-Attention (again), Outer product mean, Triangle updates (with and without self-attention)
[21_alphafold2_evoformer.pdf] |
(Optional) AlphaFold2 paper |
Wed | May 7 | D | ||||
16 | Mon | May 12 | A-D | Final Review |
TBD
|
Study Guide PDF |
Wed | May 14 | No class | Reading day | |||
Mon | May 19 | Final exam 4-6pm |