CMSC498Y meets on Mondays and Wednesdays. The course schedule will be updated here, along with links to materials and assignments. Slides will be posted to ELMS before each lecture and then linked here afterward. If a lecture is recorded, the video will typically be available under the Zoom tab on ELMS. The overall schedule will be similar to spring 2024, with some improvements to pacing and content. There will also be two in-class midterms this year instead of one.

Readings come from one of three textbooks:

Week Day Date Module Topic Materials Assigned Reading
1 Mon Jan 27 Welcome! Course Overview, Policies, and Background Biology
[1_course_overview_and_biology.pdf]
None
Wed Jan 29 A Random Sequence Model Random Sequence Model and Statistics Review (e.g., models, data, likelihood, maximum likelihood estimation, KL divergence)
[2_random_sequence_model_and_stats_review.pdf]
Assignment #1 released.
BSA Sections 1.3, 11.1 (through multinomial), 11.2 (relative entropy only), 11.3 (ML only), 11.5 (ML only), [bonus_probability_review.pdf] (optional)
2 Mon Feb 3 A Markov Models Bayes Classifier, Markov Models, Hidden Markov Models (definition only)
[3_markov_models.pdf]
BSA Sections 3.1, 3.2 (through formal definition)
Wed Feb 5 A No class Free period to work on Assignment #1
Add/drop deadline in two days (Fri Feb 7)
None
3 Mon Feb 10 A Decoding HMMs Viterbi algorithm, Posterior decoding (Forward algorithm, Backward algorithm)
[4_decoding_hmms.pdf]
BSA Section 3.2
Wed Feb 12 B MSAs Multiple Sequence Alignment (MSA) definition, Sum-of-Pairs Error, Edit distance, Sum-of-Pairs alignment, Star Alignment Heuristic
[5_msa.pdf]
Assignment #1 due today at 11:59pm.
None
4 Mon Feb 17 B Profile HMMs Unadjusted Sequence Profile, Profile HMMs, Supervised Training given MSA, Decoding with Viterbi algorithm
[6_profile_hmm.pdf]
Assignment #2 released.
BSA Chapter 5 (through 5.4)
Wed Feb 19 B Training HMMs Supervised training, Pseudocounts, Unsupervised training, Viterbi training, Baum-Welch (Expectation-Maximization)
[7_hmm_training.pdf]
BSA Section 3.3
5 Mon Feb 24 B Lab Profile HMM Lab - Adversarial example of failure when using Viterbi to align query sequence to MSA through profile HMM
[8_profile_hmm_lab.pdf]
[profile-hmm-lab.zip]
Lab Manual PDF
Wed Feb 26 A/B Midterm #1 Review Discussion of course material on midterm exam #1
IMPORTANT: No Zoom recording available
[midterm1_study_guide.pdf]
[midterm1_equation_sheet.pdf]
Assignment #2 due today at 11:59pm.
Study Guide PDF
6 Mon Mar 3 A/B Midterm Exam #1 Midterm exam in-class covering modules A and B
Wed Mar 5 C RNA Secondary Structure RNA secondary structure definition, evolutionary constraints, information degeneracy, pseudoknots, input/output to prediction problem, accuracy calculations
[9_rna_secondary_structure.pdf]
BSA Chapter 10 (through 10.2)
7 Mon Mar 10 C Grammars Grammars, Moore vs. Mealy Machines, Context-Free Grammars (CFGs), Stochastic CFGs
[10_scfgs.pdf]
BSA Chapter 9 (skip Section 9.4)
Wed Mar 12 C Optimization Maximum Base Pairs (Nussinov's Algorithm) and Minimum Energy
[11_rna_opt.pdf]
BSA Section 10.2 through first sub-section on Energy minimization (skip SCFG sub-section)
8 Mon Mar 17 No class Spring break
Wed Mar 19 No class Spring break
9 Mon Mar 24 C UFold UFold Input / Output, Feature Construction, Output Postprocessing
[12_ufold.pdf]
Assignment #3 released.
(Optional) UFold paper, CDPfold paper, E2Efold paper
Wed Mar 26 C Lab UFold Lab, part 1: Software installation and basic usage
[ufold-lab]
Lab Manual PDF
10 Mon Mar 31 C Neural Networks Feedforward neural networks, Cost functions (e.g., cross-entropy), Output units and activation functions (sigmoid, softmax, linear, ReLU), Optimization methods
[13_feedforward_neural_networks.pdf]
DL Chapter 6
Wed Apr 2 C UNet UNet Encoder / Contraction Path (e.g., Convolution and Max Pool), UNet Decoder / Expansion Path (e.g., Convolution and Upsampling)
[14_unet.pdf]
DL Chapter 9
11 Mon Apr 7 C Lab UFold Lab, part 2: Data curation, training, and ablation studies
[15_ufold_lab.pdf]
Assignment #3 due today at 11:59pm.
Lab Manual PDF
Wed Apr 9 C Midterm #2 Review Discussion of course material on midterm exam #2
IMPORTANT: No Zoom recording available
[midterm2_study_guide.pdf]
Drop with a W deadline in two days (Fri Apr 11)
Study Guide PDF
12 Mon Apr 14 D Intro to Protein Structures Amino acids, backbone, side chain, alpha-helix, beta-strand, beta-sheet, phi/psi angles, Ramachandran Principle
[16_protein_structure.pdf]
ISB Chapter 1
Wed Apr 16 C Midterm Exam #2 Midterm exam in-class covering module C
13 Mon Apr 21 D Protein Secondary Structure Prediction Multi-class classification, micro-average, macro-average, class imbalance, segment of overlap (SOV) score, PhD method (profile + neural network)
[17_protein_secondary_structure_prediction.pdf]
Wed Apr 23 D Protein Language Models Pretraining, Masked Language Models, Self-Attention, Multi-head Attention
[18_protein_language_models.pdf]
(Optional) ESM paper, Attention Is All You Need paper (also see related Wikipedia page, blog posts, etc.)
14 Mon Apr 28 D pLMs, cont. Contact Prediction and Self-Attention, Experimental Evaluation of pLMs, Categorical Jacobian, Evaluation metrics for tertiary structure prediction
[19_misc.pdf]
Assignment #4 and assignment #5 released.
(Optional) pLMs learn paper
Wed Apr 30 D AlphaFold2 Overview AlphaFold2 Overview, Inputs, and Featurization
[20_alphafold2_overview.pdf]
(Optional) AlphaFold2 paper
15 Mon May 5 D AlphaFold2 Evoformer AlphaFold2 Initialization of MSA and pair representation, Evoformer, Self-Attention (again), Outer product mean, Triangle updates (with and without self-attention)
[21_alphafold2_evoformer.pdf]
Assignment #4 due today at 11:59pm.
(Optional) AlphaFold2 paper
Wed May 7 D
16 Mon May 12 A-D Final Review TBD
Assignment #5 due today at 11:59pm.
Study Guide PDF
Wed May 14 No class Reading day
Mon May 19 Final exam 4-6pm