CMSC498Y meets on Mondays and Wednesdays.
The course schedule will be updated here, along with links to materials and assignments.
Slides will be posted to ELMS-Canvas before the lecture and then linked after the lecture.
If a lecture is recorded, the video will typically be available under the Zoom tab on ELMS.
The overall schedule will be similar to last year (spring 2025), with some improvements to pacing and content.
Readings come from journal articles or one of three textbooks:
- BSA - Biological Sequence Analysis textbook by Durbin, Eddy, Krogh, and Mitchison [UMD Library]
- DL - Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- ISB - Introduction to Structural Bioinformatics edited by K. Anton Feenstra and Sanne Abeln
| Week | Day | Date | Module | Topic | Materials | Assigned Reading |
|---|---|---|---|---|---|---|
| 1 | Mon | Jan 26 | No class | Inclement weather | ||
| | Wed | Jan 28 | Welcome! | Course Overview, Policies, and Basic Biology Review [slides] | None | |
| 2 | Mon | Feb 2 | A | Random Sequence Model | Random Sequence Model and Statistics Review (including maximum likelihood parameter estimation) [slides] Assignment #1 released. | BSA Sections 1.3, 11.1 (through multinomial), 11.2 (only relative entropy), 11.3 (only ML), 11.5 (only ML) |
| | Wed | Feb 4 | A | Markov Models | Binary Classification, Bayes Classifier, Classifier Evaluation (precision, recall, etc.), Markov Models, Hidden Markov Models (definition only) [slides] | BSA Sections 3.1, 3.2 (through formal definition of HMM) |
| 3 | Mon | Feb 9 | A | Decoding HMMs | Viterbi algorithm, Posterior decoding (Forward + Backward algorithm), Recursion vs. Dynamic Programming [slides] | BSA Section 3.2 |
| | Wed | Feb 11 | B | MSAs | Multiple Sequence Alignment (MSA), Sum-of-Pairs (SOP) error, Hamming and edit distance, Sum-of-Pairs (SOP) alignment, Star alignment heuristic [slides] Also, the add/drop period ends in two days (Fri Feb 13). | None |
| 4 | Mon | Feb 16 | B | Profile HMMs | Unadjusted Sequence Profiles, Profile HMMs, Supervised Training given MSA, Pseudocounts, Decoding with Viterbi Algorithm, Application to Protein Family Prediction [slides] Assignment #2 released. | BSA Chapter 5 (through Section 5.4) |
| | Wed | Feb 18 | B | Profile HMM Lab | Profile HMM Lab [materials] | |
| 5 | Mon | Feb 23 | B | Unsupervised Training HMMs | (not on first midterm) Viterbi training, Expectation-Maximization [slides] | BSA Section 3.3 |
| | Wed | Feb 25 | A/B | Midterm #1 Review | Q&A on course material for midterm exam #1. IMPORTANT: No Zoom attendance or recording available. | Study Guide PDF |
| 6 | Mon | Mar 2 | A/B | Midterm Exam #1 | Midterm exam in-class covering modules A and B (except unsupervised training) | |
| | Wed | Mar 4 | C | RNA Secondary Structure | RNA secondary structure definition, evolutionary constraints, information degeneracy, pseudoknots, input/output to prediction problem, accuracy calculations [slides] | BSA Chapter 10 (through 10.2) |
| 7 | Mon | Mar 9 | C | Optimization | Maximum Base Pairs (Nussinov's Algorithm) and Minimum Energy [slides] | BSA Section 10.2 through first sub-section on Energy minimization (skip SCFG sub-section) |
| | Wed | Mar 11 | C | Midterm 1 debrief | Midterm 1 debrief | |
| 8 | Mon | Mar 16 | No class | Spring break | ||
| | Wed | Mar 18 | No class | Spring break | | |
| 9 | Mon | Mar 23 | C | Grammars | Grammars, Moore vs. Mealy Machines, Context-Free Grammars (CFGs), Stochastic CFGs [slides] Also, Midterm exam #1 debrief (IMPORTANT: No Zoom attendance or recording available). Assignment #3 released. | BSA Chapter 9 (skip Section 9.4) |
| | Wed | Mar 25 | C | Neural Networks | Feedforward neural networks, Cost functions (e.g., cross-entropy), Output units and activation functions (sigmoid, softmax, linear, ReLU), Optimization methods | DL Chapter 6 |
| 10 | Mon | Mar 30 | C | UFold | UFold Input / Output, Feature Construction, Output Postprocessing | (Optional) UFold paper, CDPfold paper, E2Efold paper |
| | Wed | Apr 1 | C | UNet | UNet Encoder / Contraction Path (e.g., Convolution and Max Pool), UNet Decoder / Expansion Path (e.g., Convolution and Upsampling) | DL Chapter 9 |
| 11 | Mon | Apr 6 | C | UFold Lab | UFold Lab | |
| | Wed | Apr 8 | C | Midterm #2 Review | Discussion of course material on midterm exam #2. IMPORTANT: No Zoom attendance or recording available. Also, the drop-with-a-W deadline is in two days (Fri Apr 10). | Study Guide PDF |
| 12 | Mon | Apr 13 | D | Intro to Proteins | Amino acids, backbone, side chain, alpha-helix, beta-sheet, phi/psi angles, data banks, three prediction problems (secondary structure, tertiary structure, and contacts) | ISB Chapter 1 |
| | Wed | Apr 15 | C | Midterm Exam #2 | Midterm exam in-class covering module C | |
| | Mon | Apr 20 | D | Secondary Structure | Multi-class classification, micro-average, macro-average, class imbalance, segment of overlap (SOV) score, PHD method (feedforward neural network) | |
| 13 | Wed | Apr 22 | D | Protein Language Models | Protein Language Models (pLMs), Attention, Transformers, Masked Language Models (MLMs) | (Optional) ESM paper, Attention Is All You Need paper (also see related Wikipedia page, blog posts, etc.) |
| | Mon | Apr 27 | D | pLM Lab | Contact Prediction and Self Attention | |
| 14 | Wed | Apr 29 | D | pLM Lab, cont. | Co-evolution and Categorical Jacobian | |
| | Mon | May 4 | D | Tertiary Structure | Evaluation metrics - RMSD, TM score, GDT; Structure module - Backbone Frame, Torsion Angles, FAPE | (Optional) AlphaFold2 paper |
| 15 | Wed | May 6 | D | Tertiary Structure, cont. | Evoformer Module + AlphaFold2 Featurization | (Optional) AlphaFold2 paper |
| 16 | Mon | May 11 | A-D | Final Review | Discussion of course material on final exam. IMPORTANT: No Zoom attendance or recording available. | Study Guide PDF |
| | Wed | May 13 | No class | Reading day | | |
| | Fri | May 15 | Final Exam | Final exam in-class from 10:30am-12:30pm | | |