Course Logistics
Instructor: Erin Molloy

Please log in to ELMS (button to the left) to see the course meeting time and location.

This course covers fundamental computational problems from comparative genomics and evolutionary biology. Example topics include multiple sequence alignment and the reconstruction of evolutionary histories (e.g., phylogenetic trees and networks). These tasks are typically framed as NP-hard optimization problems, motivating the development of heuristics based on graph algorithms and, more recently, machine learning. We will introduce core algorithms and discuss their performance from both empirical and theoretical perspectives, considering computational complexity, optimality guarantees, and statistical consistency under popular models of evolution. We will also discuss how these algorithms are leveraged in emerging applications, like evolutionary analyses of tumors and pathogens, along with limitations and directions for future research.

CMSC829A should appeal to students looking for an application-driven CS/stats/math course as well as students specifically interested in bioinformatics research.


CMSC829A is a computer science MS/PhD qualifying course in bioinformatics. (Search CMSC829A under CS grad coursework.) The target audience for this course is graduate students from CS, ECE, AMSC, statistics, or similar programs.

No prior knowledge of biology is required. Familiarity with algorithms, probability, and basic statistics is required. In addition, you should be comfortable programming in at least one language. The programming assignment will be given in Python; however, you may complete the assignment in any language of your choosing. If you do not use Python, it is your responsibility to rewrite any functions that were distributed with the assignment, like those for reading the input data. Please contact me if you are unsure whether you should enroll in the course.

Biology Graduate Students. If you are a graduate student in biology and the course material is relevant to your research, you may contact me about enrolling in this class either for credit or for audit. If taking the course for credit, you should expect to dedicate significant time to course assignments, beginning by doing the recommended readings, starting with Appendix B in Computational Phylogenetics (this chapter will give you a sense of the course material). In previous years, we have offered some modified assignments, for example replacing the programming assignment with a data set analysis assignment. The students eligible for these modifications must be from non-computationally-focused graduate programs, e.g., BEES and Entomology.


Use this website; most things will be linked here!

  • Slides will be posted to ELMS. These are for your own use and should not be distributed.
  • Graded assignments will be posted to ELMS/Gradescope.
  • Graded assignments will be submitted on Gradescope (except for assignments related to the course project).
  • General course communication will be through CampusWire.
    • The instructor will post class-wide announcements to CampusWire. Some announcements may additionally be posted to ELMS if they are very important and time sensitive.
    • You are responsible for checking your email, ELMS, and CampusWire regularly.
  • All personal course communication (e.g., about excused absences or grades) must be through ELMS. Please do NOT email the course staff.


All course materials will be posted to ELMS and linked to this website. Coursework can be completed by referring to the lecture slides, so there is no required textbook. However, there are recommended readings from the textbook Computational Phylogenetics by my PhD advisor, Tandy Warnow. Many students in the past have found these readings helpful for doing the homework. In any case, if you want to work in this field, I strongly encourage you to do the readings.


For a course to be MS/PhD qualifying, it "must primarily (at least 75%) base the course grade on a combination of homework, programming assignments, research projects, and exams. Any of these components are optional, except the course's written exam(s) which must account for at least 30% of the grade" (this information is from Tom Hurst). The final grade for CMSC829A will likely have the following breakdown:

  • 30% exam
  • 35% final project
  • 25% written homeworks
  • 10% programming assignment
See the syllabus for details.


For other course policies, refer to the syllabus.


Course evaluations are important, and the department and faculty take student feedback seriously. Near the end of the semester, students will be able to complete their evaluations online.


CMSC829A is largely based on a course taught by my PhD advisor, Tandy Warnow, at the University of Illinois Urbana-Champaign.

The image above illustrates evolution at a particular region of the genome (called a locus) for a species network (a) and species tree (b). Yunheng Han and I created this image using butterflies from this paper.

Web Accessibility
Please provide feedback on web accessibility to the instructor.