CMSC 423: Bioinformatics
Fall 2011

Course objectives: Cover interesting algorithms and methods for the analysis of biological data. We will cover string matching algorithms, string searching, string pattern finding (gene finding, discovery of protein binding sites), genome assembly, phylogenetics, and several topics of current research interest in bioinformatics.

Class time: Tue/Thu 12:30pm-1:45pm in CSIC 2107.

Professor: Carl Kingsford, Office: CBCB 3113. Email: carlk AT cs.

Office hours: Wednesdays, 11:00-noon in CBCB 3113. If you cannot attend office hours, email me about scheduling a different time.

TA: Darya Filippova (dfilippo AT cs.umd.edu)

TA Office Hours: Mondays 1-3 in CBCB 3118, and Thursdays 10-noon in the office hours room on the first floor of AVW. If you cannot get into the building, please use the call box to call Denise Cross (035) or Carl (032).

## Announcements:

• Some collected bioinformatics lectures
• Homework 4
• Partner Evaluation Form
• Python code for Gibbs sampling
• Solution to Project 1 (see email for password)
• Project #2 is posted. It is due on Dec 8 at 11:59pm.
• Project clarifications, etc.:
1. You should create the given output file --- it might not exist. If it does exist, just overwrite it.
2. When computing the alignment SP-score to output on the "SP-score" line, use the rule on page 10 of this lecture. That is: compute the SP-score as the sum, over all columns, of the substitution scores between all pairs of characters in that column. This will ignore the gap open costs. You should still use the gap open costs when computing the pairwise alignments.
4. When you are constructing the progressive alignment, you can sometimes have several legal ways of preserving the mapping I talked about today in class. For example, suppose your pairwise alignments are:
```
SC:  cat-----hat    SC:  cat--hat
S1:  catinthehat    S2:  catinhat
```
When adding S2 to the SC/S1 alignment, you could either have:
```
SC: cat-----hat
S1: catinthehat
S2: catin---hat
```
or
```
SC: cat-----hat
S1: catinthehat
S2: cat---inhat
```
or any other placement of "in" within the gap, as all preserve the alignments with the center sequence SC. Any such alignment is correct.
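To make the SP-score rule in clarification 2 concrete, here is a minimal Python sketch, not the reference solution. The function name `sp_score` and the `match`, `mismatch`, and `gap` values are assumptions made for illustration; in the project you would substitute the substitution scores from the assignment. How a character paired with a gap is scored is also assumed here (a flat per-pair score, with two gaps scoring 0), so check page 10 of the lecture for the actual rule.

```python
from itertools import combinations


def sp_score(rows, match=1, mismatch=-1, gap=-1):
    """Sum-of-pairs score of a multiple alignment.

    rows: list of equal-length strings, using '-' for gaps.
    match / mismatch / gap: illustrative scores (assumptions, not the
    project's values). No gap-open cost is applied anywhere.
    """
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of the alignment must have the same length")

    def pair_score(a, b):
        if a == "-" and b == "-":
            return 0            # assumed: a gap aligned to a gap scores nothing
        if a == "-" or b == "-":
            return gap          # assumed: flat score for a character vs. a gap
        return match if a == b else mismatch

    # For every column, add the score of every pair of characters in it.
    return sum(
        pair_score(a, b)
        for column in zip(*rows)
        for a, b in combinations(column, 2)
    )


alignment = ["cat-----hat",   # SC
             "catinthehat",   # S1
             "catin---hat"]   # S2
print(sp_score(alignment))    # 10 with the assumed scores above
```

The score is computed column by column and never looks at neighboring columns, which is why gap open costs drop out of it.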
• Homework #3 is posted and is due on Nov 17 at the start of class.
• Project #1 is posted. It is due on Nov 15 at 11:59pm.
• Darya's office hours today (Monday, Oct 17) will be shifted one hour later, to 2-4pm.
• Answers to some of the homework problems are posted
• Homework 2 is posted, and is due Oct 6 at the start of class.
• Office hours have been changed from what was originally posted. They are now:
Darya: Mondays, 1-3pm in CBCB 3118 and Thursdays 10-noon in AVW TA room.
Carl: Wednesdays, 11-noon in CBCB 3113
• Homework 1 is posted, and due Sept 22 at the start of class.
• Extra Credit (due Thursday, Sep. 15): send me up to 10 word pairs that both (a) have an interesting alignment (gaps, mismatches, unexpected matches) and (b) have some clever association with each other. Some that have already been submitted:
gattaca/genetical; transsubstantiation/superstition; unsinkabletitanic/hunkofice; computerscience/liberalarts; goldmansachs/lehmanbrothers; underarmour/nike; bioinformatics/algorithms; fox news/faux news; harrypotter/wizard; einstein/brainiac; starcraft/terriblewasteoftime; programming/painful; money/power; primal/dual; omnipotent/omniscient; grill/skillet; mathematics/computer science; dissertation/thesis; symmetry/dihedral; traffic/pain; submodular tree cover/polymatroid steiner tree; nash equilibrium/dominant strategy
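If you want to see how a word pair from the extra credit list aligns, the following is a minimal sketch of global alignment (Needleman-Wunsch) with a linear gap penalty. The function name `global_align` and the scores (`match=1`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration, not the scoring used in class. When several alignments are optimal, it returns just one of them.

```python
def global_align(x, y, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_x, aligned_y) for one optimal global alignment.

    Scores are illustrative assumptions; gaps use a linear penalty
    (no separate gap-open cost).
    """
    n, m = len(x), len(y)

    # score[i][j] = best score for aligning x[:i] with y[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align x[i-1] with y[j-1]
                              score[i - 1][j] + gap,       # x[i-1] against a gap
                              score[i][j - 1] + gap)       # y[j-1] against a gap

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if x[i - 1] == y[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(x[i - 1])
                bottom.append(y[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(x[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(y[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


s, a, b = global_align("gattaca", "genetical")
print(s)
print(a)
print(b)
```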

## Handouts:

• Homework 2 (due Oct 6 at the start of class)
• Homework 1 (due Sept 22 at the start of class)
• The syllabus can be found here.