CMSC 423: Bioinformatics
Fall 2011

Course objectives: Cover interesting algorithms and methods for the analysis of biological data. We will cover string matching algorithms, string searching, string pattern finding (gene finding, discovery of protein binding sites), genome assembly, phylogenetics, and several topics of current research interest in bioinformatics.

## Announcements:

• Project clarifications, etc.:
1. You should create the given output file --- it might not exist. If it does exist, just overwrite it.
2. When computing the alignment SP-score to output on the "SP-score" line, use the rule on page 10 of this lecture. That is: compute the SP-score as the sum, over all columns, of the substitution scores between all pairs of characters in that column. This ignores the gap open costs. You should still use the gap open costs when computing the pairwise alignments.
4. When you are constructing the progressive alignment, you can sometimes have several legal ways of preserving the mapping I talked about today in class. For example, suppose your pairwise alignments are:
```
SC:  cat-----hat    SC: cat--hat
S1:  catinthehat    S2: catinhat
```
When adding S2 to the SC/S1 alignment, you could either have:
```
SC: cat-----hat
S1: catinthehat
S2: catin---hat
```
or
```
SC: cat-----hat
S1: catinthehat
S2: cat---inhat
```
or any other placement of "in" within the gap, as all preserve the alignments with the center sequence SC. Any such alignment is correct.
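For clarification 2 above, the column-wise SP-score can be sketched roughly as follows. This is only an illustration, not reference code for the project: `toy_score` and its values (+1 match, -1 mismatch, -1 for a character against a gap, 0 for a gap against a gap) are hypothetical placeholders, and how gap characters are scored should follow the rule in the lecture rather than this sketch.

```python
from itertools import combinations

GAP = "-"

def toy_score(a, b):
    """Hypothetical pair score, NOT the project's substitution scores."""
    if a == GAP and b == GAP:
        return 0          # assumption: gap against gap costs nothing
    if a == GAP or b == GAP:
        return -1         # assumption: character against gap
    return 1 if a == b else -1

def sp_score(rows, score=toy_score):
    """Sum, over all columns, of score(a, b) for every pair of rows.

    No gap open cost is charged here; only per-column pair scores count.
    """
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of the alignment must have the same length")
    total = 0
    for column in zip(*rows):                  # one tuple of characters per column
        for a, b in combinations(column, 2):   # every unordered pair in the column
            total += score(a, b)
    return total

msa = ["cat-----hat",   # SC
       "catinthehat",   # S1
       "catin---hat"]   # S2
print(sp_score(msa))    # prints 10 under the toy scoring above
```

Passing the pair-scoring function as a parameter keeps the column loop independent of whichever substitution scores you are actually given.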
• Extra Credit (due Thursday, Sep. 15): send me up to 10 word pairs that both (a) have an interesting alignment (gaps, mismatches, unexpected matches) and (b) have some clever association with each other. Some that have already been submitted:
gattaca/genetical; transsubstantiation/superstition; unsinkabletitanic/hunkofice; computerscience/liberalarts; goldmansachs/lehmanbrothers; underarmour/nike; bioinformatics/algorithms; fox news/faux news; harrypotter/wizard; einstein/brainiac; starcraft/terriblewasteoftime; programming/painful; money/power; primal/dual; omnipotent/omniscient; grill/skillet; mathematics/computer science; dissertation/thesis; symmetry/dihedral; traffic/pain; submodular tree cover/polymatroid steiner tree; nash equilibrium/dominant strategy
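
If you want to see how a candidate word pair aligns before sending it in, a small global alignment sketch such as the one below may help. It is a plain Needleman-Wunsch alignment with a linear gap cost; the function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and it does not use the gap open costs required for the project.

```python
def global_align(x, y, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap cost.

    Scoring values are illustrative assumptions, not the course's.
    Returns (score, aligned_x, aligned_y).
    """
    n, m = len(x), len(y)
    # best[i][j] = best score for aligning x[:i] with y[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            best[i][j] = max(best[i - 1][j - 1] + sub,   # align x[i-1] with y[j-1]
                             best[i - 1][j] + gap,       # x[i-1] against a gap
                             best[i][j - 1] + gap)       # y[j-1] against a gap

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if x[i - 1] == y[j - 1] else mismatch
            if best[i][j] == best[i - 1][j - 1] + sub:
                top.append(x[i - 1])
                bottom.append(y[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and best[i][j] == best[i - 1][j] + gap:
            top.append(x[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(y[j - 1])
            j -= 1
    return best[n][m], "".join(reversed(top)), "".join(reversed(bottom))

score, a, b = global_align("primal", "dual")
print(score)
print(a)
print(b)
```

When several alignments tie for the best score, the traceback reports just one of them (it prefers a match or mismatch step over a gap), so other equally good alignments may exist.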

