CMSC 838T Project 1

Project Description

The goal of this assignment is to familiarize yourself with a number of web-based bioinformatics tools by using them to discover useful information about four protein sequences from the CASP-5 competition. Present what you discover in a short research-style paper.

Protein Sequences

  1. MLISHSDLNQ QLKSAGIGFN ATELHGFLSG LLCGGLKDQS WLPLLYQFSN DNHAYPTGLV QPVTELYEQI SQTLSDVEGF TFELGLTEDE NVFTQADSLS DWANQFLLGI GLAQPELAKE KGEIGEAVDD LQDICQLGYD EDDNEEELAE ALEEIIEYVR TIAMLFYSHF NEGEIESKPV LH

  2. TGISRETSSDVALASHILTALREKQAPELSLSSQDLELVTKEDPKALAVALNWDIKKTETVQEACERELALRLQQTQSLHSLR

  3. QPAKKTYTWNTKEEAKQAFKELLKEKRVPSNASWEQAMKMIINDPRYSALANLSEKKQAFNAYKVQTEK

  4. MSTVTKYFYKGENTDLIVFAASEELVDEYLKNPSIGKLSEVVELFEVFTPQDGRGAEGELGAASKAQVENEFGKGKKIEEVIDLILRNGKPNSTTSSLKTKGGNAGTKAYN
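
Most web-based tools accept sequences in FASTA format, so it may help to normalize the sequences above before pasting them into a form. The following Python sketch strips layout whitespace and wraps the residues at a fixed width. The function names, the record name `casp5_seq1`, and the 60-column width are illustrative assumptions and are not part of the assignment.

```python
# Sequence 1 from the list above; spaces and newlines are layout only.
SEQ1 = """
MLISHSDLNQ QLKSAGIGFN ATELHGFLSG LLCGGLKDQS WLPLLYQFSN DNHAYPTGLV
QPVTELYEQI SQTLSDVEGF TFELGLTEDE NVFTQADSLS DWANQFLLGI GLAQPELAKE
KGEIGEAVDD LQDICQLGYD EDDNEEELAE ALEEIIEYVR TIAMLFYSHF NEGEIESKPV LH
"""


def clean(seq: str) -> str:
    """Remove all whitespace and upper-case the residue letters."""
    return "".join(seq.split()).upper()


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Return one FASTA record: a '>' header line, then wrapped residues."""
    residues = clean(seq)
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([f">{name}"] + lines)


# Hypothetical record name; any short identifier would do.
print(to_fasta("casp5_seq1", SEQ1))
```

The same two functions can be applied to the other three sequences unchanged, since `clean` ignores any spacing used in the listing.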

Perform the following analyses for each protein sequence

  1. Find similar protein sequences

  2. Find protein family / conserved regions using automated tools

  3. Perform multiple sequence alignment to identify conserved regions and infer phylogenetic trees

  4. Predict secondary structure and 3D structure

  5. Look for the source of the protein in genomic DNA and cDNA
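
For step 3, once an alignment tool of your choice has produced a multiple sequence alignment, conserved positions can be checked with a few lines of Python. The sketch below is a simplification under stated assumptions: it treats a column as conserved only if every row has the identical residue and no gap, and the three-row alignment is made-up toy data, not output for the sequences above.

```python
def conserved_columns(alignment, gap="-"):
    """Return 0-based indices of columns where every row has the same residue.

    `alignment` is a list of equal-length aligned strings; columns that
    contain the gap character are never reported as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    conserved = []
    for i in range(length):
        column = {row[i] for row in alignment}
        if len(column) == 1 and gap not in column:
            conserved.append(i)
    return conserved


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["MK-LV", "MKALV", "MR-LV"]
print(conserved_columns(toy_alignment))
```

Strict identity is the simplest possible criterion; real analyses often use a similarity score per column instead, which would replace the set comparison in the loop.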