CMSC 838T Project 1
The goal of this assignment is to familiarize yourself with
a number of web-based bioinformatics tools by using them to
discover useful information about some protein sequences
from the CASP-5 competition. Present what you discover in
a short research-style paper.
MLISHSDLNQ QLKSAGIGFN ATELHGFLSG LLCGGLKDQS WLPLLYQFSN
DNHAYPTGLV QPVTELYEQI SQTLSDVEGF TFELGLTEDE NVFTQADSLS
DWANQFLLGI GLAQPELAKE KGEIGEAVDD LQDICQLGYD EDDNEEELAE
ALEEIIEYVR TIAMLFYSHF NEGEIESKPV LH
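Many sequence-analysis web forms accept FASTA input, so it can help to strip the ten-residue spacing from the sequence above before pasting it into a tool. The following is a minimal sketch of that cleanup. The header name target_1, the helper to_fasta, and the 60-column line width are illustrative assumptions, not part of the assignment.

```python
# Target sequence exactly as printed above, in blocks of ten residues.
RAW = """
MLISHSDLNQ QLKSAGIGFN ATELHGFLSG LLCGGLKDQS WLPLLYQFSN
DNHAYPTGLV QPVTELYEQI SQTLSDVEGF TFELGLTEDE NVFTQADSLS
DWANQFLLGI GLAQPELAKE KGEIGEAVDD LQDICQLGYD EDDNEEELAE
ALEEIIEYVR TIAMLFYSHF NEGEIESKPV LH
"""


def to_fasta(raw, header="target_1", width=60):
    """Remove all whitespace and return a FASTA record.

    `header` and `width` are arbitrary choices made for this sketch.
    """
    seq = "".join(raw.split())
    # Cut the sequence into fixed-width lines.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Whitespace-free sequence and its FASTA form.
SEQ = "".join(RAW.split())
FASTA = to_fasta(RAW)

print(FASTA, end="")
print("length:", len(SEQ))
```

Checking the printed length against the residue count you expect is a quick way to catch copy-and-paste errors before submitting the sequence to any server.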
Perform the following analyses for each protein sequence:
- Find similar protein sequences
- Find protein family / conserved regions using automated tools
- Use HMMER at
Select "HMM Domain Search" from "Search and Retrieval" menu
- Does the target protein belong to a family with a known function?
- Perform multiple sequence alignment to identify conserved regions
and infer phylogenetic trees
- Predict secondary structure, 3D structure
- Look for source of protein in genomic DNA, cDNA
- Use translated BLAST at
- Select "Protein query - Translated db [tblastn]"
- Under "Choose database", try both the default (nr)
and the genomic-DNA-only database (GSS).
What are possible sources of target protein?
- What species? Genomic DNA or other source?
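
A translated search such as tblastn compares a protein query against nucleotide sequences translated in all six reading frames (three offsets on each strand). The sketch below illustrates only that translation step on a toy DNA string; it is not how BLAST itself is implemented, and it uses exact substring matching rather than alignment. The function names (reverse_complement, translate, six_frames, find_query) and the toy DNA are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return a dict mapping frame names (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_query(dna, protein):
    """List the frames whose translation contains `protein` exactly."""
    return [name for name, prot in six_frames(dna).items() if protein in prot]


# Toy example: DNA encoding the first five residues of the target (MLISH).
toy_dna = "ATGCTGATTAGCCAT"
print(six_frames(toy_dna))
print(find_query(toy_dna, "MLISH"))
```

Because a hit can fall on either strand and in any frame, the frame reported by a real tblastn result is worth noting when deciding whether the source is genomic DNA or cDNA.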