CMSC 838T Project 2

The goal of this project is to find a bioinformatics research topic related to
high-performance computing and obtain some preliminary results.
Your project should be motivated by the question of how to use increasing
computational power to improve the quality of bioinformatics applications,
while accounting for the near-exponential growth in the size of sequence
databases. Earlier algorithms may have been oversimplified because of limited
processing power. Look for opportunities to use computing power to save user
effort.
Project topics
1) Performance of parallel bioinformatic algorithms
- Evaluate / schedule mpiBLAST & threaded BLAST
  - Vinay, Indrajit, Cihan
  - compare throughput vs parallel speedup
  - evaluate performance / scalability on Sun SMP, Linux PCs
  - evaluate performance / scalability vs query length, database size
  - develop on-line scheduling algorithms depending on workload, environment
  - Links
    - SGI parallel bioinformatic evaluation here...
    - Parallel Smith-Waterman evaluation here...
    - MPI-BLAST benchmark results here...
    - DeCypher benchmark results here...
- Evaluating threaded BLAST
  - Asta, Yuan
  - compare threaded BLAST to a potential parallel implementation in OpenMP
- Evaluating bottlenecks in parallel BLAST
  - Aaron, Qing
  - find sources of sequential execution / synchronization overhead
  - develop & evaluate program transformations to eliminate bottlenecks
- Tuning BLAST to improve locality / performance
  - Arunesh
  - find sources of cache misses / TLB misses / memory paging
  - develop & evaluate program transformations to eliminate bottlenecks
- Investigating high-throughput BLAST
  - Xue
  - evaluate techniques for improving BLAST performance for multiple queries
  - compare performance of HT-BLAST / BLAST++
  - develop / evaluate new algorithms for improving multiple BLAST queries
  - Links
    - BLAST++: A Tool for BLASTing Queries in Batches (PDF)
- Enhancing BLAST to exploit MMX / SSE2 instructions
  - Bin, Reddy
  - find main sources of computation
  - exploit SIMD instruction extensions to improve performance
- Evaluating parallel NAMD
  - Reddy
  - compare parallel performance of NAMD molecular dynamics code
  - evaluate performance / scalability on Sun SMP, Linux PCs
2) Precision of bioinformatic algorithms
- Evaluating precision of sequential & parallel EST clustering algorithms
  - Annie, Nargess, Earl
  - evaluate performance / accuracy of parallel EST clustering algorithms
  - compare PaCE / NCBI megablast / TIGR clustering software
- Evaluating precision of hybrid BLAST / S-W pairwise alignment
  - Kexue
  - evaluate precision / performance of combining S-W with BLAST
  - Links
    - Assessing Sequence Comparison Methods (PDF)...
    - Sensitivity and Selectivity in Protein Similarity Searches... (PDF)...
    - PatternHunter... (PDF)...
- Evaluating precision of gene prediction software
  - Dami, Hyma
  - compare precision of gene prediction software
  - Links
    - A Comparative Guide to Gene Prediction Tools... (PDF)
- Evaluate precision of sequence search on clustered sequence databases
  - Michael
  - investigate approaches for searching clustered sequence databases
  - Links
    - Indexing Genomic Databases for Fast Homology Searching (PDF)
    - A Compression Algorithm for DNA Sequences... (PDF)
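Both topic groups above depend on a few simple metrics. The sketch below shows
one way they might be computed. It is only an illustration: the function names
are hypothetical, the timings are made-up placeholders rather than measured
results, and scoring the precision topics with true/false positive counts is an
assumption about how a project might define accuracy.

```python
def speedup(t_serial, t_parallel):
    """Parallel speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / n_procs


def throughput(n_queries, elapsed_seconds):
    """Throughput: queries completed per second of wall-clock time."""
    return n_queries / elapsed_seconds


def precision(true_pos, false_pos):
    """Fraction of reported hits that are correct (assumed definition)."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def sensitivity(true_pos, false_neg):
    """Fraction of correct hits that were actually reported."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


if __name__ == "__main__":
    # Placeholder wall-clock times in seconds, keyed by processor count.
    # These are illustrative values, not benchmark data.
    timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
    t_serial = timings[1]
    for n_procs, t_par in sorted(timings.items()):
        print(n_procs,
              round(speedup(t_serial, t_par), 2),
              round(efficiency(t_serial, t_par, n_procs), 2))
```

Reporting efficiency alongside speedup makes it easier to see where adding
processors stops paying off, which is useful when comparing a single large
parallel run against several independent runs for throughput.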
Approach
- Conduct survey of related work (read related research papers)
- Write up a short description of proposed research before proceeding
- Initially concentrate on setting up tools / procedures
- Later focus on collecting experimental information
- Present preliminary results on last day of class
- Turn in short research paper describing project
Original project suggestions