STORM Research Group
Our primary research goal is to unlock transformative applications in medicine and health by substantially improving the scalability and accessibility of biological data analysis (e.g., the analysis of omics data). To achieve this goal, we design algorithms and co-design hardware and software for omics analysis that is more accurate, faster, and more energy-efficient.
Our STORM Research Group at the University of Maryland spans algorithm design, hardware-software co-design, and AI-driven methods for biological data analysis. For a complete list of publications, please see our publications page, Google Scholar, DBLP, or ORCID.
Research Directions
Multimodal Systems for Biological Data Analysis
Modern biological datasets are large, noisy, and heterogeneous. Combining complementary data modalities can help resolve some of this complexity and make meaningful insights easier to identify. To this end, we aim to develop multimodal systems that use different types of biological data (e.g., raw electrical signals, basecalled sequences, and spatial and image data).
Hardware-Algorithm-Software Co-Design for Portable and Accelerated Genome Analysis
The diverse demands of genomic applications require customized hardware solutions to optimize performance and energy efficiency. We co-design hardware, algorithms, and software for end-to-end genome analysis by exploiting emerging technologies, minimizing data movement within the system, and targeting low-power edge platforms for real-time in-the-field use.
Algorithms and AI for Sequence Analysis
Accurate and scalable sequence analysis underpins many applications in computational genomics. This direction aims to design algorithms and AI methods that (1) tolerate noise and variation beyond exact matching, (2) eliminate redundant computation as references and datasets evolve, and (3) enable population-scale analysis with low latency and low energy consumption.
Multimodal Systems for Biological Data Analysis — Key Publications
- RawHash: enabling fast and accurate real-time analysis of raw nanopore signals for large genomes. Proceedings of the 31st Annual Conference on Intelligent Systems for Molecular Biology (ISMB 2023) and the 22nd European Conference on Computational Biology (ECCB 2023), Lyon, France, July 2023. Preprint: bioRxiv.
- RawHash2: Mapping Raw Nanopore Signals Using Hash-Based Seeding and Adaptive Quantization. Bioinformatics, July 2024.
- Rawsamble: Overlapping Raw Nanopore Signals using a Hash-based Seeding Mechanism. Bioinformatics, February 2026.
- RawBench: A Comprehensive Benchmarking Framework for Raw Nanopore Signal Analysis Techniques. Proceedings of the 16th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2025), Philadelphia, PA, USA, October 2025.
- RawAlign: Accurate, Fast, and Scalable Raw Nanopore Signal Mapping via Combining Seeding and Alignment. IEEE Access, December 2024.
Hardware-Algorithm-Software Co-Design — Key Publications
- ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis
- MARS: Processing-In-Memory Acceleration of Raw Signal Genome Analysis Inside the Storage Subsystem. Proceedings of the 39th International Conference on Supercomputing (ICS 2025), Salt Lake City, Utah, USA, June 2025.
- GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping. Proceedings of the 55th International Symposium on Microarchitecture (MICRO 2022), Chicago, IL, USA, October 2022.
- Swordfish: A Framework for Evaluating Deep Neural Network-Based Basecalling Using Computation-In-Memory with Non-Ideal Memristors. Proceedings of the 56th International Symposium on Microarchitecture (MICRO 2023), Toronto, ON, Canada, November 2023.
Algorithms and AI for Sequence Analysis — Key Publications
- BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis. NAR Genomics and Bioinformatics (NARGAB), March 2023. Preprint: bioRxiv.
- Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm. Bioinformatics, June 2020.
- TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering. Frontiers in Genetics, September 2024.
- RUBICON: a framework for designing efficient deep learning-based genomic basecallers. Genome Biology, February 2024.