I am an assistant professor in the Department of Computer Science at the University of Maryland, College Park, with a joint appointment in UMIACS.
Before this, I spent a year on Google's Systems and Services Infrastructure team. I received my Ph.D. in electrical and computer engineering from Georgia Tech in 2021. My research revolves around accelerating sparse problems, near-memory processing, and using machine learning to design efficient reconfigurable domain-specific architectures. Learn more about my research and the courses I teach under the My Work tab.
Together We Are Visible shows four women and men dancing together in a field of sweet irises. The dancers reflect each other and share their positive feelings to become stronger and more visible.
Union Harmony shows the beauty of the world, made up of elements such as a goldfish, a symbol of life; eyes, the gateways into the soul; and a lotus, the symbol of purity of body and mind. All of them are connected through music, a language shared by humans regardless of race.
This painting is another view of the “Naples” drawing.
Time Vertigo visualizes space and time simultaneously by mixing colors and geometric shapes. This painting is a deeper vision of the “Gravity” drawing. Its key insight is to simplify the minor details of a scene happening at a known time and location (i.e., Paris) and to bring the other scenes happening simultaneously into that same location. The six characters in the painting represent the two main dancers in three captures of time – past, present, and future – all happening concurrently.
My research interests revolve around efficiently accelerating the execution of sparse problems – computer programs whose data lacks spatial locality in memory. Sparse problems are a core component of several crucial domains, such as robotics, recommendation systems, machine learning and computer vision, graph analytics, and scientific computing, that remarkably impact human life. For instance, modeling/simulating a vaccine or predicting an earthquake are examples of sparse scientific computing that can save lives if done accurately and in a timely manner. The fact that several supercomputers from Google Cloud, Amazon Web Services, Microsoft Azure, and IBM are now running on over 136 thousand nodes, containing five million processor cores and more than 50 thousand GPUs, for vaccine development is compelling evidence of the importance of sparse scientific computations. However, modern high-performance computers equipped with CPUs and/or GPUs are poorly suited to sparse problems, utilizing only a tiny fraction of their peak performance (e.g., 0.5%–3%). Such conventional architectures are mainly optimized to handle complex computation rather than the complex memory accesses that are essential for sparse problems. This mismatch between the abilities of the hardware and the nature of the problem forces sparse problems to spend extra hardware budget (high power and dollar cost) for higher performance. The goal of our research is to propose effective solutions that utilize the maximum potential of a given hardware budget to accelerate sparse problems. To achieve this goal in an impactful way, our research holds that software and hardware must be co-optimized. To date, several software- and hardware-level optimizations have been proposed to accelerate sparse problems; however, because they optimize either the software or the hardware in isolation, they have not fully resolved the challenges of sparse problems.
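To make the memory-access challenge concrete, here is a minimal sketch (an illustrative example, not code from my own projects) of sparse matrix-vector multiplication using the standard compressed sparse row (CSR) format. The indirect access `x[col_idx[k]]` jumps to an address determined by the matrix's sparsity pattern, which is exactly the kind of irregular, data-dependent memory access that conventional caches and prefetchers handle poorly.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect (gather) access: the address of x[col_idx[k]]
            # depends on the sparsity pattern, so it lacks spatial
            # locality and defeats hardware prefetching.
            y[i] += values[k] * x[col_idx[k]]
    return y

# 3x3 example: A = [[2, 0, 1],
#                   [0, 3, 0],
#                   [4, 0, 5]]
values  = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

Note that the arithmetic per element is trivial (one multiply-add); performance is dominated by how fast the irregular reads of `x` can be served, which is why the computation-oriented design of CPUs and GPUs yields such a small fraction of peak throughput on these workloads.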
The table above summarizes the common challenges of sparse problems, examples of sparse applications that suffer from these challenges, and my contributions and publications that address them, along with the broader impact of my research.
My research also focuses on developing a novel dynamically reconfigurable computation platform that provides maximum performance for distinct applications with diverse requirements. For years, general-purpose processors were optimized for the common case yet still provided reasonable performance for a wide range of applications. Today, diverse applications with distinct requirements are being developed quickly and cannot reach their maximum performance on a single hardware platform. As a result, we see huge efforts to produce specialized hardware, such as the designs I proposed for sparse problems (see the table above). As applications become more diverse and evolve more quickly, designing specialized hardware will no longer be a practical solution, for two reasons: (i) the design process is costly, requires expertise across several domains, and is slow, and therefore cannot keep up with the fast pace of algorithm development; and (ii) a system composed of several fixed specialized hardware platforms is not scalable. CASL, my research group at UMD, introduces and develops a novel approach to dynamically reconfigurable computation intended to replace both current general-purpose processors and specialized hardware. Our new computation approach envisions a key characteristic: hardware and software are treated as a single unified component. To execute distinct programs simultaneously, the hardware is reconfigured; thus, each program reaches its optimal performance without sacrificing performance for the common case.
2023 © Bahar Asgari