Abhinav Bhatele on Advancing AI Through High Performance Computing
University of Maryland Associate Professor of Computer Science Abhinav Bhatele traces his interest in computing to early classroom experiences, where exposure to programming languages shaped his academic direction. His work now focuses on the intersection of high-performance computing (HPC) and artificial intelligence, with an emphasis on cross-domain collaboration.
In this Q&A, Bhatele discusses his career path, research priorities and the broader implications of his work.
Was there a defining moment that shaped your career path into computer science?
I don’t know if there was a single defining moment. In middle school, we got new computers and learned a language called Logo, where we moved a turtle around on the screen. That experience initially got me interested in computers. In high school, when I had to pick my courses, I chose computer science and continued building on that interest. From there, I just kept going.
Can you tell me about your research focus and what drew you to that field?
The theme of my research group is to do core computer science work while also supporting other scientific domains. That idea comes from my Ph.D. experience, where there was a strong emphasis on helping other fields, such as chemistry and biophysics.
Since I work in HPC, we collaborate with researchers in areas such as climate science, hypersonics and computational epidemiology. These collaborations focus on optimizing computational workflows. For me, AI is similar in that it is another “science” domain that benefits from HPC. Our goal is to use large-scale HPC infrastructure to improve and scale AI methods.
What are you currently working on, and what stands out in those projects?
There are several projects in my group, but two areas highlight the connection between computer systems and AI.
One focuses on using HPC to advance AI. Large language models are trained in large data centers and on supercomputers, so we study how to improve training, fine-tuning and inference at scale. Our goal is to democratize AI by creating scalable open-source software such as AxoNN.
We are also working on scaling the training of graph neural networks. Graph data appears in many domains, including finance and biology, but training these models at scale is highly challenging. We have been developing a framework called Plexus to improve performance across many GPUs.
The second area looks at the reverse problem – using AI to improve computing systems. We are studying whether language models can generate or optimize parallel code. Writing parallel programs is complex, and debugging or improving their performance is challenging. We are exploring how AI models can assist with code generation, translation and optimization in this space.
Can you describe your lab and how it is structured?
My group, the Parallel Software and Systems Group, has eight graduate students, with three graduating this semester. We will have several new Ph.D. students joining in the fall. At any time, we also have several master’s students and undergraduate researchers.
Undergraduates are paired with graduate students, allowing them to contribute to projects while also giving graduate students mentoring experience.
What is one challenge you have encountered in your research, and how are you addressing it?
One challenge is that current language models do not perform well when generating or optimizing parallel code. A key issue is the limited amount of training data available for parallel programming, along with the context-length limits of large language models.
We are exploring ways to address this, including generating synthetic data to improve model training. We are also looking at approaches that use multiple models working together as agents on different parts of a task, rather than relying on a single large model.
How does your work connect to the broader computer science community and society?
Some of the areas we work in have direct societal relevance. For example, computational epidemiology studies how diseases spread through populations. That research can inform policy decisions about interventions such as masking or school closures.
Graph neural networks have a variety of applications, such as analyzing climate data to make future predictions or understanding the function of “dark” proteins – proteins that are not yet fully understood.
What inspired you to join the University of Maryland?
The university has a strong computer science program and offers opportunities to work with students and faculty across different areas, even outside of computer science. That environment and the fact that collaborative research is valued were important factors.
The location also played a role. The Washington, D.C., area offers many of the advantages of a large metropolitan region, including access to a variety of dining experiences and cultural and outdoor activities.
What advice would you give to students interested in this field?
There is no shortcut to hard work. Progress in research takes time and effort, and building a strong foundation is important. Consistent work is the most reliable way to move forward.
—Story by Samuel Malede Zewdu, CS Communications
The Department welcomes comments, suggestions and corrections. Send email to editor [-at-] cs [dot] umd [dot] edu.
