I work on program synthesis within the context of an algorithmic synthetic biology project on microbial chemical factories. My expertise is in formal methods, specifically program synthesis and program verification. My dissertation explored SAT/SMT-based techniques for program synthesis and verification, while my postdoc work has been directed towards bringing these techniques to synthetic biology, as well as systems biology.
I work at UC Berkeley now, and collaborate with Ras Bodik, Chris J. Anderson (Bioengineering), and Sanjit Seshia. Earlier, I finished my Ph.D. in programming languages at the University of Maryland, College Park, advised by Jeffrey S. Foster, Sumit Gulwani, and Michael W. Hicks.
Connect on LinkedIn
Synthetic biology is the relatively new discipline of genetic modification of (primarily) bacteria and yeast to endow the organism with rationally designed unnatural function. Algorithmic techniques to reason about such modifications are virtually non-existent as of the moment. In (what I like to call) algorthmic synthetic biology, we are building prediction and verification tools for directing experimental work.
My current focus is on building a synthesizer that can predict genetic modifications that will allow organisms to excrete 1-Butanol (i.e., biofuels), Tylenol (i.e., drugs), or Nylon (i.e., polymers). Additionally, we are building a verifier that will validate the correctness (both syntactic, at the DNA level; and semantic, at the protein function level) of synthetic DNA.
Using predictions from our tool, we have already created a living E. coli strain that produces paracetamol (Tylenol).
My research on programming languages focuses on foundational techniques that facilitate program understanding, improve software reliability, and in general, help in building functionally correct software. My techniques for semi-automatic program synthesis and program verification involve reducing the problem to SMT solving.
My PhD dissertation was on program verification and program synthesis using SAT- and SMT-solvers. (The official title being Satisfiability-based Program Reasoning and Program Synthesis.)
Act Synthesizer: Enzymatic Pathway Synthesizer (Act Synthesizer Webpage)
Consider the task of generating a particular chemical of interest, e.g., an antibacterial drug or a biofuel. Either it can be done through traditional, expensive, chemical synthesis processes or it can be biosynthesized. While most chemicals are not naturally synthesized, new PCR and DNA synthesis technology allows experimentalists to transform E. coli or yeast with DNA that allows the organism to do complicated chemistry using new inserted enzymes (produced from the inserted DNA.)
Our task in this project is to generate operators from existing biochemical data, which are then the core operators over which a program is synthesized that when inserted into the E. coli will make it a "chemical microbial factory".
Model synthesis for systems biology
I work with two great students, Ali Sinan Koksal (Ph.D) and Evan Pu (ugrad), on this project. Here we are developing an semi-automatic model synthesizer that takes as input experimental data, and hints about the biological process to synthesize a formally executable/checkable model of the biological process.
Synthetic Biology Verfier (Syntax Verifier Tool)
Current synthetic biology is built around discover-(build-test)* cycles. Our Act Synthesizer project aims to automate the discover step. The build-test cycles are expensive wet-lab processes that sometimes fail due to bad sequence data and lack of understanding of molecular interactions between components. In our verifier project we take a candidate sequence that the experimentalist intends to build in the lab, and run it through a verification process that infers the likelihood of the sequence succeeding.
VS3: Verification and Synthesis using SMT Solvers (VS3 Project Webpage)
We are developing automatic program synthesis and program verification tools in this project. Using techniques described in my thesis we reduce synthesis and verification problems to SMT/SAT instances and use off-the-shelf SMT solvers to solve them and derive the program (when synthesizing) or invariants (when verifying).
In this project we are studying a constraint-based approach to program
analysis. Using the techniques developed in this project, we reduce
program properties to
constraints that can be efficiently solved
using off-the-shelf SMT Solvers, yielding efficient and expressive
program analyses. In contrast to previous approaches that uses
theorem proving to validate we have shown how theorem proving can be
used to instead infer program properties; and with a little bit
more work synthesize the program itself!
CMod: Enforcing modularity in C Code ('05-'07) (CMod Project Webpage)
CMod is a tool written to enforce a modular programming style in C code, to ensure that modules, when linked together, yield a type correct program. By enforcing a few non-intrusive rules, CMod lifts the idiomatic notion of modules in C to the level of a formally sound separate compilation and linking system. We have shown the viability of the system over a million lines of open-source C code.
Resource Allocation in CDMA based Ad hoc Networks ('01-'03)
In this project we developed techniques for more optimally using the wireless bandwidth available by making the MAC layer aware of the underlying CDMA physical layer. Better channel utilization is achieved by integrating information from both layers at the cost of losing the clean separation. Almost all wireless cards have hardwired physical and MAC layers, and therefore this loss of modularity is acceptable.
Act Ontology and Synthesizer
Algorithmic Synthetic Biology
Program Committee and Reviewer
Jay D Sau/Postdoc/Physics