8.17

19 Interpreters🔗

This chapter is adapted from Formal Reasoning About Programs, Chapter 4: Semantics via Interpreters, by Adam Chlipala. Because we are using only two chapters from the original work, we have modified it to remove dependencies on the FRAP libraries. However, the license of the original book does not permit the creation or distribution of modified versions. Accordingly, please do not distribute this material.

We begin with a return to our arithmetic language from BasicSyntax, adding subtraction, which will come in handy later.

Rocq

From Stdlib Require Import List Arith Lia String.

Notation var := string.
Import ListNotations.
Open Scope string_scope.

Inductive arith : Type :=
| Const (n : nat)
| Var (x : var)
| Plus (e1 e2 : arith)
| Minus (e1 e2 : arith)
| Times (e1 e2 : arith).

Example ex1 := Const 42.
Example ex2 := Plus (Var "y") (Times (Var "x") (Const 3)).

The above definition only explains what programs look like. We also care about what they mean.

The natural meaning of an expression is the number it evaluates to. Actually, it’s not quite that simple.

We need to consider the meaning to be a function over a valuation to the variables. valuation is a function from variable names to numbers.

Rocq

Definition valuation := var -> option nat.

Definition empty : valuation := fun _ => None.

Definition lookup (x:var) (v:valuation) : option nat := v x.

Definition update (x:var) (n:nat) (v:valuation) : valuation :=
  fun y =>
    match string_dec x y with
        | left H => Some n
        | right H' => lookup y v
    end.

The function update defines a function that updates an environment by assigning a new value to a variable.
  • x : var — the variable name you want to update

  • n : nat — the value you want to assign

  • v : valuation — the existing environment (a mapping from variables to optional nat values)

  • Result: a new environment valuation that behaves like v, except that x now maps to Some n.

In typical functional programming, this is like:

env'[x ↦ n]

For example:
  • update "x" 10 empty returns a function that takes an input variable y and decides:
    • If y = "x", return Some n — the updated value.

    • If y ≠ "x", return whatever the old environment v had for y.

    In OCaml, the function looks like this:
    
      fun y-> if y = "x" then Some 10 else empty 

  • update "y" 20 (update "x" 10 empty) returns:

    
      fun y-> if y = "y" then 
                Some 20 
              else 
                lookup "x" (fun y-> if y = "x" then Some 10 else empty)

That is, the domain is [var] (a synonym for [string]) and the codomain/range is [nat].

The interpreter is a fairly innocuous-looking recursive function.

Rocq

Fixpoint interp (e : arith) (v :valuation) : nat :=
  match e with
  | Const n => n
  | Var x =>
    match v x with
    | None => 0 (* goofy default value! *)
    | Some n => n
    end
  | Plus e1 e2 => interp e1 v + interp e2 v
  | Minus e1 e2 => interp e1 v - interp e2 v
  | Times e1 e2 => interp e1 v * interp e2 v
  end.

Here’s an example environment.

Rocq

Definition valuation0 :valuation :=
  update "y" 3 (update "x" 17 empty).

Rocq

Theorem interp_ex1 : interp ex1valuation0 = 42.
Proof.
  simpl.
  trivial.
Qed.

Theorem interp_ex2 : interp ex2 valuation0 = 54.
Proof.
  unfold valuation0.
  simpl. trivial.
Qed.

Here’s the silly transformation we defined last time.

Rocq

Fixpoint commuter (e : arith) : arith :=
  match e with
  | Const _ => e
  | Var _ => e
  | Plus e1 e2 => Plus (commuter e2) (commuter e1)
  | Minus e1 e2 => Minus (commuter e1) (commuter e2)
                   (* ^-- NB: didn't change the operand order here! *)
  | Times e1 e2 => Times (commuter e2) (commuter e1)
  end.

Instead of proving various odds-and-ends properties about it, let’s show what we *really* care about: it preserves the meanings of expressions!

Rocq

Theorem commuter_ok : forall v e, interp (commuter e) v = interp e v.
Proof.
  induction e; simpl.

  - trivial.
  - lia.
  - rewrite IHe1, IHe2. lia.
  - rewrite IHe1, IHe2. lia.
  - rewrite IHe1, IHe2. lia.
Qed.

Let’s also revisit substitution.

Rocq

Fixpoint substitute (inThis : arith) (replaceThis : var) (withThis : arith) : arith :=
  match inThis with
  | Const _ => inThis
  | Var x => if (string_dec x replaceThis) then withThis else inThis
  | Plus e1 e2 => Plus (substitute e1 replaceThis withThis) (substitute e2 replaceThis withThis)
  | Minus e1 e2 => Minus (substitute e1 replaceThis withThis) (substitute e2 replaceThis withThis)
  | Times e1 e2 => Times (substitute e1 replaceThis withThis) (substitute e2 replaceThis withThis)
  end.

Theorem substitute_ok : forall v replaceThis withThis inThis,
  interp (substitute inThis replaceThis withThis) v
  = interp inThis (update replaceThis (interp withThis v) v).
Proof.
  induction inThis.
  - simpl. trivial.
  - simpl. unfold update, kookup.
    destruct (string_dec replaceThis x).
    + destruct (string_dec x replaceThis).
      * simpl. congruence.
      * simpl. congruence.
    + destruct (string_dec x replaceThis).
      * simpl. congruence.
      * simpl. congruence.
  - simpl. congruence.
  - simpl. congruence.
  - simpl. congruence.

Qed.

Theorem substitute_ok' : forall v replaceThis withThis inThis,
  interp (substitute inThis replaceThis withThis) v
  = interp inThis (update replaceThis (interp withThis v) v).
Proof.
  induction inThis; simpl;
   try (unfold update, lookup;
    destruct (string_dec replaceThis x);
     destruct (string_dec x replaceThis); simpl); congruence.
Qed.

Let’s also define a pared-down version of the expression-simplification functions from last chapter.

Rocq

Fixpoint doSomeArithmetic (e : arith) : arith :=
  match e with
  | Const _ => e
  | Var _ => e
  | Plus (Const n1) (Const n2) => Const (n1 + n2)
  | Plus e1 e2 => Plus (doSomeArithmetic e1) (doSomeArithmetic e2)
  | Minus e1 e2 => Minus (doSomeArithmetic e1) (doSomeArithmetic e2)
  | Times (Const n1) (Const n2) => Const (n1 * n2)
  | Times e1 e2 => Times (doSomeArithmetic e1) (doSomeArithmetic e2)
  end.

Compute (doSomeArithmetic (Plus (Const 10) (Const 20))).

Theorem doSomeArithmetic_ok : forall e v, interp (doSomeArithmetic e) v = interp e v.
Proof.
  induction e.
  - simpl. congruence.
  -  simpl. congruence.
  - destruct e1, e2; simpl in *; try congruence.
  - destruct e1, e2; simpl in *; try congruence.
  - destruct e1, e2; simpl in *; try congruence.
Qed.

Of course, we’re going to get bored if we confine ourselves to arithmetic expressions for the rest of our journey. Let’s get a bit fancier and define a *stack machine*, related to postfix calculators that some of you may have experienced.

Rocq

Inductive instruction :=
| PushConst (n : nat)
| PushVar (x : var)
| Add
| Subtract
| Multiply.

What does it all mean? An interpreter tells us unambiguously!

Rocq

Definition run1 (i : instruction) (v :valuation) (stack : list nat) : list nat :=
  match i with
  | PushConst n => n :: stack
  | PushVar x => (match lookup x v with
                  | None => 0
                  | Some n => n
                  end) :: stack
  | Add =>
    match stack with
    | arg2 :: arg1 :: stack' => arg1 + arg2 :: stack'
    | _ => stack (* arbitrary behavior in erroneous case (stack underflow) *)
    end
  | Subtract =>
    match stack with
    | arg2 :: arg1 :: stack' => arg1 - arg2 :: stack'
    | _ => stack (* arbitrary behavior in erroneous case *)
    end
  | Multiply =>
    match stack with
    | arg2 :: arg1 :: stack' => arg1 * arg2 :: stack'
    | _ => stack (* arbitrary behavior in erroneous case *)
    end
  end.

That function explained how to run one instruction. Here’s how to run several of them.

Rocq

Fixpoint run (is : list instruction) (v : valuation) (stack : list nat) : list nat :=
  match is with
  | nil => stack
  | i :: is' => run is' v (run1 i v stack)
  end.

Instead of writing fiddly stack programs ourselves, let’s *compile* arithmetic expressions into equivalent stack programs.

Rocq

Fixpoint compile (e : arith) : list instruction :=
  match e with
  | Const n => PushConst n :: nil
  | Var x => PushVar x :: nil
  | Plus e1 e2 => compile e1 ++ compile e2 ++ Add :: nil
  | Minus e1 e2 => compile e1 ++ compile e2 ++ Subtract :: nil
  | Times e1 e2 => compile e1 ++ compile e2 ++ Multiply :: nil
  end.

Now, of course, we should prove our compiler correct. Skip down to the next theorem to see the overall correctness statement. It turns out that we need to strengthen the induction hypothesis with a lemma, to push the proof through.

Rocq

Lemma compile_ok' : forall e v is stack,
    run (compile e ++ is) v stack = run is v (interp e v :: stack).
Proof.
  induction e; intros; simpl.
  - trivial.
  - trivial.
  (* Here we want to use associativity of [++], to get the conclusion to match
   * an induction hypothesis.  Let's ask Rocq to search its library for lemmas
   * that would justify such a rewrite, giving a pattern with wildcards, to
   * specify the essential structure that the rewrite should match. *)
  Search ((_ ++ _) ++ _).
  (* Ah, we see just the one! *)
  - rewrite <- app_assoc.
    rewrite IHe1.
    rewrite <- app_assoc.
    rewrite IHe2.
    simpl.
    trivial.

- rewrite <- app_assoc.
  rewrite IHe1.
  rewrite <- app_assoc.
  rewrite IHe2.
  simpl.
  trivial.

- rewrite <- app_assoc.
  rewrite IHe1.
  rewrite <- app_assoc.
  rewrite IHe2.
  simpl.
  trivial.
Qed.

The overall theorem follows as a simple corollary.

Rocq

Theorem compile_ok : forall e v, run (compile e) v nil = interp e v :: nil.
Proof.
  intros. simpl.

  (* To match the form of our lemma, we need to replace [compile e] with
   * [compile e ++ nil], adding a "pointless" concatenation of the empty list.
   * [Search] again helps us find a library lemma. *)
  Search (_ ++ nil).
  rewrite <- (app_nil_r (compile e)).
  (* Note that we can use [rewrite] with explicit values of the first few
   * quantified variables of a lemma.  Otherwise, [rewrite] picks an
   * unhelpful place to rewrite.  (Try it and see!) *)

  apply compile_ok'.
  (* Direct appeal to a previously proved lemma *)
Qed.

Let’s get a bit fancier, moving toward the level of general-purpose imperative languages. Here’s a language of commands, building on the language of expressions we have defined.

Rocq

Inductive cmd :=
| Skip
| Assign (x : var) (e : arith)
| Sequence (c1 c2 : cmd)
| Repeat (e : arith) (body : cmd).

That last constructor is for repeating a body command some number of times. Note that we sneakily avoid constructs that could introduce nontermination, since Rocq only accepts terminating programs, and we want to write an interpreter for commands. In contrast to our last one, this interpreter *transforms valuations*. We use a helper function for self-composing a function some number of times.

Rocq

Fixpoint selfCompose {A} (f : A -> A) (n : nat) : A -> A :=
  match n with
  | O => fun x => x
  | S n' => fun x => selfCompose f n' (f x)
  end.

Fixpoint exec (c : cmd) (v : valuation) : valuation :=
  match c with
  | Skip => v
  | Assign x e => update x (interp e v) v
  | Sequence c1 c2 => exec c2 (exec c1 v)
  | Repeat e body => selfCompose (exec body) (interp e v) v
  end.

One last example: let’s try to do loop unrolling, for constant iteration counts. That is, we can duplicate the loop body instead of using an explicit loop.

Rocq

Fixpoint seqself (c : cmd) (n : nat) : cmd :=
  match n with
  | O => Skip
  | S n' => Sequence c (seqself c n')
  end.

Fixpoint unroll (c : cmd) : cmd :=
  match c with
  | Skip => c
  | Assign _ _ => c
  | Sequence c1 c2 => Sequence (unroll c1) (unroll c2)
  | Repeat (Const n) c1 => seqself (unroll c1) n
  (* ^-- the crucial case! *)
  | Repeat e c1 => Repeat e (unroll c1)
  end.

This obvious-sounding fact will come in handy: self-composition gives the same result, when passed two functions that map equal inputs to equal outputs.

Rocq

Lemma selfCompose_extensional : forall {A} (f g : A -> A) n x,
  (forall y, f y = g y)
  -> selfCompose f n x = selfCompose g n x.
Proof.
  induction n.
  - simpl. intros. reflexivity.
  - intros. simpl. rewrite H.  apply IHn. assumption.
Qed.

Crucial lemma: [seqself] is acting just like [selfCompose], in a suitable sense.

Rocq

Lemma seqself_ok : forall c n v,
  exec (seqself c n) v = selfCompose (exec c) n v.
Proof.
  induction n; simpl; congruence.
Qed.

The two lemmas we just proved are the main ingredients to prove the natural correctness condition for [unroll].

Rocq

Theorem unroll_ok : forall c v, exec (unroll c) v = exec c v.
Proof.
  induction c; simpl; try reflexivity.
  - congruence.
  - intros. destruct e; simpl; try congruence.
   + rewrite seqself_ok.
  apply selfCompose_extensional.
  trivial.

  + apply selfCompose_extensional.
  trivial.

  + apply selfCompose_extensional.
  trivial.

  + apply selfCompose_extensional.
  trivial.

  + apply selfCompose_extensional.
  trivial.
Qed.