4 Agreement: a language of numbers between friends

It’s just a phase!

    4.1 Let’s Agree

    4.2 Abstract syntax for Agreement

    4.3 Meaning of Agreement programs

    4.4 Running an Agreement

    4.5 Compiling an Agreement

4.1 Let’s Agree

After looking at Abscond: a language of numbers, it may seem a little odd that we are building both a compiler and an interpreter. Furthermore, the difference between the two may not be immediately clear on a first encounter with both.

When implementing a language there are many reasons for choosing to implement an interpreter or a compiler or, increasingly, both.1

1 In fact, many modern language implementations have a compiler as part of their interpreter; this is known as JIT compilation.

4.2 Abstract syntax for Agreement

Abscond was a simple language with the following grammar:

; type expr = integer

An Agreement program, like an Abscond program, consists of a single expression, and the grammar of expressions introduces one new concept:

; type expr = integer
;           | get-int

So 0, 120, and -42 are programs, just as in Abscond, but so is get-int. get-int is a primitive (something that we, the language implementers, provide in the language) that retrieves an integer from standard input (i.e. the user provides it).

Now that our programs can take more than one shape (barely!) it’s a good time to make our Abstract Syntax Tree explicit:

agreement/ast.rkt

  #lang racket
  (provide (all-defined-out))
   
  ; type Expr = integer
  ;          | get-int
  (struct int-e (i))
  (struct get-i ())
   
  (define (print-ast a)
    (match a
      [(int-e i) `(int-e ,i)]
      [(get-i)    'get-i]))
   

We are using Racket’s struct feature to provide the nodes of our AST. The int-e (pronounced ‘int expression’) struct takes a single argument, the integer value, while the get-i node takes no arguments, as we don’t know what value it will represent.
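The parser in parse.rkt is used throughout these notes but is not shown. As a minimal sketch, assuming it consumes an S-expression that has already been produced by read, it could map integers to int-e nodes and the symbol get-int to a get-i node. The body of parse below is an assumption made for illustration, not the actual parse.rkt; the structs and print-ast are repeated from ast.rkt only so that the example stands alone.

```racket
#lang racket
;; Repeated from ast.rkt so that this sketch is self-contained.
(struct int-e (i))
(struct get-i ())

(define (print-ast a)
  (match a
    [(int-e i) `(int-e ,i)]
    [(get-i)   'get-i]))

;; S-Expr -> Expr
;; Hypothetical parser: the real parse.rkt is not shown in these notes.
(define (parse s)
  (match s
    [(? exact-integer? i) (int-e i)]
    ['get-int             (get-i)]
    [_ (error 'parse "not an Agreement program: ~s" s)]))

(print-ast (parse 42))        ; '(int-e 42)
(print-ast (parse 'get-int))  ; 'get-i
```

Anything that is neither an integer nor the symbol get-int is rejected, which mirrors the two alternatives in the grammar.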

4.3 Meaning of Agreement programs

We can write an “interpreter” that consumes an expression and produces its meaning:

agreement/interp.rkt

  #lang racket
  (provide (all-defined-out))
   
  (require "primitives.rkt")
  (require "ast.rkt")
   
  ; Expr -> Integer
  ; Interpret given expression
  ;
  (define (interp p)
    (match p
      [(int-e i) i]
      [(get-i)   (get-int)]))
   

Notice that the meaning of get-int is ‘just’ the call to the Racket function get-int. Understanding how that function is implemented is not important for our purposes and we can treat it as a black box.

Examples:
> (interp (parse 42))

42

> (interp (parse -8))

-8
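The examples above only use integer literals. To try get-int in the same way, we can supply standard input from a string using with-input-from-string. The sketch below assumes a get-int that simply calls read on the current input port; the real primitives.rkt is treated as a black box in these notes, so this definition is a hypothetical stand-in. The structs and interp are repeated so that the example stands alone.

```racket
#lang racket
;; Repeated from ast.rkt so that this sketch is self-contained.
(struct int-e (i))
(struct get-i ())

;; -> Integer
;; Hypothetical stand-in for the get-int provided by primitives.rkt.
(define (get-int)
  (let ([v (read)])
    (if (exact-integer? v)
        v
        (error 'get-int "expected an integer, got ~s" v))))

;; Expr -> Integer
(define (interp p)
  (match p
    [(int-e i) i]
    [(get-i)   (get-int)]))

;; The same program, run on two different inputs.
(with-input-from-string "1024" (λ () (interp (get-i))))  ; 1024
(with-input-from-string "2048" (λ () (interp (get-i))))  ; 2048
;; An integer literal ignores its input.
(with-input-from-string "7" (λ () (interp (int-e 42))))  ; 42
```

The same get-i node evaluates to a different integer for each input, whereas an int-e node always evaluates to the integer it carries.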

Earlier I mentioned that the compiler and interpreter differ in how things run. Notice that for the above, we need a working Racket system in order to run the programs through the interpreter and get a result. Soon we will contrast this with how we run the program that the compiler produces.

We can add a command line wrapper program for interpreting Agreement programs saved in files:

agreement/interp-file.rkt

  #lang racket
  (provide (all-defined-out))
  (require "interp.rkt")
  (require "parse.rkt")
   
  ;; String -> Void
  ;; Parse and interpret contents of given filename,
  ;; print result on stdout
  (define (main fn)
    (define prog
      (with-input-from-file fn
        (λ ()
          (writeln "Interpreting..." (current-output-port))
          (let ((c (read-line)) ; ignore #lang racket line
                (p (read)))
            (parse p)))))
    (interp prog))
   

For example:

shell

> echo '#lang racket\nget-int' > example.agr
> echo 1024 | racket -t interp-file.rkt -m example.agr
"Interpreting..."
1024
> echo 2048 | racket -t interp-file.rkt -m example.agr
"Interpreting..."
2048
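The wrapper skips the #lang racket line with read-line and then uses read to obtain the program as an S-expression. That reading step can be tried in isolation. The helper read-program below is a hypothetical name introduced for this sketch; it reads from a string rather than a file so that the example is self-contained.

```racket
#lang racket
;; String -> S-Expr
;; Hypothetical helper: read a program the way main does,
;; but from a string instead of a file.
(define (read-program str)
  (with-input-from-string str
    (λ ()
      (read-line) ; skip the #lang racket line
      (read))))   ; the program itself, as an S-expression

(read-program "#lang racket\nget-int") ; 'get-int
(read-program "#lang racket\n42")      ; 42
```

The result is a symbol or a number, which is the kind of value that parse then turns into an AST node.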

A few observations

Unlike Abscond, we will not be providing an Operational Semantics for Agreement, as to do so fully would require some more advanced techniques that are more appropriate for a later stage of the course.

4.4 Running an Agreement

Unlike in our interpreter, we cannot use a Racket function for get-int. This is because Racket functions are not available to use when we execute our assembly programs. So instead we expand our runtime system to provide us with this functionality:

agreement/main.c

#include <stdio.h>
#include <inttypes.h>

int64_t entry();

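int main(int argc, char** argv) {
  int64_t result = entry();
  printf("result: %" PRId64 "\n", result);
  return 0;
}

int64_t get_int() {
  int64_t x = 0; // returned unchanged if no integer can be read
  scanf("%" SCNd64, &x); // SCNd64 is the scanf counterpart of PRId64
  return x;
}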

The other side of this coin is that we do not need Racket in order to run our programs, only to compile them.

In Agreement, the runtime system calls the entry function and prints the result as before, but now it also provides us with the implementation of the get-int primitive operation.

4.5 Compiling an Agreement

The distinction between the compiler for Abscond and for Agreement is not important for this stage in the course, and as such we omit the details here. Later in the semester we will cover the techniques used in compiling this primitive operation.

That said, we can still observe some important points from running some examples. Let’s start with compiling the same program we interpreted.

shell

> echo '#lang racket\nget-int' > example.rkt
> make example.run
make[1]: Entering directory `/home/travis/build/cmsc430/www/www/notes/agreement'
racket -t compile-file.rkt -m example.rkt > example.s
nasm -f elf64 -o example.o example.s
gcc main.o example.o -o example.run
rm example.o example.s
make[1]: Leaving directory `/home/travis/build/cmsc430/www/www/notes/agreement'
"Compiling..."

Notice that we now see “Compiling...” when our compiler is run on the input program. Let’s run the compiled program now:

shell

> echo 1024 | ./example.run
result: 1024
> echo 2028 | ./example.run
result: 2028

Even though we run the program multiple times, we only see “Compiling...” when we run our compiler. This helps us illustrate an important aspect of compilation: for the most part, you compile a program once, but can run it many times. In our case, because the target language is x86_64, we never need Racket to be installed in order to run our programs. This means that we can compile our programs on a system with a Racket environment, but run the resulting program on any system that is compatible with the executable format we get.

This is very different from how our interpreter works. In order to run a program with our interpreter, we need Racket!

Below we have the rest of the necessary files for Agreement, in case you’re interested:

agreement/asm/printer.rkt

  #lang racket
  (provide (all-defined-out))
   
  ;; Asm -> String
  (define (asm->string a)
    (foldr (λ (i s) (string-append (instr->string i) s)) "" a))
   
  ;; Instruction -> String
  (define (instr->string i)
    (match i
      [`(mov ,a1 ,a2)
       (string-append "\tmov " (arg->string a1) ", " (arg->string a2) "\n")]
      [`(call ,ad)
       (string-append "\tcall " (label->string ad) "\n")]
      [`(and ,a1 ,a2)
       (string-append "\tand " (arg->string a1) ", " (arg->string a2) "\n")]
      [`ret "\tret\n"]
      [l (string-append (label->string l) ":\n")]))
   
  ;; Arg -> String
  (define (arg->string a)
    (match a
      [`rax "rax"]
      [`rsp "rsp"]
      [`r15 "r15"]
      [n (number->string n)]))
   
  ;; Label -> String
  ;; prefix with _ for Mac
  (define label->string
    (match (system-type 'os)
      ['macosx
       (λ (s) (string-append "_" (symbol->string s)))]
      [_ symbol->string]))
   
  ;; Asm -> Void
  (define (asm-display a)
    ;; entry point will be first label
    (let ((g (findf symbol? a)))
      (display 
        (string-append "\tglobal " (label->string g) "\n"
              "\textern " (label->string 'get_int) "\n"
              "\tsection .text\n"
                       (asm->string a)))))
   

agreement/compile-file.rkt

  #lang racket
  (provide (all-defined-out))
  (require "compile.rkt" "asm/printer.rkt" "parse.rkt")
   
  ;; String -> Void
  ;; Compile contents of given file name,
  ;; emit asm code on stdout
  (define (main fn)
    (with-input-from-file fn
      (λ ()
        (writeln "Compiling..." (current-error-port))
        (let ((c (read-line)) ; ignore #lang racket line
              (p (read)))
          (asm-display (compile (parse p)))))))
   

agreement/Makefile

UNAME := $(shell uname)
.PHONY: test
.PHONY: clean

ifeq ($(UNAME), Darwin)
  format=macho64
else
  format=elf64
endif

%.run: %.o main.o
    gcc main.o $< -o $@

main.o: main.c
    gcc -c main.c -o main.o

%.o: %.s
    nasm -f $(format) -o $@ $<

%.s: %.rkt
    racket -t compile-file.rkt -m $< > $@

clean:
    rm *.o *.s *.run

test: 42.run
    @test "$(shell ./42.run)" = "42"