10 Extort: when errors exist
The greatest mistake is to imagine that we never err.
10.1 Errors
We have added multiple, disjoint types, but mostly swept issues of errors under the rug by considering type mismatches as meaningless. Now let’s redesign the semantics to specify the error behavior of such programs.
We’ll call it Extort.
Nothing changes in the syntax of Extort from Dupe, although we will need to talk about two kinds of results from evaluating programs: values and errors. We will say that evaluation produces an answer, which is either a value or error:
10.2 Meaning of Extort programs
Languages adopt several approaches to type mismatches:
Prohibit such programs statically with a type system (e.g. OCaml, Java)
Coerce values to different types (e.g. JavaScript)
Signal a run-time error (e.g. Racket)
Leave the behavior unspecified (e.g. Scheme, C)
We’ve previously seen the last approach. Now let’s do what Racket does and signal an error.
The meaning of Extort programs that have type errors will now be defined as 'err:
Now what does the semantics say about (add1 #f)? What about (if 7 #t -2)?
The signature of the interpreter is extended to produce answers. Each use of a Racket primitive is guarded by checking the type of the arguments and an error is produced if the check fails. Errors are also propagated when a subexpression produces an error:
#lang racket (provide interp) (require "ast.rkt" "interp-prim.rkt") ;; type Answer = Value | 'err ;; type Value = ;; | Integer ;; | Boolean ;; | Character ;; | Eof ;; | Void ;; Expr -> Answer (define (interp e) (match e [(Int i) i] [(Bool b) b] [(Char c) c] [(Eof) eof] [(Prim0 p) (interp-prim0 p)] [(Prim1 p e0) (match (interp e0) ['err 'err] [v (interp-prim1 p v)])] [(If e1 e2 e3) (match (interp e1) ['err 'err] [v (if v (interp e2) (interp e3))])] [(Begin e1 e2) (match (interp e1) ['err 'err] [_ (interp e2)])]))
#lang racket (provide interp-prim0 interp-prim1) ;; Op0 -> Answer (define (interp-prim0 op) (match op ['read-byte (read-byte)] ['peek-byte (peek-byte)] ['void (void)])) ;; Op1 Value -> Answer (define (interp-prim1 op v) (match op ['add1 (if (integer? v) (add1 v) 'err)] ['sub1 (if (integer? v) (sub1 v) 'err)] ['zero? (if (integer? v) (zero? v) 'err)] ['char? (char? v)] ['char->integer (if (char? v) (char->integer v) 'err)] ['integer->char (if (codepoint? v) (integer->char v) 'err)] ['eof-object? (eof-object? v)] ['write-byte (if (byte? v) (write-byte v) 'err)])) ;; Any -> Boolean (define (codepoint? v) (and (integer? v) (or (<= 0 v 55295) (<= 57344 v 1114111))))
We can confirm the interpreter computes the right result for the examples given earlier:
Examples
> (interp (Prim1 'add1 (Bool #f))) 'err
> (interp (Prim1 'zero? (Bool #t))) 'err
> (interp (If (Prim1 'zero? (Bool #f)) (Int 1) (Int 2))) 'err
The statement of correctness stays the same, but now observe that there is no way to crash the interpreter with any Expr value.
10.3 A Compiler for Extort
Suppose we want to compile (add1 #f), what needs to happen? Just as in the interpreter, we need to check the integerness of the argument’s value before doing the addition operation.
We extend the run-time system with a C function called error that prints "err" and exits:
#include <stdio.h> #include <inttypes.h> #include <stdlib.h> #include "types.h" #include "runtime.h" void print_result(int64_t); void print_char(int64_t); void error_exit() { "err\n"); printf(1); exit( } void raise_error() { return error_handler(); } int main(int argc, char** argv) { in = stdin; out = stdout; error_handler = &error_exit; print_result(entry());return 0; } void print_result(int64_t result) { if (int_type_tag == (int_type_mask & result)) { "%" PRId64 "\n", result >> int_shift); printf(else if (char_type_tag == (char_type_mask & result)) { } print_char(result);"\n"); printf(else { } switch (result) { case val_true: "#t\n"); break; printf(case val_false: "#f\n"); break; printf(case val_eof: "#<eof>\n"); break; printf(case val_void: /* nothing */ break; } } }
The compiler now emits code to check the type of arguments:
#lang racket (provide (all-defined-out)) (require "ast.rkt" "types.rkt" a86/ast) ;; Registers used (define rax 'rax) (define rbx 'rbx) (define rsp 'rsp) (define rdi 'rdi) ;; Expr -> Asm (define (compile e) (prog (Extern 'peek_byte) (Extern 'read_byte) (Extern 'write_byte) (Extern 'raise_error) (Label 'entry) (Sub rsp 8) (compile-e e) (Add rsp 8) (Ret) ;; Error handler (Label 'err) (Call 'raise_error))) ;; Expr -> Asm (define (compile-e e) (match e [(Int i) (compile-value i)] [(Bool b) (compile-value b)] [(Char c) (compile-value c)] [(Eof) (compile-value eof)] [(Prim0 p) (compile-prim0 p)] [(Prim1 p e) (compile-prim1 p e)] [(If e1 e2 e3) (compile-if e1 e2 e3)] [(Begin e1 e2) (compile-begin e1 e2)])) ;; Value -> Asm (define (compile-value v) (seq (Mov rax (value->bits v)))) ;; Op0 -> Asm (define (compile-prim0 p) (match p ['void (seq (Mov rax val-void))] ['read-byte (seq (Call 'read_byte))] ['peek-byte (seq (Call 'peek_byte))])) ;; Op1 Expr -> Asm (define (compile-prim1 p e) (seq (compile-e e) (compile-op1 p))) ;; Op1 -> Asm (define (compile-op1 p) (match p ['add1 (seq assert-integer (Add rax (value->bits 1)))] ['sub1 (seq assert-integer (Sub rax (value->bits 1)))] ['zero? (let ((l1 (gensym))) (seq assert-integer (Cmp rax 0) (Mov rax val-true) (Je l1) (Mov rax val-false) (Label l1)))] ['char? (let ((l1 (gensym))) (seq (And rax mask-char) (Xor rax type-char) (Cmp rax 0) (Mov rax val-true) (Je l1) (Mov rax val-false) (Label l1)))] ['char->integer (seq assert-char (Sar rax char-shift) (Sal rax int-shift))] ['integer->char (seq assert-codepoint (Sar rax int-shift) (Sal rax char-shift) (Xor rax type-char))] ['eof-object? (let ((l1 (gensym))) (seq (Cmp rax val-eof) (Mov rax val-true) (Je l1) (Mov rax val-false) (Label l1)))] ['write-byte (seq assert-byte (Mov rdi rax) (Call 'write_byte) (Mov rax val-void))])) ;; Expr Expr Expr -> Asm (define (compile-if e1 e2 e3) (let ((l1 (gensym 'if)) (l2 (gensym 'if))) (seq (compile-e e1) (Cmp rax val-false) (Je l1) (compile-e e2) (Jmp l2) (Label l1) (compile-e e3) (Label l2)))) ;; Expr Expr -> Asm (define (compile-begin e1 e2) (seq (compile-e e1) (compile-e e2))) (define (assert-type mask type) (seq (Mov rbx rax) (And rbx mask) (Cmp rbx type) (Jne 'err))) (define assert-integer (assert-type mask-int type-int)) (define assert-char (assert-type mask-char type-char)) (define assert-codepoint (let ((ok (gensym))) (seq assert-integer (Cmp rax (value->bits 0)) (Jl 'err) (Cmp rax (value->bits 1114111)) (Jg 'err) (Cmp rax (value->bits 55295)) (Jl ok) (Cmp rax (value->bits 57344)) (Jg ok) (Jmp 'err) (Label ok)))) (define assert-byte (seq assert-integer (Cmp rax (value->bits 0)) (Jl 'err) (Cmp rax (value->bits 255)) (Jg 'err)))
Examples
> (displayln (asm-string (compile (Prim1 'add1 (Bool #f)))))
global entry
default rel
section .text
extern peek_byte
extern read_byte
extern write_byte
extern raise_error
entry:
sub rsp, 8
mov rax, 7
mov rbx, rax
and rbx, 1
cmp rbx, 0
jne err
add rax, 2
add rsp, 8
ret
err:
call raise_error
Examples
> (define (tell e) (match (asm-interp (compile (parse e))) ['err 'err] [b (bits->value b)])) > (tell #t) #t
> (tell #f) #f
> (tell '(zero? 0)) #t
> (tell '(zero? -7)) #f
> (tell '(if #t 1 2)) 1
> (tell '(if #f 1 2)) 2
> (tell '(if (zero? 0) (if (zero? 0) 8 9) 2)) 8
> (tell '(if (zero? (if (zero? 2) 1 0)) 4 5)) 4
> (tell '(add1 #t)) 'err
> (tell '(sub1 (add1 #f))) 'err
> (tell '(if (zero? #t) 1 2)) 'err
Since the interpreter and compiler have well defined specifications for what should happen when type errors occur, we can test in the usual way again:
Examples
> (define (check-correctness e) (check-equal? (match (asm-interp (compile e)) ['err 'err] [b (bits->value b)]) (interp e) e)) > (check-correctness (Prim1 'add1 (Int 7))) > (check-correctness (Prim1 'add1 (Bool #f)))