On this page:
Symbols
Pointer equality
Interning symbols
Generating symbols
Bonus
Testing
Submitting
7.4

Assignment 7: Symbols, interning, and gensym

Due: Tues, Nov 12, 11:59PM

The goal of this assignment is to (1) implement symbols and the eq? primitive operation, (2) to implement symbol interning by program transformation.

Assignment repository:

You are given a repository with a starter compiler similar to the Loot language we studied in class.

The given code also implements all the “plus” features we’ve developed in past assignments.

Symbols

Your first task is to implement symbols for the Loot+ language. You’ve used symbols extensively throughout the semester, so their use should be familiar to you. A symbol evaluates to itself:

Examples

> 'foo

'foo

Your first task is to implement a symbol data type. The given code includes syntax checking for programs that may contain symbols and run-time support for printing symbols. The compiler has been stubbed for compiling symbols. You will need to implement compile-symbol in compile.rkt.

A symbol can be represented much like a string: as a continuous sequence of characters in memory, along with a length field. The type tag is different, since strings and symbols should be disjoint data types.

Once you implement compile-symbol, you should be able to write programs that contain symbols.

Pointer equality

Your next task is to implement the eq? primitive operation, which compares two values for pointer equality. Immediate values (characters, integers, booleans, empty list, etc.) should be pointer-equal to values that are “the same.” So for example:

Examples

> (eq? '() '())

#t

> (eq? 5 5)

#t

> (eq? #\a #\a)

#t

> (eq? #\t #\t)

#t

On the other hand, values that are allocated in memory such as boxes, pairs, procedures, etc., are only eq? to each other if they are allocated to the same location in memory. So for example, the following could all produce #f:

Examples

> (eq? (λ (x) x) (λ (x) x))

#f

> (eq? (cons 1 2) (cons 1 2))

#f

> (eq? (box 1) (box 1))

#f

However these must be produce #t:

Examples

> (let ((x (λ (x) x)))
    (eq? x x))

#t

> (let ((x (cons 1 2)))
    (eq? x x))

#t

> (let ((x (box 1)))
    (eq? x x))

#t

Applying eq? to any two values from disjoint data types should produce #f:

Examples

> (eq? 0 #f)

#f

> (eq? #\a "a")

#f

> (eq? '() #t)

#f

> (eq? 'fred "fred")

#f

The given compiler is stubbed for the eq? primitive. You must implement compile-eq?.

Interning symbols

One thing you may notice at this point is that because symbols are allocated in memory, the behavior eq? with your compiler differs from Racket’s behavior.

In Racket, two symbols which are written the same way in a given program are eq? to each other.

Examples

> (eq? 'x 'x)

#t

But your compiler will (probably) produce #f.

The problem is that Racket “interns” symbols, meaning that all occurrences of a symbol are allocated to the same memory location. (Languages like Java also do this with string literals.)

Extend your compiler so that eq? behaves correctly on symbols. Note, you should not change the way eq? works, rather you should change how symbols are handled by the compiler.

The most effective way to implement symbol interning is to apply a program transformation to the given program to compile. This transformation should replace multiple occurrences of the same symbol with a variable that is bound to that symbol, and that symbol should be allocated exactly once.

So for example,

(eq? 'fred 'fred)

could be transformed to:

(let ((x 'fred)) (eq? x x))

The latter should result in #t since the 'fred symbol is allocated exactly once.

The compiler uses a intern-symbols function, which does nothing in the given code, but should be re-defined to perform the symbol interning program transformation. Note: you probably want to define a few helper functions to make intern-symbols work.

Generating symbols

Finally, implement the gensym primitive, which generates a symbol distinct from all other symbols.

To keep things simple, you should implement the nullary version of gensym, i.e. it should take zero arguments and produce a new symbol.

The following program should always produce #f:

Examples

> (eq? (gensym) (gensym))

#f

But the following should always produce #t:

Examples

> (let ((x (gensym)))
    (eq? x x))

#t

Note: Racket’s gensym will generate a new name for a symbol, usually something like 'g123456, where each successive call to gensym will produce 'g123457, 'g123458, 'g123459, etc. Yours does not have to do this (although it’s fine if it does). All that matters is that gensym produces a symbol that is not eq? to any other symbol but itself.

Bonus

Should you find yourself having completed the assignment with time to spare, you could try implementing compile-tail-apply, which compiles uses of apply that appear in tail position. It is currently defined to use the non-tail-call code generator, which means apply does not make a proper tail call.

Keep in mind that this language, the subexpression of apply are arbitrary expressions: (apply e0 e1) and that e0 may evaluate to a closure, i.e. a function with a saved environment. Moreover, the function may have been defined to have variable arity. All of these issues will conspire to make tail calls with apply tricky to get right.

This isn’t worth any credit, but you might learn something.

Testing

You can test your code in several ways:

Note that only a small number of tests are given to you, so you should write additional test cases.

There is separate a repository for tests! When you push your code, Travis will automatically run your code against the tests. If you would like to run the tests locally, clone the following repository into the directory that contains your compiler and run raco test . to test everything:

https://github.com/cmsc430/assign07-test.git

This repository will evolve as the week goes on, but any time there’s a significant update it will be announced on Piazza.

Submitting

Pushing your local repository to github “submits” your work. We will grade the latest submission that occurs before the deadline.