Zippers

Let’s imag­ine we’re imple­mentng a text input. What kind of a data def­i­n­i­tion should we use? Here’s one option.

type Bad­Input = (String, Int)

Here we inter­pret String as the con­tent of the input. The Int is the cur­sor’s posi­tion in the input. Sup­pose now we wish to write an insert func­tion that takes a text input and inserts a char­ac­ter before the cur­sor.

bad­Put­Char :: Bad­Input -> Char -> Bad­Input
bad­Put­Char (xs, 0) c = (c : xs, 1)
bad­Put­Char (x : xt, k) c =
  (x : xt', k' + 1)
  where (xt', k') = bad­Put­Char (xt, k - 1) c

Con­sider a func­tion that inserts a whole String instead of just a Char.

bad­Put­Str :: Bad­Input -> String -> Bad­Input
bad­Put­Str input s = foldl bad­Put­Char input s

Maybe you can see why this is not a great rep­re­sen­ta­tion for the text input. Sup­pose we’re edit­ing the end of our input. Putting a sin­gle char­ac­ter will tra­verse the entire input to add a char­ac­ter. For bad­Put­Str we tra­verse the entire input n times where n is the length of the string. That’s ter­ri­ble!

Can you think of a bet­ter rep­re­sen­ta­tion?

The List Zipper

type Input = (String, String)

Here the first String will be text before the cur­sor reversed. Slightly weird, but we’ll see why in a minute. The sec­ond String is the text after the cur­sor. To make a con­nec­tion with our pre­vi­ous struc­ture, we can define a func­tion that can take an Input to a Bad­Input.

good­To­Bad :: Input -> Bad­Input
good­To­Bad (pre, post) = ((reverse pre) ++ post, length pre)

Now we can write put­Char and put­Str.

put­Char :: Input -> Char -> Input
put­Char (pre, post) c = (c : pre, post)

put­Str :: Input -> String -> Input
put­Str input s = foldl put­Char input s

Notice how much sim­pler put­Char is! The per­for­mance of our pro­gram is much bet­ter now too. A call to put­Char is con­stant-time. For edit­ing oper­a­tions, this rep­re­sen­ta­tion is far supe­rior.

While we used the exam­ple of a text input to moti­vate this rep­re­sen­ta­tion, this tech­nique is totally gen­eral. This would speed up any sit­u­a­tion where we have a list of ele­ments that we repeat­edly make local edits.

type List­Zip­per a = ([a], [a])

We call the first com­po­nent of the tuple the con­text and the sec­ond com­po­nent the con­trol. Think of the con­text as the orig­i­nal data struc­ture with a hole in it. It’s made of a list of frames where each frame is a Cons with a hole in it. The con­trol is the thing needed to plug the whole to get the orig­i­nal struc­ture. It turns out that this zip­per trans­for­ma­tion can work over more types than just lists. You can have a zip­per for a tree that can make con­stant-time local edits.

Zipping Binary Trees

Here is your run-of-the-mill binary tree.

data Tree a = Empty | Node a (Tree a) (Tree a) deriv­ing (Show)

To get a zip­per we need a rep­re­sen­ta­tion for the con­text.

data Holy­Node a = Left­Hole a (Tree a) | Right­Hole a (Tree a) deriv­ing (Show)
type Con­text a = [Holy­Node a]
type Tree­Zip­per a = (Con­text a, Tree a)

We can now tra­verse this struc­ture and make local edits.

go­Left :: Tree­Zip­per a -> Tree­Zip­per a
go­Left (ctx, Node x l r) = (Left­Hole x r : ctx, l)

go­Right :: Tree­Zip­per a -> Tree­Zip­per a
go­Right (ctx, Node x l r) = (Right­Hole x l : ctx, r)

go­Up :: Tree­Zip­per a -> Tree­Zip­per a
go­Up (Left­Hole x r : ctx, t) = (ctx, Node x t r)
go­Up (Right­Hole x l : ctx, t) = (ctx, Node x l t)

Data Type Algebra

And now for some­thing com­pletely dif­fer­ent.

In our new nota­tion, this func­tion f has what return type?

f True = Left ()
f False = Right ()

This should be 1 + 1. We’ll also write this as 2. In gen­eral 1 + ... + 1 will be writ­ten as the num­ber it adds up to. Same for prod­ucts. One more exer­cise, what’s the return type of g in our new nota­tion?

g x y = (f x, f y)

This should be 2 * 2 = 4. You should ver­ify that if we have a type n, then there are n val­ues of that type. We can see this pretty explic­itly with the above exam­ple.

g True True = (Left (), Left ())
g True False = (Left (), Right ())
g False True = (Right (), Left ())
g False False = (Right (), Right ())

This seems like a cute nota­tion, but is it more than just cute? Is there some­thing deeper going on here?

The List Type

Recall the def­i­n­i­tion of a list type.

data List a = Nil | Cons a (List a)

I’m going to write this in a slightly more prim­i­tive way.

data List a = Either () (a, List a)

This is the same type as before except we’re not using Haskell’s nice syn­tax for defin­ing types. Instead every­thing is just in terms of the more prim­i­tive notions. Here we rep­re­sent the list [1, 2, 3] as Right (1, Right (2, Right (3, Left ()))).

Remem­ber that list isn’t a type, but a whole fam­ily of types. Given an ele­ment type, we can give back a par­tic­u­lar list type spe­cial­ized to that list. We can model this as a func­tion over types. In our new nota­tion a list of a has type L(a) where L(t) = 1 + t * L(t).

Doing Algebra

For­get we’re talk­ing about types. Let’s do some math.

Do these look famil­iar? They’re the types of our con­text frames! This works in gen­eral. Let’s try some­thing else.

L(t) = 1 + t * L(t). Solv­ing for L we get L(t) = 1 / (1 - t). Does any­one remem­ber the Tay­lor expan­sion of this? Yes, L(t) = 1 + t + t^2 + t^3 + ....

So we’ve derived an alter­na­tive def­i­n­i­tion of L(t). Turn on your type inter­pre­ta­tion again. This is either 1 which is an empty list, t which is a sin­gle­ton list, t^2 a list with two ele­ments, ... Looks like a list to me!

Try the same thing for binary trees. What do you get? The Cata­lan num­bers for a good rea­son. Cn is the num­ber of full binary trees with n + 1 leaves.

Additional Reading