# Zippers

Let’s imagine we’re implementng a text input. What kind of a data definition should we use? Here’s one option.

type BadInput = (String, Int)

Here we interpret `String`

as the content of the input. The `Int`

is the cursor’s position in the input. Suppose now we wish to write an insert function that takes a text input and inserts a character before the cursor.

badPutChar :: BadInput -> Char -> BadInput badPutChar (xs, 0) c = (c : xs, 1) badPutChar (x : xt, k) c = (x : xt', k' + 1) where (xt', k') = badPutChar (xt, k - 1) c

Consider a function that inserts a whole `String`

instead of just a `Char`

.

badPutStr :: BadInput -> String -> BadInput badPutStr input s = foldl badPutChar input s

Maybe you can see why this is not a great representation for the text input. Suppose we’re editing the end of our input. Putting a single character will traverse the entire input to add a character. For `badPutStr`

we traverse the entire input `n`

times where `n`

is the length of the string. That’s terrible!

Can you think of a better representation?

## The List Zipper

type Input = (String, String)

Here the first `String`

will be text before the cursor *reversed*. Slightly weird, but we’ll see why in a minute. The second `String`

is the text after the cursor. To make a connection with our previous structure, we can define a function that can take an `Input`

to a `BadInput`

.

goodToBad :: Input -> BadInput goodToBad (pre, post) = ((reverse pre) ++ post, length pre)

Now we can write `putChar`

and `putStr`

.

putChar :: Input -> Char -> Input putChar (pre, post) c = (c : pre, post) putStr :: Input -> String -> Input putStr input s = foldl putChar input s

Notice how much simpler `putChar`

is! The performance of our program is much better now too. A call to `putChar`

is constant-time. For editing operations, this representation is far superior.

While we used the example of a text input to motivate this representation, this technique is totally general. This would speed up any situation where we have a list of elements that we repeatedly make local edits.

type ListZipper a = ([a], [a])

We call the first component of the tuple the *context* and the second component the *control*. Think of the context as the original data structure with a hole in it. It’s made of a list of *frames* where each frame is a `Cons`

with a hole in it. The control is the thing needed to plug the whole to get the original structure. It turns out that this zipper transformation can work over more types than just lists. You can have a zipper for a tree that can make constant-time local edits.

## Zipping Binary Trees

Here is your run-of-the-mill binary tree.

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)

To get a zipper we need a representation for the context.

data HolyNode a = LeftHole a (Tree a) | RightHole a (Tree a) deriving (Show) type Context a = [HolyNode a] type TreeZipper a = (Context a, Tree a)

We can now traverse this structure and make local edits.

goLeft :: TreeZipper a -> TreeZipper a goLeft (ctx, Node x l r) = (LeftHole x r : ctx, l) goRight :: TreeZipper a -> TreeZipper a goRight (ctx, Node x l r) = (RightHole x l : ctx, r) goUp :: TreeZipper a -> TreeZipper a goUp (LeftHole x r : ctx, t) = (ctx, Node x t r) goUp (RightHole x l : ctx, t) = (ctx, Node x l t)

## Data Type Algebra

And now for something completely different.

The unit value

`()`

has type`()`

. I will write unit types as`1`

instead.Booleans have type

`Bool`

. I will write them as`2`

instead.The pair value

`((), ())`

has type`((), ())`

. I will write pairs types with`*`

instead. So, that value now has type`1 * 1`

.The

`Either`

value`Left ()`

has type`Either () b`

. I will denote`Either`

types with`+`

. So, that value now has type`1 + b`

.

In our new notation, this function `f`

has what return type?

f True = Left () f False = Right ()

This should be `1 + 1`

. We’ll also write this as `2`

. In general `1 + ... + 1`

will be written as the number it adds up to. Same for products. One more exercise, what’s the return type of `g`

in our new notation?

g x y = (f x, f y)

This should be `2 * 2 = 4`

. You should verify that if we have a type `n`

, then there are `n`

values of that type. We can see this pretty explicitly with the above example.

g True True = (Left (), Left ()) g True False = (Left (), Right ()) g False True = (Right (), Left ()) g False False = (Right (), Right ())

This seems like a cute notation, but is it more than just cute? Is there something deeper going on here?

## The List Type

Recall the definition of a list type.

data List a = Nil | Cons a (List a)

I’m going to write this in a slightly more primitive way.

data List a = Either () (a, List a)

This is the same type as before except we’re not using Haskell’s nice syntax for defining types. Instead everything is just in terms of the more primitive notions. Here we represent the list `[1, 2, 3]`

as `Right (1, Right (2, Right (3, Left ())))`

.

Remember that list isn’t a type, but a whole family of types. Given an element type, we can give back a particular list type specialized to that list. We can model this as a function over types. In our new notation a list of `a`

has type `L(a)`

where `L(t) = 1 + t * L(t)`

.

## Doing Algebra

Forget we’re talking about types. Let’s do some math.

What is the derivative of

`1 + ax`

with respect to`x`

? Should be`a`

.How about

`1 + ax^2`

with respect to`x`

? That’s`2ax`

.

Do these look familiar? They’re the types of our context frames! This works in general. Let’s try something else.

`L(t) = 1 + t * L(t)`

. Solving for `L`

we get `L(t) = 1 / (1 - t)`

. Does anyone remember the Taylor expansion of this? Yes, `L(t) = 1 + t + t^2 + t^3 + ...`

.

So we’ve derived an alternative definition of `L(t)`

. Turn on your type interpretation again. This is either `1`

which is an empty list, `t`

which is a singleton list, `t^2`

a list with two elements, ... Looks like a list to me!

Try the same thing for binary trees. What do you get? The Catalan numbers for a good reason. `Cn`

is the number of full binary trees with `n + 1`

leaves.

## Additional Reading

For the practical parts about using Haskell for zippers, see LYaH.

Gerard Huet wrote the first zipper paper.

Conor McBride discovered that the type of a zipper is the derivative of its data type.

Erik Hinton presented that paper at the Papers We Love meetup.

There is a nice series on zippers by Pavel Panchekha with even more interesting connections.

Finally, more of the mathematics can be found in a lecture note presented by Mathias Pedersen.