Functors

This lec­ture is adjusted from Niki Vazou who adjusted from Ran­jit Jhala who adjusted from Gra­ham Hut­ton.

Mapping Over Lists

As a brief review, let’s recall map­ping over lists. Let’s sup­pose we want to take a list of inte­gers and add one to each inte­ger in the list. We might define that as fol­lows

inc :: [Int] -> [Int] -- Note: [Int] is spe­cial syn­tax for [] Int
inc []     = []
inc (i:is) = i+1 : inc is

When we load this func­tion into our trusty Haskell REPL we get

Pre­lude> inc [1,2,3,4]
[2,3,4,5]

Let’s now sup­pose we want to take a list of inte­gers and square each inte­ger in the list. We might write the fol­low­ing:

square :: [Int] -> [Int]
square []     = []
square (i:is) = i^2 : square is

In our REPL we get

Pre­lude> square [1,2,3,4]
[1,4,9,16]

Being good pro­gram­mers, we real­ize there is a com­mon pat­tern in these two tasks. In par­tic­u­lar, we see that there is a func­tion f (either inc or square that we want to apply to each ele­ment in the list. There­fore, we write the fol­low­ing func­tion

imap :: (Int -> Int) -> [Int] -> [Int]
imap f []     = []
imap f (i:is) = (f i) : imap f is

Now we can rewrite our two func­tions above

inc' :: [Int] -> [Int]
inc' = imap ((+) 1)

square' :: [Int] -> [Int]
square' = imap (\x -> x^2)
  -- Note: we could also do imap (^2) b/c of Haskell's
  -- han­dling of infix func­tions

and

Pre­lude> inc' [1,2,3,4]
[2,3,4,5]
Pre­lude> square' [1,2,3,4]
[1,4,9,16]

Why stop at lists of inte­gers? Let’s gen­er­al­ize map so that we can map over lists of any type.

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (i:is) = (f i) : map f is

Mapping Over Trees

Let’s now con­sider binary trees, defined as

data Tree a = Leaf | Bin a (Tree a) (Tree a)
   deriv­ing (Eq, Show)

Sim­i­lar to lists, we want to take a tree and have a func­tion that incre­ments each ele­ment and another that squares each ele­ment in the tree.

tinc :: Tree Int -> Tree Int
tinc Leaf         = Leaf
tinc (Bin i l r)  = Bin (i+1) (tinc l) (tinc r)

tsquare :: Tree Int -> Tree Int
tsquare Leaf         = Leaf
tsquare (Bin i l r)  = Bin (i^2) (tsquare l) (tsquare r)

Again, we notice a pat­tern. For both func­tions, we apply some f to the value con­tained at a node and recur­sively apply it to both sub­trees; if we reach a leaf we sim­ply return that leaf. Let’s define map for trees. (Notice this time I skipped right to the gen­er­al­ized ver­sion.)

tmap :: (a -> b) -> Tree a -> Tree b
tmap f Leaf        = Leaf
tmap f (Bin i l r) = Bin (f i) (tmap f l) (tmap f r)

The Functor Typeclass

Let’s com­pare the types tmap and map.

tmap :: (a -> b) -> Tree a -> Tree b
map  :: (a -> b) -> List a -> List b

These look very sim­i­lar. Can we abstract them into one idea? This looks like a job for type­classes!

class Func­tor m where
   fmap :: (a -> b) -> m a -> m b

In this case, we define the type­class Func­tor for which there is one func­tion fmap. We say a type m is an instance of the Func­tor type­class if we define a func­tion fmap for it which given a func­tion f from type a to type b, and a type m of a, we get back a type m of b. For instance, if m is the List type we have

instance Func­tor List where
   fmap f []     = []
   fmap f (i:is) = (f i) : fmap f is

and if m is Tree then we have

instance Func­tor Tree where
   fmap f Leaf     = Leaf
   fmap f (Bin i l r) = Bin (f i) (fmap f l) (fmap f r)

And there a bunch more instances of Func­tor in Haskell!

instance Func­tor Maybe where
   fmap f Noth­ing  = Noth­ing
   fmap f (Just x) = Just (f x)

instance Func­tor IO where
   fmap f io = do
      a <- io
      return (f a)

The Functor Laws

Recall that Haskell type­classes can also have “laws”, which is to say prop­er­ties of the type­class func­tions which should be pre­served for valid instances. For exam­ple, for a type to be an instance of the “Num” type­class in Haskell, + over that type should be asso­cia­tive.

For the Func­tor type­class, there are two laws which hold for valid instances of the type class. The first law states that map­ping the iden­tity func­tion over the func­tor should sim­ply return the orig­i­nal value. In Haskell terms

fmap id = id

We can see eas­ily see this is true for List and Tree.

Pre­lude> fmap id [1,2,3,4]
[1,2,3,4]
Pre­lude> fmap id (Bin 2 (Leaf) (Leaf))
Bin 2 (Leaf) (Leaf)

The sec­ond law is a bit harder. It states that map­ping the com­po­si­tion of two func­tions f and g over a func­tor should be the same as first map­ping g over the func­tor and then map­ping f over it. In Haskell terms

fmap (f . g) = (fmap f) . (fmap g)

So for list this might look like

Pre­lude> (fmap (\x->x^2) . fmap (\x->x+1)) [1,2,3,4]
[4,9,16,25]
Pre­lude> fmap ((\x->x^2) . (\x->x+1)) [1,2,3,4]
[4,9,16,25]

Why Functors?

So, why func­tors? We’ve seen that may com­mon types, like [], Maybe, Trees, and IO can be made valid instances of the Func­tor type­class. This in itself exposes the first real value of type­classes. Sim­i­lar to inter­faces in Java, this allows us to write func­tions over func­tors that will work for any of the com­mon types listed above. This type of abstrac­tion can make code shorter and more under­stand­able.

How­ever, there is a sec­ond, more sub­tle advan­tage of Func­tors. Let’s explain this one by an exam­ple with our Tree type­class from ear­lier. Let’s cre­ate a huge tree.

cre­ate­Tree :: Int -> Tree Int
cre­ate­Tree 0 = Leaf
cre­ate­Tree n = Bin n l r
  where l = cre­ate­Tree (n-1)
        r = cre­ate­Tree (n-1)

tree :: Tree Int
tree = cre­ate­Tree 30

tree has over 1 bil­lion nodes in it!

Sup­pose we now want to add 1 to each ele­ment in the tree, mul­ti­ply each ele­ment in the tree by 2, and then sum the result. We can do that using our tmap from ear­lier.

mapped­Tree1 :: Tree Int
mapped­Tree1 = tmap (+1) (tmap (*2) tree)

sum­Tree :: Tree Int -> Int
sum­Tree Leaf        = 0
sum­Tree (Bin i l r) = i + (sum­Tree l) + (sum­Tree r)

main = do
  print $ sum­Tree mapped­Tree1

This works, but can we do bet­ter? Well, do we have to call tmap twice to cre­ate mapped­Tree1? No, let’s write a new imple­men­ta­tion, mapped­Tree2, using func­tion com­po­si­tion.

mapped­Tree2 :: Tree Int
mapped­Tree2 = tmap ((+1) . (*2)) tree

This is bet­ter! Now we only iter­ate over the nodes once and only allo­cate one new tree instead of two!

This is a great opti­miza­tion, but it did require some inge­nu­ity on our part. What if this could be done auto­mat­i­cally? Look­ing more closely at mapped­Tree1 and mapped­Tree2, we see that this trans­for­ma­tion is really just the com­po­si­tion rule for func­tors. So indeed, for func­tors, our com­piler can auto­mat­i­cally per­form this opti­miza­tion, regard­less of which spe­cific func­tor instance (trees, lists, options, etc.).

This exam­ple exposes the sec­ond advan­tage of func­tors. Func­tors allow us to auto­mat­i­cally per­form rewrit­ing due to the com­po­si­tion rule to reduce our cals to fmap.

Exercises for Home