Typeclasses

Let’s cal­cu­late 1 + 1 with inte­gers and with floats. First, in Java.

Sys­tem.out.println(1 + 1);     // 2
Sys­tem.out.println(1.0 + 1.0); // 2.0

Now, in OCaml.

print_int (1 + 1)        (* 2 *)
print_float (1.0 +. 1.0) (* 2. *)

Finally, in Haskell.

print (1 + 1)     -- 2
print (1.0 + 1.0) -- 2.0

Before get­ting into the specifics, what are the sim­i­lar­i­ties and dif­fer­ences between how we do this in Java and OCaml? Is Haskell more like Java or OCaml?

Ad-hoc Polymorphism

Notice that in Java the + oper­a­tor is defined for both inte­gers and floats. This is oper­a­tor over­load­ing or ad-hoc poly­mor­phism, where + adopts dif­fer­ent def­i­n­i­tions depend­ing on the types of the argu­ments it’s applied to.

In con­trast, OCaml does not sup­port this ad-hoc poly­mor­phism. The + oper­a­tor has one type and one type only, int -> int -> int. You can­not use it with floats, you have to use a dif­fer­ent func­tion, the +. oper­a­tor.

Haskell is a dif­fer­ent beast. The sit­u­a­tion looks more sim­i­lar to Java, where + can be used for both inte­gers and floats. Indeed, Haskell does sup­port ad-hoc poly­mor­phism, but it has a dif­fer­ent mech­a­nism than the usual oper­a­tor over­load­ing present in Java. This is evi­dent from its type sig­na­ture.You can get the type of + by ask­ing GHCi with :t (+).

(+) :: Num a => a -> a -> a

Here, Num is a type­class, a con­cept akin to Java’s inter­faces. You can declare any type an instance of the Num type­class, so long as you imple­ment cer­tain func­tions, includ­ing a def­i­n­i­tion for +.

Creating an Instance

Let’s take another look at the exam­ple at the start. In OCaml we had to use dif­fer­ent print func­tions depend­ing on the argu­ment. This is no sur­prise. The way you print an inte­ger is dif­fer­ent than how you would a float or string, so this requires dif­fer­ent func­tions with dif­fer­ent names in a lan­guage with­out ad-hoc poly­mor­phism.You could define a generic print func­tion with type ‘a -> unit, but per­haps not every value has a sen­si­ble print­able rep­re­sen­ta­tion. What would you do in those cases?

How­ever in Haskell we could use print for all of them. Why? The type sig­na­ture tells all.

print :: Show a => a -> IO ()

Any type that is an instance of the Show type­class can be printed. Any type that is an instance of Show has to imple­ment a show func­tion that out­puts a string rep­re­sen­ta­tion of that value. Here is a good demon­stra­tion of this.

data Word = Hello | World

instance Show Word where
  show Hello = "Hola"
  show World = "Mundo"

We’re declar­ing a new type Word and telling Haskell that the string rep­re­sen­ta­tion of each value is the Span­ish trans­la­tion. So, if you print Hello you’ll get out Hola.

So what?

This may seem like a con­ve­nience of lit­tle con­se­quence, but it’s a big deal. Con­sider the fol­low­ing func­tion in OCaml.

let rec more_than y xs =
  match xs with
  | [] -> []
  | x :: xt ->
      if x > y then x :: (more_than y xt)
      else more_than y xt

more_than 10 [5; 18; 80; -1; 7; 13]
(* = [18; 80; 13] *)

Sup­pose I want to apply this func­tion to a bunch of foods and get back only the one’s I like bet­ter than some par­tic­u­lar food.

type food = Apple | Banana | Can­taloupe | Dumpling |
            Egg­plant | Fudge | Grape­fruit

more_than Dumpling [Grape­fruit; Fudge; Can­taloupe; Egg­plant]
(* = [Egg­plant; Fudge; Grape­fruit] *)

Sorry, OCaml will enforce its own derived notion of order. It will assume you want sort based on the order given in the data def­i­n­i­tion. Not what I want.

Let’s try some­thing sim­i­lar in Haskell. First, the equiv­a­lent piece of code.I’ve writ­ten this pro­gram to mir­ror the OCaml ver­sion, but you could write it much more com­pactly using list com­pre­hen­sions.

more­Than :: (Ord a) => a -> [a] -> [a]
more­Than y [] = []
more­Than y (x : xt) =
  if x > y then x : (more­Than y xt)
  else more­Than y xt

Notice that more­Than doesn’t care about the types of its input except for the fact that it defines some notion of order by being an instance of Ord. This allows a level of abstrac­tion and flex­i­bil­ity unavail­able to the OCaml pro­gram­mer.An OCaml’er would solve this dilemma by forc­ing the user to pro­vide more­Than with a com­para­tor func­tion. Imag­ine a more com­pli­cated func­tion where you may have dif­fer­ent types with pos­si­bly many dif­fer­ent con­straints. This solu­tion doesn’t scale. Now with Haskell I can define the notion of order on food val­ues. I declare foods an instance of Ord with my pref­er­ences instead of the lex­i­co­graphic order.

import Data.List (elem­Index)

data Food = Apple | Banana | Can­taloupe | Dumpling |
            Egg­plant | Fudge | Grape­fruit
            deriv­ing (Eq, Show)

instance Ord Food where
  com­pare x y = com­pare kx ky
    where prefs     = [Grape­fruit, Can­taloupe, Apple,
                       Banana, Dumpling, Fudge, Egg­plant]
          (Just kx) = elem­Index x prefs
          (Just ky) = elem­Index y prefs

Haskell now knows the order I like my food, so I can find only food I like bet­ter than apples.

more­Than Apple [Grape­fruit, Fudge, Can­taloupe, Egg­plant]
-- = [Fudge, Egg­plant]

Haskell knows how to print our food items even though we didn’t give it an imple­men­ta­tion of show. What gives? Well, I snuck in deriv­ing (Eq, Show) that can auto­mat­i­cally gen­er­ate rea­son­able instances for some type­classes, includ­ing Show.Any instance of Ord must also be an instance of Eq, so that’s why that’s there. You can also derive Ord and it will use the same order­ing as OCaml.

The Expression Problem

Sup­pose we have a lit­tle lan­guageThis par­tic­u­lar lan­guage is some­times referred to as Hut­ton’s razor. for inte­ger addi­tion.

data Expr = Val Int
          | Add Expr Expr

We can eas­ily define some oper­a­tions over this data.

show :: Expr -> String
show (Val x) = show x
show (Add l r) = (show l) ++ " + " ++ (show r)

eval :: Expr -> Int
eval (Val x) = x
eval (Add l r) = (eval l) + (eval r)

Here there are sev­eral data vari­ants includ­ing Val and Add. There could also be many dif­fer­ent oper­a­tions, here we have show and eval. The expres­sion prob­lem is to find a way to make both the data and oper­a­tions exten­si­ble. When we say exten­si­ble, we want three main cri­te­ria.

  1. Mod­u­lar. The exten­sion should live out­side of what it’s extend­ing in a sep­a­rate mod­ule. We should not have to mod­ify the exist­ing code to cre­ate an exten­sion.

  2. Sep­a­rate com­pi­la­tion. If a user has a com­piled ver­sion of our orig­i­nal mod­ule, we should be able to send them a sep­a­rately com­piled exten­sion mod­ule and have every­thing work cor­rectly.

  3. Type safety. The exten­sion should not sub­vert the sta­tic type sys­tem.

In a func­tional lan­guage, we can eas­ily get oper­a­tion exten­si­bil­ity. We sim­ply intro­duce a new mod­ule that imports the data def­i­n­i­tion from above and imple­ments a new func­tion over the data.

swap :: Expr -> Expr
swap (Val x) = Val x
swap (Add l r) = Add r l

How­ever, if we wanted to add a new data vari­ant we’re stuck. Now we’d have to mod­ify all the exist­ing mod­ules with oper­a­tions in them to sup­port this new vari­ant. No good.

What about OOP? We can define the lan­guage once more, but this time in Java.

pub­lic abstract class Expr {
  pub­lic abstract String to­String();
  pub­lic abstract Inte­ger eval();
}

pub­lic class Val extends Expr {
  pub­lic Inte­ger data;

  pub­lic String to­String() {
    return data.to­String();
  }

  pub­lic Inte­ger eval() {
    return data;
  }
}

pub­lic class Add extends Expr {
  pub­lic Expr left, right;

  pub­lic String to­String() {
    return left.to­String() + " + " + right.to­String();
  }

  pub­lic Inte­ger eval() {
    return left.eval() + right.eval();
  }
}

In an OO lan­guage, we can eas­ily get data exten­si­bil­ity. We sim­ply intro­duce a new mod­ule that imple­ments a new class that extends Expr and imple­ments all the required meth­ods.

pub­lic class Neg extends Expr {
  pub­lic Expr data;

  pub­lic String to­String() {
    return "-(" + data.to­String() + ")";
  }

  pub­lic Inte­ger eval() {
    return -data.eval();
  }
}

How­ever, we can­not get oper­a­tion exten­si­bil­ity. This would neces­si­tate mod­i­fy­ing the abstract class and in turn all the classes extend­ing it. No good.

Extensibility with Typeclasses

We will use type­classes to get exten­si­bil­ity. Con­sider this mod­i­fi­ca­tion to the data def­i­n­i­tion.

{-# LAN­GUAGE Datatype­Con­texts #-}
data Val = Val Int
data Add l r = Add l r

class Expr x
instance Expr Val
instance (Expr l, Expr r) => Expr (Add l r)

There’s some strange stuff going on here. Instead of hav­ing a sin­gle uni­fied data type with sev­eral vari­ants, we have mul­ti­ple data types, one for each vari­ant, and a sin­gle uni­fied type­class. As expected, each of the types is an instance of the type­class. Notice the sec­ond instance is say­ing that Add l r is an instance of Expr so long as l and r are too.

Now we’re going to see oper­a­tion exten­si­bil­ity.

class Expr x => Eval x where
  eval :: x -> Int

instance Eval Val where
  eval (Val x) = x

instance (Eval l, Eval r) => Eval (Add l r) where
  eval (Add l r) = (eval l) + (eval r)

Here too our func­tion is going to be encoded as a type­class. For a type to be an instance of Eval it must already be an Expr. This makes sense. This type­class will have a con­crete func­tion that must be imple­mented. Every data type should should then become an instance.

Now, what does a data exten­sion look like?

data Expr x => Neg x = Neg x
instance Expr x => Expr (Neg x)

instance Eval x => Eval (Neg x) where
  eval (Neg x) = -(eval x)

The first two lines define a new data type like we did before, but for Neg. As with our pre­vi­ous con­struc­tors, we make it an instance of Expr. Now in order to imple­ment any of the func­tions over our data, we need only make it an instance of the cor­re­spond­ing type­class. Here we’ve made it an instance of Eval and hence we can call the eval func­tion on any Neg type.

While slightly exotic, this cer­tainly does solve the expres­sion prob­lem.

Additional Reading