Moving from Multiparameter Type Classes and Functional Dependencies to Type Families in Haskell

In my first post on the bitx-bitcoin library I mentioned how I solved one particular problem with the help of MultiParamTypeClasses and FunctionalDependencies. Well, as it just so turns out, apparently that wasn’t necessarily the best way to do it.

Recap: What was I trying to solve

Imagine the following polymorphic function:

foo :: (A a, B b) => a -> b
foo = ...

That is, a is an instance of the typeclass A, and b is an instance of B. Notice, in particular, that the two types are essentially completely free to be anything. The Haskell compiler’s type inference algorithm will essentially need to be told the types of each side in order to type-infer this function.

Now imagine also, that in an effort to make the compiler do much of the work for us, and given a whole bunch of instances of A and B, we wish to leverage Haskell’s type-inference. That is, we have something like this

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

foo :: (A a, B b) => a -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: B b => C -> b
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

We also know that foo should map A1 to B1 and A2 to B2.

Now, we would like it if GHC would figure out the different types of foo in f and g for us, but reasonably we can’t really expect it to do this job. That is, we want to be able to use helper everywhere and just have GHC figure out the rest for us.

Could not deduce (A a0) arising from a use of ‘foo’
from the context (B b)
  bound by the type signature for helper :: B b => C -> b
  at Ex.hs:22:11-23
The type variable ‘a0’ is ambiguous
Note: there are several potential instances:
  instance A A1 -- Defined at Ex.hs:13:10
  instance A A2 -- Defined at Ex.hs:14:10
In the first argument of ‘(.)’, namely ‘foo’
In the expression: foo . bar
In an equation for ‘helper’: helper = foo . bar

We shouldn’t be surprised though. We are making an assumption which we are not explicitly telling GHC. We need to somehow tell GHC that not only does foo map As to Bs, it in fact maps A1 to B1 and A2 to B2.

Let’s remember that this exmaple is not as contrived as it looks. In my case, C is Bytestring, bar is Data.Aeson.decode, and A is Data.Aeson.fromJson.

The MPTC and FD solution

Basically, we ultimately want a “function” on types which tells GHC this relationship. To establish such a relationship we use MultiParamTypeClasses, and to establish it as a one-to-one function we use FunctionalDependencies:

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

class (A a, B b) => Relationship a b | b -> a
instance Relationship A1 B1
instance Relationship A2 B2

foo :: Relationship a b => a -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: (Relationship a b, B b) => C -> b 
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

Now the code compiles without a hitch. We have established the mapping from A1 to B1 and A2 to B2, and annotated a few functions with this relationship (why do we need to do to explicitly annotate helper? I’m not quite sure, actually).

So far this shouldn’t be news.

Well, apparently although MultiParamTypeClasses are considered pretty benign (and many Haskell beginners, myself included, are surprised when they first find out that such simple and obvious functionality requires an extension), FunctionalDependencies are apparently not viewed so lightly.

The alternative to this and many other tricks is the TypeFamilies extension. TypeFamilies enable quite a few tricks in Haskell, almost none of which I understand yet, but luckily using them to replace FunctionalDependencies is an almost mechanical procedure.

Type Families

Among one of the many things which TypeFamilies allow us to encode are what are called “associated types.” That is, in our example, we wish to “associate” A1 with B1 and A2 with B2. Type families in general allow us to encode a whole lot more concepts besides, and I have yet to grok them all, but in this instance they are relatively easy to use.

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

class (A (T b), B b) => Relationship b where
  type T b :: *

instance Relationship B1 where
  type T B1 = A1

instance Relationship B2 where
  type T B2 = A2

foo :: (B b) => (T b) -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: Relationship b => C -> b
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

Note that we need FlexibleContexts because the context for our (admittedly badly named) typeclass Relationship consists of things more complicated than just C v where v is a simple type variable (here we have the context A (T b)).

Note that each instance of Relationship has a(n associated) type synonym T b (the kind annotation :: * is not strictly necessary).

In general this:

class (K b) => J a b | a -> b where
  foo :: b -> a

instance J Ai Bi where
  foo = bar

can be written as

class K (T a) => J a where
  type T a
  foo :: T a -> a

instance J Ai where
  type T Ai = Bi
  foo = bar

We have replaced every occurrence of the dependent part b with T a, and have provided a type synonym as part of the declaration of the typeclass (a syntax feature which wouldn’t be allowed without this extension).

Conclusion

Essentially, the reason why GHC has so many type extensions is because the Hindley–Milner type system used in standard Haskell, while being easy to type infer, is not powerful enough to express all valid programs in System F. Ideally, each new extension allows Haskell to accept a larger set of reasonable/valid programs while rejecting invalid programs. The downside theoretically is that sometimes type-inference breaks (and so the programmer has to explicitly provide some type annotations) and sometimes some “dangerous” extensions allow one to write invalid programs, or can potentially cause GHC’s type inference to fail to terminate.

While this is all fascinating stuff, all of these extensions, their semantics, and their uses can get overwhelming. The consensus seems to be that TypeFamilies is a safe, well-thought-out, and benign extension, and that it is apparently also quite “easy” (I still have to get through that wall of text on the Haskell Wiki page). Certainly, although I do not yet understand the full scope of TypeFamilies,  I can understand the need to replace one apparently cobbled-together extension like FunctionalDependencies with a more powerful and stable abstraction.

As a final thought, wouldn’t it be swell if the Haskell ecosystem provided us with tools which could easily perform the refactoring above without hassle? As is, we can barely perform operations such as renaming symbols across modules. For a language whose semantics and type-safety promise (and deliver on) joyous manual refactoring, it seems rather strange that there are essentially zero usable/user-friendly/maintained tools for performing even the simplest of refactors.

Thanks a lot to the fine folks over at #haskell for teaching me the simple transformation above.

Advertisements
This entry was posted in Haskell, Programming and tagged , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s