So, I have finally decided to start a blog.

This will serve primarily to publish my thoughts on topics which interest me, including Mathematics, Linguistics, and Computer programming.

I will post primarily in English, though I will occasionally do some posts in Sesotho.

I hope that some may find this useful.

Tebello M. Thejane.

Posted on by motset | Leave a comment

Moving from Multiparameter Type Classes and Functional Dependencies to Type Families in Haskell

In my first post on the bitx-bitcoin library I mentioned how I solved one particular problem with the help of MultiParamTypeClasses and FunctionalDependencies. Well, as it just so turns out, apparently that wasn’t necessarily the best way to do it.

Recap: What was I trying to solve

Imagine the following polymorphic function:

foo :: (A a, B b) => a -> b
foo = ...

That is, a is an instance of the typeclass A, and b is an instance of B. Notice, in particular, that the two types are essentially completely free to be anything. The Haskell compiler’s type inference algorithm will essentially need to be told the types of each side in order to type-infer this function.

Now imagine also, that in an effort to make the compiler do much of the work for us, and given a whole bunch of instances of A and B, we wish to leverage Haskell’s type-inference. That is, we have something like this

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

foo :: (A a, B b) => a -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: B b => C -> b
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

We also know that foo should map A1 to B1 and A2 to B2.

Now, we would like it if GHC would figure out the different types of foo in f and g for us, but reasonably we can’t really expect it to do this job. That is, we want to be able to use helper everywhere and just have GHC figure out the rest for us.

Could not deduce (A a0) arising from a use of ‘foo’
from the context (B b)
  bound by the type signature for helper :: B b => C -> b
  at Ex.hs:22:11-23
The type variable ‘a0’ is ambiguous
Note: there are several potential instances:
  instance A A1 -- Defined at Ex.hs:13:10
  instance A A2 -- Defined at Ex.hs:14:10
In the first argument of ‘(.)’, namely ‘foo’
In the expression: foo . bar
In an equation for ‘helper’: helper = foo . bar

We shouldn’t be surprised though. We are making an assumption which we are not explicitly telling GHC. We need to somehow tell GHC that not only does foo map As to Bs, it in fact maps A1 to B1 and A2 to B2.

Let’s remember that this exmaple is not as contrived as it looks. In my case, C is Bytestring, bar is Data.Aeson.decode, and A is Data.Aeson.fromJson.

The MPTC and FD solution

Basically, we ultimately want a “function” on types which tells GHC this relationship. To establish such a relationship we use MultiParamTypeClasses, and to establish it as a one-to-one function we use FunctionalDependencies:

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

class (A a, B b) => Relationship a b | b -> a
instance Relationship A1 B1
instance Relationship A2 B2

foo :: Relationship a b => a -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: (Relationship a b, B b) => C -> b 
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

Now the code compiles without a hitch. We have established the mapping from A1 to B1 and A2 to B2, and annotated a few functions with this relationship (why do we need to do to explicitly annotate helper? I’m not quite sure, actually).

So far this shouldn’t be news.

Well, apparently although MultiParamTypeClasses are considered pretty benign (and many Haskell beginners, myself included, are surprised when they first find out that such simple and obvious functionality requires an extension), FunctionalDependencies are apparently not viewed so lightly.

The alternative to this and many other tricks is the TypeFamilies extension. TypeFamilies enable quite a few tricks in Haskell, almost none of which I understand yet, but luckily using them to replace FunctionalDependencies is an almost mechanical procedure.

Type Families

Among one of the many things which TypeFamilies allow us to encode are what are called “associated types.” That is, in our example, we wish to “associate” A1 with B1 and A2 with B2. Type families in general allow us to encode a whole lot more concepts besides, and I have yet to grok them all, but in this instance they are relatively easy to use.

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}

class B b
class A a

data B1 = B1
data B2 = B2
data C = C
data A1 = A1
data A2 = A2

instance B B1
instance B B2

instance A A1
instance A A2

class (A (T b), B b) => Relationship b where
  type T b :: *

instance Relationship B1 where
  type T B1 = A1

instance Relationship B2 where
  type T B2 = A2

foo :: (B b) => (T b) -> b
foo = undefined

bar :: A a => C -> a
bar = undefined

helper :: Relationship b => C -> b
helper = foo . bar

f :: C -> B1
f = helper

g :: C -> B2
g = helper

main = undefined

Note that we need FlexibleContexts because the context for our (admittedly badly named) typeclass Relationship consists of things more complicated than just C v where v is a simple type variable (here we have the context A (T b)).

Note that each instance of Relationship has a(n associated) type synonym T b (the kind annotation :: * is not strictly necessary).

In general this:

class (K b) => J a b | a -> b where
  foo :: b -> a

instance J Ai Bi where
  foo = bar

can be written as

class K (T a) => J a where
  type T a
  foo :: T a -> a

instance J Ai where
  type T Ai = Bi
  foo = bar

We have replaced every occurrence of the dependent part b with T a, and have provided a type synonym as part of the declaration of the typeclass (a syntax feature which wouldn’t be allowed without this extension).


Essentially, the reason why GHC has so many type extensions is because the Hindley–Milner type system used in standard Haskell, while being easy to type infer, is not powerful enough to express all valid programs in System F. Ideally, each new extension allows Haskell to accept a larger set of reasonable/valid programs while rejecting invalid programs. The downside theoretically is that sometimes type-inference breaks (and so the programmer has to explicitly provide some type annotations) and sometimes some “dangerous” extensions allow one to write invalid programs, or can potentially cause GHC’s type inference to fail to terminate.

While this is all fascinating stuff, all of these extensions, their semantics, and their uses can get overwhelming. The consensus seems to be that TypeFamilies is a safe, well-thought-out, and benign extension, and that it is apparently also quite “easy” (I still have to get through that wall of text on the Haskell Wiki page). Certainly, although I do not yet understand the full scope of TypeFamilies,  I can understand the need to replace one apparently cobbled-together extension like FunctionalDependencies with a more powerful and stable abstraction.

As a final thought, wouldn’t it be swell if the Haskell ecosystem provided us with tools which could easily perform the refactoring above without hassle? As is, we can barely perform operations such as renaming symbols across modules. For a language whose semantics and type-safety promise (and deliver on) joyous manual refactoring, it seems rather strange that there are essentially zero usable/user-friendly/maintained tools for performing even the simplest of refactors.

Thanks a lot to the fine folks over at #haskell for teaching me the simple transformation above.

Posted in Haskell, Programming | Tagged , , , , , , | Leave a comment

First steps with Scala — let’s do some Euler problems!

So, I need to quickly get started in Scala, stat! Yes, I need to, although that doesn’t mean that I wouldn’t want to either, the need has just made me want it a little earlier than I had planned. Never mind the reasons why, but I thought this would make an interesting blog post.

A great way to torture oneself into understanding a language is to attempt a few Project Euler problems. So let’s get on with it!

Starting from scratch

I will not delve into explaining Scala in this blog post. All I will say is that I view it as a way to have ones cake and eat it — you get much of the expressivity of Haskell coupled with the JVM and its impressive library base. You can also cheat and loosen up the straitjackets Haskell needs you to wear, which is great but only after you have learned why and when to artificially force yourself back into them when using more permissive languages.

As it turns out, my laptop is pretty much clean of anything Scala-related (besides the JDK), so I need to start from scratch.

What do I need

I assume that Typesafe Activator is the easy choice, but since at the moment I am not worried about Play and Akka, I’m hoping I can download a little bit less that half a gigabyte by instead installing Scala and SBT manually.

Scala is pretty straight-forward to install on Ubuntu:

sudo apt-get install scala

Unsurprisingly (since I’m relying on Debian’s packaging system rather than downloading the software directly), this installs a slightly older version of Scala (2.9.2) than the current bleeding-edge version (2.11.7).

As with most other languages, in order to preserve my sanity I probably need a package manager, but the package manage does not come with the language itself, so I need to install SBT. SBT is not-so-straightforward, though, since the package does not live on the default repos which come pre-configured on Ubuntu:

echo "deb /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-get update
sudo apt-get install sbt

Next, just for fun, we need to run sbt for the first time so that it can bootstrap itself:


This downloads a tonne of stuff from the primary maven repo ( as well as from Typesafe. So much for trying to minimise how much data I use on my expensive mobile internet…

My first Scala program!

Let’s solve a few Poject Euler problems.

I have a bit of experience with Project Euler, having solved a few dozen problems in Haskell a while back. One feature of Haskell I grew to love was its lazy-by-default semantics coupled with its lists; two features which, when used together, allow one to simply say something crazy like “Create all the prime numbers in the world, create all the pentagonal numbers, add 1 to all the pentagonal numbers, remove these from the list of primes, and then find the first number in the remaining list which is greater than 1000 000” in simple expressive code, and then wait a few seconds (or dozens of minutes) while your Haskell program figures out the answer.

I have written a few Haskell programs for Project Euler where I had to sit and think for a few minutes about how the hell this thing gave me the right answer as soon as I could get it to typecheck and compile.

I have a feeling things will be a lot different in Scala.

Euler 1

Apparently SBT does not have a subcommand for creating a new skeleton project. That’s not necessarily a bad thing though. So we have to create a quick scala source file in its own directory.

object Prob1 {
  def main (args : Array[String]) {
    println("Um, I'm not sure what the answer is...")

It might be prudential to also create an .sbt file, although it is strictly not necessary to run such a simple program.

lazy val root = (project in file(".")).
      name := "Problem 1",
      version := "1.0"

Nope. I have no idea what half of that means, either.

Now, let’s run our little project. Using sbt run seems to display a bunch of crud before and after running our code, so let’s see if we can run the project using Scala without the help of SBT:

$ scalac prob1.scala
error: error while loading CharSequence, class file '/usr/lib/jvm/java-8-oracle/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken
(bad constant pool tag 18 at byte 10) 
one error found

Oops. Remember what I said above about the fact that apt had installed an old version of Scala? Well, as it just so happens, the version of Scala I installed was too old for the version of SBT I installed. Okay, so I’m gonna have to burn some more moola by uninstalling SBT and Scala and installing Typesafe Activator instead.

Incidentally, this reminds me of how Haskell’s Cabal tool would helpfully suggest that you install the latest version every time it ran, and then when you did your computer would explode because the latest version of Cabal was only compatible with a not-yet-released version of GHC. It’s a completely unrelated issue, but it just reminds me of that tale from the trenches…

Euler 1, second attempt

I promise I’ll be writing some code soon!

Okay, after setting up activator lite, we need to create a new skeleton project (yay!):

$ activator new

I chose the minimal Scala option. The we change into the generated directory and run the generated project, which incidentally bootstraps Activator and downloads who-knows-how-many gigabytes of once-off dependencies:

$ activator run
[info] Loading project definition from /home/thejanet/dev/scala/euler/prob1/project
[info] Set current project to prob1 (in build file:/home/thejanet/dev/scala/euler/prob1/)
[info] Running com.example.Hello 
Hello, world!
[success] Total time: 0 s, completed 08 Jul 2015 2:55:44 PM

Let’s take a look at the shell application that was generated.

package com.example                                                    
object Hello {                                               
  def main(args: Array[String]): Unit = {                         
    println("Hello, world!") 

Looks okay, but it would have been nice if Activator had not assumed the package name. I have no idea what “: Unit =” means, in case you’re also wondering the same thing…

Hmm… So we run into another minor annoyance: neither Vim nor Emacs (Spacemacs, to be exact) seem to recognise Scala source and enable syntax highlighting by default. Luckily, I happen to have IntelliJ IDEA lying around, so I pulled out the Big Boy (because that’s what everyone uses) and installed the canonical Scala plugin. Then it gobbled up all my RAM and made the rest of the computer (including the command line) unresponsive. This evening is gonna be even more interesting than I thought…

Show me tha codez plz

Euler #1 is pretty basic, and does not require much in the way of fancy features, so I will first attempt to do it like a naïve Java developer getting acquainted with Scala syntax (which is exactly what I am). We simply need to loop from 1 to 1000, and keep an accumulator to which we add all multiples of 3 or 5.

After much faffing about (and IntelliJ warming up the laptop nicely while trying to help me out), I came up with this:

package com.example

object Hello {
  def main(args: Array[String]) {
    var result = 0

    for (x <- 1 to 999){
      if((x % 3 == 0) || (x % 5 == 0)) result += x


This produces the correct answer, but it doesn’t feel right. We learn new languages to learn new ways of thinking, not just new syntax, right? So let me try to do this The Functional Way™.

In a purely functional language like Haskell you are almost forced to solve this problem properly, since there are no mutable variables (so no accumulators) and no for-loops (at least, you need to learn a whole lot of the language before you get the tools necessary to create mutable variables and simulate for-loops). So we need to figure out a way to

create a list of all natural numbers → filter based on the condition → filter to the required range → sum

I’m not sure if we can create a list of all the natural numbers in Scala, so we will need to create a range instead, but the other steps should be doable for any language which dares call itself a functional language. In particular, the syntax “a to b” is Scala sugar for creating a range (which is ostensibly a Sequence, so I guess it is more like a lazy list in Haskell than a Java array).

Since filter is a member method of AbstractSeq, and Sequence inherits from AbstractSeq, we put our OOP hats on and start inserting full stops to chain actions on our initial list (with IntelliJ’s autocomplete and suggestions helping out):

package com.example

object Hello {
  def main(args: Array[String]) {
    println( (1 to 999).filter(x => (x % 3 == 0) || (x % 5 == 0)).sum )

This is pretty much what one would expect, really. The only things that are new to me are the lambda syntax (=>), and the fact that I didn’t need to put round brackets on the sum method call since they are not mandatory if the method does not have any parameters.

Okay, I lied. Putting round brackets at the end of sum causes Scala to complain.

$ activator run
[info] Loading project definition from /home/thejanet/dev/scala/euler/prob1/project
[info] Set current project to prob1 (in build file:/home/thejanet/dev/scala/euler/prob1/)
[info] Compiling 1 Scala source to /home/thejanet/dev/scala/euler/prob1/target/scala-2.11/classes...
[error] /home/thejanet/dev/scala/euler/prob1/src/main/scala/com/example/Hello.scala:5: not enough arguments for method sum: (implicit num: Numeric[B])B.
[error] Unspecified value parameter num.
[error]     println((1 to 999).filter(x => (x % 3 == 0) || (x % 5 == 0)).sum())
[error]                                                                     ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed 08 Jul 2015 4:04:16 PM

Unspecified value parameter num“?!? Implicit parameters?!? Spooky action at a distance?!? I’d rather not think about that right now.

For completeness sake, here is my solution to the problem in Haskell:

import Data.List

mult35 = union (takeWhile (< 1000) $ iterate (+3) 0)
 (takeWhile (< 1000) $ iterate (+5) 0)

answer = sum . takeWhile (<= 1000) $ mult35 
main = do 
    print answer

Note that this is hardly a canonical answer. Instead of creating a list of natural numbers and then filtering them out, I am actually creating two lists — all multiples of 5 and all multiples of 3 — and then merging the two and truncating. I’m not sure why, but I guess that two years ago I thought it would be a good idea to write it that way (I’m sure the cost of merging two lists while eliminating duplicates — a quadratic time complexity operation — is a lot higher than linearly filtering a list once)?

Here’s a more canonical answer in Haskell:

answer = sum
   . filter (\x -> (x `mod` 5 == 0) || (x `mod` 3 == 0))
   $ [1..999]

main =
    print answer

Note how the code looks almost exactly like the Scala version, except that the operation order is reversed (Haskell uses function composition, which goes from right to left, while OOP-style method invocation goes from left to right). Scala seems like a nice compromise between nouns and verbs.

Euler 2

Euler problem 2 is just interesting enough as to not be trivially doable in a functional programming language (most of Project Euler is not trivially doable in any language — problem 1 was the bait, the rest of the site is the switch). The easy way to do it in Haskell is to generate a list of all the Fibonacci numbers, and then filter on size and parity, and finally sum.

The tricky bit is generating the list of Fibs. In Haskell lists can be coinductive; that is, the definition for the list ξ can refer to ξ itself on the right hand side of the definition, and this is due to fact that lists are lazy by default in Haskell. I still find this behaviour magical to this day.

I’ll have to assume for the moment that Scala can’t do coinductive lists, at least not trivially, based on zero evidence whatsoever.

So, how do you do the Fibonacci numbers when you can’t rely on Haskell magic? Well, the most obvious solution is to just use the standard definition of the Fibonacci numbers, define a function based on it, then just iterate through the function values (or map the function over the list of natural numbers, taking advantage of the fact that a function mapped over the naturals is isomorphic to a natural-indexed array/list). The problem here is that calculating the Fibonacci function for an arbitrary natural number is expensive.

So, let’s over-engineer this a little bit, before even trying to see whether the running time of a naive implementation would be acceptable. Let’s see if we can do memoisation in Scala and take advantage of how easy it is to mutate data (as compared to Haskell).

Well, how does one do memoisation in Scala? A quick Google gave me this: Okay, so maybe lets try to naive way first, because I have no idea what Memo[A,B](f: A => B) extends (A => B) means at this moment 😕.

Using this page as a guide on how to do switch-like pattern matching in Scala, I come up with the following:

package com.example

object Hello {
  def main(args: Array[String]) {
        .takeWhile(_ <= 4000000)
        .filter(_ % 2 == 0)

  def fib(i : Int): Int = i match {
    case 0 => 1
    case 1 => 2
    case _ => fib (i - 1) + fib (i - 2)

Here are a few interesting things to note about this code:

  • I used Stream.from to create a lazy infinite list. So Scala does have them after all! In fact, while researching Stream I came upon the following little gem: Not only is coinduction so possible in Scala (and easy to express), Scala Stream also comes with memoisation built-in! Damn!
  • I’m using anonymous lambda parameters (f(_) instead of x => f(x)). Now we’re cooking with fire!
  • This is not as slow as I thought it would be. It only takes 6 seconds on this laptop (with IDEA the juggernaut slowly eating what little RAM and CPU I still have left). The equivalent Haskell code (using coinduction) takes 4 seconds.

So, luckily this problem was just simple enough that we didn’t need memoisation or BigInt to do it. The rest of the Euler problems dealing with Fibonacci numbers are not so easy.

For completeness, here is the Haskell code I wrote to solve this problem 2 years ago:

fib :: Num a => a -> a -> [a]
fib n1 n2 = n1:n2:zipWith (+) (fib n1 n2) (tail (fib n1 n2))

answer = sum . filter even . takeWhile (<= 4000000) $ fib 1 2
main = do
    print answer

My neophyte coding choices are even more bizarre in this code than they were in my original Haskell solution to problem 1. To compare apples to apples, let’s rewrite this to match the Scala code above:

fib 1 = 1
fib 2 = 2
fib n = fib (n - 1) + fib (n - 2)

answer = sum 
    . filter even 
    . takeWhile (<= 4000000) 
    . map fib 
    $ [1..]

main =
    print answer

Apart from the fact that Haskell has a built-in even function (and the bit about the operations being reversed), the semantics and structure of this code resemble those of the Scala code exactly, modulo syntax. This code also runs 10 times faster than the first incarnation, at 0.4s; machine code ftw.

In conclusion

Scala, at least the basics of it, is not that scary. I must say, of all the popular JVM-based languages out there, so far I seem to like it more than the rest (Clojure has those LISP brackets everywhere; and you could probably write a Hebrew poem in a text file and Groovy would still compile it without complaining, only to explode at run-time).

Stuff I have learned includes:

  • Typesafe Activator is simple and straight-forward. Almost like Haskell’s Stack, although the latter has the annoying “feature” that it tends to ignore dependency files used by Haskell’s default dependency manager. And like Stack it pulls literally terrabytes of data to bootstrap itself when you have an absolutely clean computer.
  • Scala has nice syntactic sugar to create ranges (reminds me of Python), and also has a very powerful API for lazy streams. I think I’m falling in love.
  • Scala’s TraversableLike — which is the ancestor of Scala’s list-like data structures — supports the full gamut of functional higher-order functions. This by default makes great classes of algorithms infinitely more expressible. Why should “Do this to every member of the list” have to first be translated to “Create a method, create a new list, and write a for-loop”?
  • Coupled with the point above, Haskell’s elegance and expressivity seems to translate almost mechanically to Scala. This is a Good Thing®.
  • IntelliJ IDEA will make a perfectly good computer feel like a relic from the noughties.
Posted in Uncategorized | Leave a comment

Revising the BitX haskell bindings

Context: my previous blog post announcing the BitX bindings.

So a week ago I released the first version of the bitx-bitcoin library and lo and behold! I am now releasing a major revision. I decided that it would probably be a good idea to document what has changed so that the dozen-or-so loyal users of my library (most of them are probably mirrors, but whatever) can get some context before they inundate me with complaints about breaking changes impacting their high-availability financial systems.

More importantly, I thought I would like to briefly share all the shiny new improvements introduced in the code, primarily as a result of discussions on this Reddit thread. The purpose here is to learn together, so let’s start.

Tips and tricks

Most of the changes are refactorings with zero impact on functionality. These are basically either idioms I knew about but overlooked due to lack of experience, or brand knew tricks I was not aware of.

Scientific instead of Decimal

One of the major design decisions in the first incarnation of the library was figuring out a numeric data type to use to avoid errors due to rounding in financial values. As a general principle, one should never use floating point for financial arithmetic. I looked around and decided to use the relatively-not-popular Decimal package. I think my Google-fu failed me in this instance.

It turns out that there is a more popular alternative with essentially the same functionality, with the added benefit that it is a dependency of aeson: the scientific package. Due to the magic of typeclasses, the change is simply searching for the word Decimal and replacing it with Scientific. Fewer transitive dependencies is always a good thing, methinks.

Incidentally, this was the first of two API-breaking changes, and thus warranted bumping up the major version number of the package.

Fun with Applicative (and friends)!

Oooohh. The A-word! I was shown an oportunity to use a very common Haskell idiom in the core network code. Basically, the idiom goes something like this:

If you want to try various alternatives, each one using the Maybe monad to signify success or failure, with a final default option should all others fail, then use the Applicative (and Alternative) instances of Maybe.

In particular, this

let a = doSomeStuff
case a of
  Just α  -> return . prepareAnswerA $ α
  Nothing -> do
    let b = doSomeOtherStuff
    case b of
      Just β  -> return . prepareAnswerB $ β
      Nothing -> return defaultAnswer

can be rewritten as this

fromJust $
      prepareAnswerA <$> doSomeStuff
  <|> prepareAnswerB <$> doSomeOtherStuff
  <|> Just defaultAnswer

To understand this a bit better, we need to look at the Applicative instance of Maybe:

instance Applicative Maybe where
  pure = Just

  Just f  <*> m  = fmap f m
  Nothing <*> _m = Nothing

<$> is an alias for fmap, in that f <$> m = fmap f m. fmap comes from the Functor instance:

instance Functor Maybe where
  fmap _ Nothing  = Nothing
  fmap f (Just a) = Just (f a)

which seems like an obvious definition given that Functor is simply the typeclass of things which provide a mapping operation. Putting everything together, we basically have the following:

g <$> Just a  = Just . g $ a
g <$> Nothing = Nothing

So, that seems simple enough. Take a pure function g, and return a Maybe which is either g applied to the contents of a Just, or Nothing. The Alternative instance of Maybe looks like this:

instance Alternative Maybe where
  empty = Nothing

  Nothing <|> r = r
     l    <|> _ = l

Don’t worry about empty. Rather, note that if the left of <|> is Nothing then it will try the right, otherwise it will return the left. That is, a chain of <|>s will try a sequence of Maybes until it finds the first Just, and it will return the first Just (<|> is left associative) or the final element (which might be Nothing if there are no Justs in the sequence).

After we have tried all the alternatives, and given a default response, we then use fromJust to convert the entire thing into a pure value. Note that since in general we might potentially not find a single Just, our program could crash at run time since fromJust is a partial function. This is not an issue in our use of the idiom since we give a default value as the final option.

For completeness, the code in question in the BitX bindings now looks like this:

  :: BitXAesRecordConvert rec aes 
  => Response BL.ByteString -> BitXAPIResponse rec
bitXErrorOrPayload resp = fromJust $
      ErrorResponse . aesToRec <$> Aeson.decode body
  <|> ValidResponse . aesToRec <$> Aeson.decode body
  <|> Just (UnparseableResponse  resp)
      body = NetCon.responseBody resp

Note the apparently repeated code on lines 5 and 6. This is due to the FunctionalDependencies magic I explained in the previous blog post (Haskell will infer the correct — and different — types automatically).


This one was an idiom I did not recognise, although I had read a little about the extension in question. Essentially, the RecordWildcards extensions enables you to use — under certain contexts — record field accessor functions as if they were field names. Recall the following snippet from the previous blog post:

data Ticker_ = Ticker_
  { ticker'timestamp :: TimestampMS
  , ticker'bid :: QuotedDecimal
  , ticker'ask :: QuotedDecimal
  , ticker'last_trade :: QuotedDecimal
  , ticker'rolling_24_hour_volume :: QuotedDecimal
  , ticker'pair :: CcyPair

$(AesTH.deriveFromJSON AesTH.defaultOptions{AesTH.fieldLabelModifier = last . splitOn "'"}

instance BitXAesRecordConvert Ticker Ticker_ where
  aesToRec (Ticker_ ticker''timestamp ticker''bid ticker''ask ticker''lastTrade
    ticker''rolling24HourVolume ticker''pair) =
    [record| {timestamp = tsmsToUTCTime ticker''timestamp,
      bid = qdToDecimal ticker''bid,
      ask = qdToDecimal ticker''ask,
      lastTrade = qdToDecimal ticker''lastTrade,
      rolling24HourVolume = qdToDecimal ticker''rolling24HourVolume,
      pair = ticker''pair} |]

Note, in particular, that, not only do we need to add prefixes to the record field names to get around the records’ problem in Haskell, we also do a little pattern matching in the instance declaration. I thought I had two choices when naming the fields to pattern match on:

  1. Use the same names as in the record declaration, in which case the Haskell compiler will complain about the names shadowing the functions defined in the record declaration (because in Haskell these are accessor functions in global scope, they are not really field names): aesToRec (Ticker_ ticker'timestamp ...) ... timestamp = ticker'timestamp, or
  2. Make the names I use to pattern match different from the field names, in which case the Haskell compiler complains about me not using the “field names” I created in the record declaration (remember, the only real reason I had to name these fields was so that Data.Aeson.TH could do its magic): aesToRec (Ticker_ ticker''timestamp ...) ... timestamp = ticker''timestamp.

Of course, there was a third alternative I didn’t consider at the time — that of actually using the “field” names in the record declaration for what they are, accessor functions: aesToRec tic ... timestamp = ticker'timestamp tic. But the RecordWildCard extension gives us something much better than all three alternatives: we get to pretend, if only for a little bit, that the record “fields” are actually field names rather than accessors.

instance BitXAesRecordConvert Ticker Ticker_ where
  aesToRec (Ticker_ {...}) =
    [record| {timestamp = tsmsToUTCTime ticker'timestamp,
      bid = qdToDecimal ticker'bid,
      ask = qdToDecimal ticker'ask,
      lastTrade = qdToDecimal ticker'lastTrade,
      rolling24HourVolume = qdToDecimal ticker'rolling24HourVolume,
      pair = ticker'pair} |]

This is not an earth-shattering refactor, but it’s nice to use when you can find an opportunity. As a side-effect, I no longer had to suppress the Unused bindings warnings the compiler was producing due to the way I had done this initially.

Other changes

Other changes were more specific to this package rather than Haskell idioms.

Removing the Auth module

I had to remove Network.Bitcoin.BitX.Private.Auth. It never felt quite right including it in the first place (it sounds more like something that should be done manually, as with the rest of the steps in the OAuth2 section of the BitX API docs), and the API’s behaviour didn’t quite match the rest of the API, and I never got around to testing it properly (by asking for a developer key). I have communicated with the BitX developers, and apparently a revision of the Auth endpoint is in the works, so I shall consider re-introducing the bindings once that work has been done.

Creating smaller sub-modules

I divided the main Network.Bitcoin.BitX.Private module into smaller sub-modules. Nothing to see here…

Adding a new Create Account function

The BitX developers created a new API endpoint for creating an account. I had to wait a day or two before releasing the library while they fixed an issue I discovered during testing, as well as another, milder, issue with the endpoints for withdrawals.

In conclusion

Apparently, announcing the release of this library on Reddit was A Good Idea™. It’s amazing how much one can improve their code and understanding once they ask others to take a look at their work.

Incidentally, reports about this legendary helpfulness and friendliness of the Haskell community was one of the main reasons which led me to start learning Haskell over LISP a few years ago, after I had made up my mind to find out what the fuss over functional programming was all about.

On a completely unrelated note, coding against the BitX API is starting to feel like aiming at a moving target. So much so, in fact, that for a moment I considered abandoning the principles of the semantic versioning system and using minor versions for API changes. It looks strange to release a major revision after only a week, and I have a feeling that this won’t be the last time a change in the BitX endpoints will prompt me to review my library interface again. This fact, coupled with how developers typically refer to library version numbers in their Cabal files, means that a revision of this library could potentially become “bitrotten” in a matter of a few weeks. So maybe you should just not have upper bounds — and use qualified imports — when using this library…

Oh, by the way, I used hscolour to generate the highlighting for the code snippets in this post (because free WordPress does not support Haskell code highlighting, and F# turns out to be a terrible proxy as evidenced by my previous blog post). Next Haskell post I will try to use BlogLiterately.

Posted in Haskell, Programming | Tagged , , , | Leave a comment

Trials and tribulations of writing my first “real” Haskell package

A little over a month ago I decided to be serious and write my first truly public (meaning “Hosted on Hackage for anyone to use easily”) Haskell project. I had the unfortunate idea of chronicling my attempt in a blog post, in the spirit of many other previous blog posts (see, for example I’m hoping that this particular blog post would provide a slightly different perspective, seeing as how I’m a Newbmeister from N00benstein and I’m still figuring this stuff out.

Note: It’s probably worth pointing out that the code snippets below are for illustrative purposes only. If you are suspicious, and wish to see valid code, go view the first release version of the actually source code.

A little background on me

I consider myself a Haskell beginner. At least, even though I have tackled 65 problems on Project Euler in Haskell (though, bizarrely, I can’t recover my account ever since I lost my account-manger-generated password) and have written one or two very small utilities for use in my linguistics’ research, and I have a non-negligible amount of experience with Java, I have never written any “real world” Haskell code. I do not consider number theory and list tricks — the type of stuff necessary to solve Project Euler problems, for instance — to be real-world. It’s super-awesome and super-important, but a mastery of knot-tying, and only knot-tying, hardly makes one a productive programmer. In particular, it is well-acknowledged that although there are tonnes of beginner articles and tutorials on Haskell, and a bucket-load of articles and papers outlining type-level acrobatics using terrifying lists of GHC extensions, there is a definite dearth of intermediate-level posts and tutorials. I do not claim to be an intermediary Haskell developer, or even a productive Haskell developer, but I believe that one needs to be an intermediate developer to start being productive. To this end, I was on the lookout for a problem to solve, which was niche enough that I wouldn’t be worsening the signal-to-noise ratio, yet not so specialised as to prove useless to everybody else (for examples see the other projects on my Github profile…)

The problem I sought to solve

Bitcoin is awesome. There are numerous exchanges which provide you the opportunity to lose a few thousand dollars in a day, and a rapidly growing number of online shops now accepting bitcoin as payment. However, to get in the bitcoin game, one needs to buy some coins first, and laws regarding money laundering and terrorism mean that there aren’t a whole lot of places to convert cash into bitcoins easily. As far as I’m aware, BitX is the only legal (following KYC and FICA identification laws) way for South Africans to buy bitcoin. Last month I found out that not only do they expose a(n ostensibly) RESTful API, but they also already have 3rd party bindings in Go, NodeJS, Ruby, and Android, but none in Haskell. Challenge accepted.

Getting started


I will not go into beginning a new Haskell library. There are various tutorials on that on the intertubes, such as this excellent article by Chris Allen.

Project outline

So, I need to provide a Haskell interface to a JSON-driven REST API. This means we need

  • Some way to connect to the BitX service. We need some way to make GET, POST, PUT, and DELETE http requests to an https end-point, and consume the response body (which will usually be in JSON format).
  • Some way to consume the JSON response, and convert it to some data-type so that the user can lose a lot of money.

Choosing an IDE

Every decent beginner-friendly language has a decent IDE and/or editor, right? I’ve used TPX for Turbo Pascal and IDLE for Python. Java programmers are spoiled for choice, and the canonical C# IDE is also freaking amazing. I don’t know what the fuck’s up with Haskell. The state of editing tools in Haskell is a bit sad. As a lowly Haskell beginner I would like to just concentrate on writing code and getting helpful visual feedback, rather than wrangling Vim and EMacs. I know, it’s scandalous that I just want Haskell to work when I sit down with my laptop, right? Now, I used to have the perfect Haskell IDE — or as close to one as I was willing to get without sacrificing a she-goat to the full moon — thanks to the incredible Haskell-Vim-Now, but that all went out the window ever since GHC-MOD has starting acting up and making almost the entirety of the Haskell IDE ecosystem not-entirely-useful (or at least not trivially useful). Haskell IDEs and IDE extensions which rely on various 3rd party tools are unfortunately fragile (at least, judging from the days wasted trying to figure out what’s up with my Vim setup). Fortunately, Leksah — while allegedly not quite as good as a working Vim or EMacs setup — is robust and generally works out of the box, and I eventually settled on it as my primary development environment. FPComplete’s online FP Haskell Center is also pretty sweet, especially when I was away from my Ubuntu laptop (my library choices mean that the project does not work in Windows; sorry).

Choosing tools

We need to parse JSON, so we’ll make the obvious choice and try to figure Aeson out as we go along.


I needed to convert JSON data into Haskell records. Luckily, Aeson tutorials are a dime-a-dozen, and some of them are even helpful. But let’s get that out of the way; did you know about Data.Aeson.TH? It’s pretty shweet. But first…

Aside: the Haskell records’ problem

To use Data.Aeson.TH we need to do something like this:

data RecordDeal = RecordDeal {
    artiste :: Text,
    album :: Text,
    wackness :: Int

$(deriveJSON id ''RecordDeal)

Now, imagine doing this for a REST API. There would be a dozen fields all called id and status. Haskell “records” are actually just normal data-types with a little syntactic sugar, and in particular the “field names” are actually functions in global scope. So we can’t have two records with the same field names, and we probably shouldn’t have a field called id as that would clash with the identity function. The ugly “solution”, for now, is to prefix the record field names with some unique prefix such as the name of the record. Data.Aeson.TH acknowledges this, and thus allows us to modify field names when generating Aeson instances (notice the drop 11).

data RecordDeal = RecordDeal {
    recordDeal'artiste :: Text,
    recordDeal'album :: Text,
    recordDeal'wackness :: Int

$(deriveJSON (drop 11) ''RecordDeal)

This works, but terribly inconveniences the end-user of our library. Luckily Nikita Volkov has heard the lamentations and gnashing of teeth of the Haskell and decided to do something about this.

Volkov records

Volkov’s Record library solves “the record problem” (field names clashing) using some DataKinds trickery. Note that it doesn’t go all the way and provide extensible records &c, but it’s perfect for when we just want to use records sanely. So, we

  1. Send a request to BitX,
  2. Get a JSON response,
  3. Parse the JSON data into a temporary Haskell record, and then
  4. Return a Volkov record back to the user.

HTTP requests

Surprisingly, it’s not straightforward in Haskell to contact a REST endpoint hiding behind TLS. Simple http is pretty straightforward, but as soon as you throw SSL in there you are in trouble. Ultimately, I found three choices:

  • Wreq is the new cool kid on the block, but it felt a bit too big to me, requiring a tonne of transitive dependencies (looking at you, lens).
  • The BitX API reference presents examples using curl, so is it too much to ask for a simple curl-like library in Haskell? The curl package provides just that, a bit too literally, and I wasn’t satisfied with the fact that it is a simple set of bindings to the 3rd party libcurl.
  • Eventually I settled on HTTP-Conduit. Maybe that’s ironic since I decided that wreq had too much baggage, but when I hit upon this package I was frustrated and just wanted to start writing some code.

I think that choosing HTP-Conduit was probably a good choice, though unfortunately it means that my library refuses to build on Windows for some reason (I haven’t bothered to figure out the exact course, but somewhere someone wants to install the HTTP package, which seems impossible to build on Windows, and that ain’t my fault…)


We will be working with financial data, and the BitX API reference has an ominous warning:

Prices and volumes are always represented as a decimal strings e.g. “123.3432”. We use strings instead of floats to preserve the precision.

One should not use floating point for financial code. I needed some way of dealing with financial numbers without using potentially-error-prone floating point, and eventually settled on the Decimal package, which provides an easy-to-use data-type with all the numerical instances one would expect (and a few you shouldn’t; see note #4 here). Their workaround — putting quotes around numbers — resulted in a minor complication I will go into later on.

Writing some code

After a few days of omphaloskepsis, it was time to write some code. For illustration, let us consider a single BitX endpoint which doesn’t require credentials (I refer to this as the “public API”): the public ticker. We will send a GET request to, with a single get parameter pair set to the currency pair we wish to check, and it will return data as follows

  "ask": "1050.00",
  "timestamp": 1366224386716,
  "bid": "924.00",
  "rolling_24_hour_volume": "12.52",
  "last_trade": "950.00"
  "pair" : "XBTZAR"

Note the highlighted line, which was left out of the BitX documentation. Eventually, we wish to give the user a Volkov record as follows

    {ask = 1050,
     timestamp = ,
     bid = 924,
     rolling24HourVolume = 12.52,
     lastTrade = 950,
     pair = XBTZAR} |] :: Ticker

I’ll talk about how I dealt with timestamps and money values in a little bit. The function to provide this looks a little like this

getTicker :: CcyPair -> {some kind of return type}

But there are a few complications.

Complication 1

Each BitX API endpoint can either return some data, or return an error record. So, rather than just having Ticker as our return type, we need to do something like

Either BitXError Ticker


type BitXError =
        {error :: Text,
         errorCode :: Text} |]

Complication 2

The real world is scary. The http call could fail, or BitX might barf and return bad data. It looks like I actually need something more akin to this

Maybe (Either BitXError Ticker)

At least, that’s what I initially chose. Returning Nothing, instead of giving a bit more info about the error, is a bit of a cop out, though, so I had to come up with something a bit better. So let’s enumerate all the possible outcomes of a call:

  1. It works perfectly, and we can parse and return the Volkov record we were expecting,
  2. BitX returns an error record, and we can parse and return the error Volkov record we use for everything,
  3. The networking code could throw an exception (such as if your internet connectivity goes down), in which case all we have really is the Exception text, or
  4. The network code works, but BitX returns some unparseable data (not the type we anticipated, nor a BitXError), in which case all we can do is just give the user the full HTTP repsonse (headers, body, &c) which BitX gave us.

These possibilities suggest the following algebraic data type:

data BitXAPIResponse rec =
      ExceptionResponse Text
    | ErrorResponse BitXError
    | ValidResponse rec
    | UnparseableResponse (Response ByteString)

Note that Response ByteString is precisely what HTTP Conduit returns when we make a web call, and this type can be queried for HTTP response code, response body, and so forth.

Complication 3

This is more common sense, really. Since we are changing the real world, and the answers we get are not pure, we need to wrap the whole thing in the IO monad.

IO (BitXAPIResponse rec) 

This is true in general of any properly-written library with an IO component. As an interesting exception, consider the OEIS library: it makes sense in this case to unwrap the IO actions with the notorious unsafePerformIO, since although the library needs to access a web service to fetch an integer sequence, the responses are predictable and “pure” as each sequence is uniquely identified.

Complication 4

This one was not straightforward to figure out a fix for. BitX has interesting ideas regarding the format of timestamps and financial numbers. As noted above, they put numbers in quotation marks to “preserve the precision.” However, the  FromJSON instance created by Aeson (rightfully) refuse to see numbers in quotes as numbers, even though this “wouldn’t” be a problem in Javascript. Moreover, they do not follow their own convention all the time, as evidenced by the examples for the two transactions’ endpoints. The timestamp format is in milliseconds since Unix Epoch, so we can’t just rely on Aeson to automagically create FromJSON instances for Haskell records which will have to parse BitX timestamps. These two issues had me scratching my poor n00b head for a long time. Should I abandon $(deriveJSON) and just write a few dozen FromJSON instances by hand? Should I risk the ire of library users by creating orphan instances in the form of instance FromJSON Decimal? I knew there must be a better way to solve what seemed like a non-novel problem, so at one of our Lambda Luminaries‘ meetups I asked around and fellow member Theunis told me about what turned out to be a rather well-known trick: creating a newtype with a FromJSON instance, and a function to translate from the newtype to a Decimal. In Haskell, you can have your cake and eat it, too. The situation is similar for the timestamps, though in this case I also had to figure out how work with time in Haskell. This blog post by Vincent Hanquz helped a lot.

-- | Wrapper around Decimal and FromJSON instance, to facilitate automatic JSON instances

newtype QuotedDecimal = QuotedDecimal Decimal deriving (Read, Show)

instance FromJSON QuotedDecimal where
  parseJSON (String x) = return . QuotedDecimal . read . Txt.unpack $ x
  parseJSON (Number x) = return . QuotedDecimal . read . show $ x
  parseJSON _ = mempty

qdToDecimal :: QuotedDecimal -> Decimal
qdToDecimal (QuotedDecimal dec) = dec

-- | Wrapper around UTCTime and FromJSON instance, to facilitate automatic JSON instances

newtype TimestampMS = TimestampMS Integer deriving (Read, Show)

instance FromJSON TimestampMS where
  parseJSON (Number x) = return . TimestampMS . round $ x
  parseJSON _ = mempty

tsmsToUTCTime :: TimestampMS -> UTCTime
tsmsToUTCTime (TimestampMS ms) = timestampParse_ ms

timestampParse_ :: Integer -> UTCTime
timestampParse_ = posixSecondsToUTCTime
    . fromRational . toRational
    . ( / 1000)
    . (fromIntegral :: Integer -> Decimal)

Notice how in the above code snippet I am converting a JSON String and a Number to a QuotedDecimal (because I could get either one from the BitX API), and how natural the parsing code is. I’m also importing Data.Text qualified as Txt. Then per record type, we do something like this:

type Ticker =
  {ask :: Decimal,
   timestamp :: UTCTime,
   bid :: Decimal,
   rolling24HourVolume :: Decimal,
   lastTrade :: Decimal,
   pair :: CcyPair} |]

data Ticker_ = Ticker_
  { ticker'timestamp :: TimestampMS
  , ticker'bid :: QuotedDecimal
  , ticker'ask :: QuotedDecimal
  , ticker'last_trade :: QuotedDecimal
  , ticker'rolling_24_hour_volume :: QuotedDecimal
  , ticker'pair :: CcyPair

$(AesTH.deriveFromJSON AesTH.defaultOptions{AesTH.fieldLabelModifier = last . splitOn "'"}

instance BitXAesRecordConvert Ticker Ticker_ where
  aesToRec (Ticker_ ticker''timestamp ticker''bid ticker''ask ticker''lastTrade
    ticker''rolling24HourVolume ticker''pair) =
  [record| {timestamp = tsmsToUTCTime ticker''timestamp,
    bid = qdToDecimal ticker''bid,
    ask = qdToDecimal ticker''ask,
    lastTrade = qdToDecimal ticker''lastTrade,
    rolling24HourVolume = qdToDecimal ticker''rolling24HourVolume,
    pair = ticker''pair} |]

Wait a second! What the hell is that instance BitXAesRecordConvert on line 22??!! Well, I’m glad you asked.

Type-level acrobatics

It’s probably worth pointing out that at this point I am already using  a small list of GHC extensions: QuasiQuotes, TemplateHaskell, DataKinds (this last one is only needed in GHC 7.10 and upwards because of the code generated by Record’s Template Haskell), all because of the Record package. We are going to double that number in just a bit. As mentioned before, and as should be obvious from thinking about the problem I was trying to solve (interfacing with a tonne of REST endpoints which each return their own special JSON data), I need to write similar code to what I did with Ticker and Ticker_ a lot (incidentally, I wish I knew enough Template Haskell to whittle down all that boilerplate to something better, a la the ridiculous source code for Record.Types), and although I could have invented a Haskell-record-to-Volkov-Record conversion function for every record type, this would have sacrificed reuseability (see also the network code below). I needed some way to say

  • If I give you an A_, it can be converted to an A, and
  • It should happen automatically, and the conversion function should be polymorphic.

I viewed this as sort of a mapping between types, where rather than saying that 2 maps to 4, I can say that A_ maps to A, and define this mapping for each element of the domain. You know what? I think I might have seen something along those lines while half-heartedly reading tutorials on GHC extensions… Well, as it turns out, the answer to “How do I establish a relationship between types” is FunctionalDependencies, and MultiParamTypeClasses (since you need an extension to have a typeclass with more than one parameter in Haskell), and apparently also FlexibleInstances, according to the GHC error messages…

class (FromJSON aes) => BitXAesRecordConvert rec aes | rec -> aes where
    aesToRec :: aes -> rec

Trying to just do a typeclass without FunctionalDependencies, “could” have worked, except that the extension gives us special super powers. In particular, the core API code does something like this (this is very crappy pseudocode, not Haskell):

doIt link = httpCall link --> Aeson_decode --> aesToRec --> return_result

getTicker ccy = doIt ( :: Ticker 

In this case, although doing this without FunctionalDependencies would have “worked” (sort of), the compiler complains that “Yes, I see you want a Ticker as a result, but I have no idea what type you are expecting as the input of aesToRec, and so I can’t type-infer Aeson_decode.” By using FunctionalDependencies and stating “| rec -> aes“, I am saying “If I tell you I want a Ticker, and I have told you that BitXAesRecordConvert Ticker Ticker_, then that means that the input to aesToRec has to be a Ticker_.” That is, it makes aesToRec a one-to-one function between types. So doIt can be a fully polymorphic function, where I only have to worry about giving it the final Volkov record type I want, and it will figure out exactly what type of JSON conversion is needed. FlexibleInstances seems like it’s needed because some of the instances look more like

instance BitXAesRecordConvert [Balance] Balances_ where

where the BitX call returns a simple array wrapped in an object with one field (which Aeson parses to Balances_), and I would rather return a list of objects rather than a thin wrapper over a list. In this instance, we “can’t” use a specific list type in an instance declaration without enabling the FlexibleInstances extension.

Putting it all together

For “completeness'” sake (there is actually a whole lot I am leaving out; see below), here is the code used for connecting to the public API through GET:

bitXAPIRoot :: String
bitXAPIRoot = ""

consumeResponseBody_ :: BitXAesRecordConvert rec aes => Either SomeException (NetCon.Response BL.ByteString)
    -> IO (BitXAPIResponse rec)
consumeResponseBody_ resp =
    case resp of
        Left ex -> return $ ExceptionResponse . Txt.pack . show $ ex
        Right k -> bitXErrorOrPayload k

bitXErrorOrPayload :: BitXAesRecordConvert rec aes => Response BL.ByteString -> IO (BitXAPIResponse rec)
bitXErrorOrPayload resp = do
    let respTE = Aeson.decode body -- is it a BitX error?
    case respTE of
        Just e  -> return . ErrorResponse . aesToRec $ e
        Nothing -> do
            let respTT = Aeson.decode body
            case respTT of
                Just t  -> return . ValidResponse . aesToRec $ t
                Nothing -> return . UnparseableResponse $ resp
        body = NetCon.responseBody resp

simpleBitXGetAuth_ :: BitXAesRecordConvert rec aes => BitXAuth -> String -> IO (BitXAPIResponse rec)
simpleBitXGetAuth_ auth verb = withSocketsDo $ do
    response  IO (BitXAPIResponse [Balance])
getBalances auth = simpleBitXGetAuth_ auth "balance"

The last two lines are in the private API since they require a BitXAuth Volkov record (which simply has a key ID and secret). NetCon is Network.HTTP.Conduit, and on lines 24 to 28 is typical usage of using it to make a GET call with basic authentication (exactly the same as curl -u user:pass). Notice the use of try on line 25, which, coupled with line 29, means that we will catch all possible exceptions thrown by the connection code. consumeResponseBody_ then checks to make sure there was no exception, and passes on the data to bitXErrorOrPayload which tries to parse a BitXError Volkov record, and failing that tries to parse the Volkov record type we are expecting, and failing that returns Nothing (because BitX returned data in a funny format which was neither an error nor the data we expected). The FunctionalDependencies extension gives GHC enough clues so that it can type-infer this whole mess without a hitch. And that’s almost the entirety of the magic in the actual business logic.

Extra thoughts

There were a few more problems I had to solve, which I did not go into above, but which I still wish to note.

  1. The common saying that “if it compiles then it probably works” is obviously not true, though in Haskell it gets very close to being true. In particular, with this project I was parsing blobs of text from the Real World, and any form of misinterpretation would result in bugs in my code. The library was already compiling perfectly well, before I discovered the issue with quoted numbers, for example.
  2. Related to the above, I had to figure out unit testing. I chose HSpec, though I now wish I had started with Tasty instead.
  3. In the past, when I was working through Project Euler, I would install every package I needed (and a whole lot more I didn’t need) globally, and after a while I learned about Cabal Hell the hard way. It’s The Future™ now, and we should all be using minimal GHC installs and Cabal sandboxes. Personally I do not use Stackage yet since I just need a little sanity — I do not need enterprise-level stability.
  4. Making this library Hackage/Github-friendly meant a lot of work on stuff that wasn’t source code. For example, setting up Travis-ci and getting it to do parallel multi-GHC installs, as well as writing a bunch of Haddock documentation. For the love of J. F. Christ please include Haddocks in your library code, including usage examples. Remember how it used to feel when you couldn’t use seemingly amazing Haskell libraries because the authors thought that simply providing type declarations should be enough for anyone to figure everything out? Yeah, newbs like me still feel like that.


I dig Haskell like, a lot. I learned a bunch while writing this library which now seems almost trivial in retrospect. Here’s hoping that my thoughts will help someone else so they wouldn’t have to discover some of this stuff themselves.

Posted in Haskell, Programming | Tagged , , | 2 Comments

On marking surface tone in Sesotho

I may very often write posts about Sesotho with special emphasis on its tonology, or some other aspect which touches upon it.

It will thus be necessary to mark tone on Sesotho words, and this will require a system of marking tone which is precise without being burdensome.

It just so happens that Sesotho — and most Bantu languages in general — readily lends itself to a rather simple system of tonal marking and categorisation.

The surface tones of Sesotho

In Marlo’s Verb tone in Bantu languages[1] he shows how the tonal systems of the Bantu languages fall into 3 broad categories when one considers surface tonemes (that is, between underlying tonemes[2] — how the words “look” in the speakers’ lexicon, and the spoken allotones — how the relative pitch of tone-bearing-units varies):

  1. The stereotypical languages, which distinguish between H(igh) and (L)ow or toneless (∅) tones, with L/∅ being the default tone
  2. The reversive languages, which have essentially reversed the original Bantu tones, and have H as the default tone, and
  3. The predictable languages, where tone is reflected in the spoken language but the lexicon makes no tonal distinctions (speakers use pitch, but the pitch depends on the grammar and not on the words themselves).

We could also include those few rare languages which have lost tone, such as Kiswahili, as a final category.

Sesotho is stereotypical. It pretty much preserves the proto-Bantu system (modulo phonological changes including the loss of long vowels) and in those many instances where reconstructed proto-Bantu words have survived into the modern language, the reflexes almost always have essentially the same tones (underlyingly) as proto-Bantu.

Transcribing tone in Sesotho

For a stereotypical language, if we need to transcribe tone, we essentially need only mark high tones on tone-bearing-units.

Moreover, to distinguish between a word which has not been marked for tone, and one which has no high tones, we will mark the first syllable of the stem of a word with only low tones.


  • Mohópólo — a thought, idea; tonal pattern [_ ¯ ¯ _]
  • Monákaládi — the plant cyperus usitatus; tonal pattern [_ ¯ _ ¯ _]
  • Mochòchonono — a long tail, a comet; tonal pattern [_ _ _ _ _]
  • -thúńthetsa — to stander, vilify; tonal pattern [¯ ¯ _ _]

Note in particular that I have marked the individual syllables of the verb in the fourth example. This is near-nonsensical, and a more appropriate system of marking tone for verbs (underlyingly) will be introduced later.

[1] Marlo, Michael R. 2013. Verb tone in Bantu languages: micro-typological patterns and research methods. Africana Linguistica XIX. 137-234.

[2] Take a look at autosegmental phonology for a more exact phrasing of this word “underlying.”

[3] Note the glaring lack of narrow vowel marking. This will be dealt with in a later post.

Posted in Uncategorized | Tagged , , , , | Leave a comment