Trials and tribulations of writing my first “real” Haskell package

A little over a month ago I decided to get serious and write my first truly public (meaning “hosted on Hackage for anyone to use easily”) Haskell project. I had the unfortunate idea of chronicling my attempt in a blog post, in the spirit of many previous blog posts (see, for example, http://jabberwocky.eu/2013/10/24/how-to-start-a-new-haskell-project/). I’m hoping that this particular post will provide a slightly different perspective, seeing as how I’m a Newbmeister from N00benstein and I’m still figuring this stuff out.

Note: It’s probably worth pointing out that the code snippets below are for illustrative purposes only. If you are suspicious, and wish to see valid code, go view the first release version of the actual source code.

A little background on me

I consider myself a Haskell beginner. Even though I have tackled 65 problems on Project Euler in Haskell (though, bizarrely, I can’t recover my account ever since I lost my account-manager-generated password), have written one or two very small utilities for use in my linguistics research, and have a non-negligible amount of experience with Java, I have never written any “real world” Haskell code. I do not consider number theory and list tricks (the type of stuff necessary to solve Project Euler problems, for instance) to be real-world. It’s super-awesome and super-important, but a mastery of knot-tying, and only knot-tying, hardly makes one a productive programmer. In particular, it is well-acknowledged that although there are tonnes of beginner articles and tutorials on Haskell, and a bucket-load of articles and papers outlining type-level acrobatics using terrifying lists of GHC extensions, there is a definite dearth of intermediate-level posts and tutorials. I do not claim to be an intermediate Haskell developer, or even a productive Haskell developer, but I believe that one needs to be an intermediate developer to start being productive. To this end, I was on the lookout for a problem to solve, which was niche enough that I wouldn’t be worsening the signal-to-noise ratio, yet not so specialised as to prove useless to everybody else (for examples see the other projects on my Github profile…).

The problem I sought to solve

Bitcoin is awesome. There are numerous exchanges which provide you the opportunity to lose a few thousand dollars in a day, and a rapidly growing number of online shops now accepting bitcoin as payment. However, to get in the bitcoin game, one needs to buy some coins first, and laws regarding money laundering and terrorism mean that there aren’t a whole lot of places to convert cash into bitcoins easily. As far as I’m aware, BitX is the only legal (following KYC and FICA identification laws) way for South Africans to buy bitcoin. Last month I found out that not only do they expose a(n ostensibly) RESTful API, but they also already have 3rd party bindings in Go, NodeJS, Ruby, and Android, but none in Haskell. Challenge accepted.

Getting started

Preliminaries

I will not go into beginning a new Haskell library. There are various tutorials on that on the intertubes, such as this excellent article by Chris Allen.

Project outline

So, I need to provide a Haskell interface to a JSON-driven REST API. This means we need

  • Some way to connect to the BitX service. We need some way to make GET, POST, PUT, and DELETE http requests to an https end-point, and consume the response body (which will usually be in JSON format).
  • Some way to consume the JSON response, and convert it to some data-type so that the user can lose a lot of money.

Choosing an IDE

Every decent beginner-friendly language has a decent IDE and/or editor, right? I’ve used TPX for Turbo Pascal and IDLE for Python. Java programmers are spoiled for choice, and the canonical C# IDE is also freaking amazing. I don’t know what the fuck’s up with Haskell. The state of editing tools in Haskell is a bit sad. As a lowly Haskell beginner I would like to just concentrate on writing code and getting helpful visual feedback, rather than wrangling Vim and Emacs. I know, it’s scandalous that I just want Haskell to work when I sit down with my laptop, right? Now, I used to have the perfect Haskell IDE (or as close to one as I was willing to get without sacrificing a she-goat to the full moon) thanks to the incredible Haskell-Vim-Now, but that all went out the window when GHC-MOD started acting up, making almost the entirety of the Haskell IDE ecosystem not-entirely-useful (or at least not trivially useful). Haskell IDEs and IDE extensions which rely on various 3rd party tools are unfortunately fragile (at least, judging from the days wasted trying to figure out what’s up with my Vim setup). Fortunately, Leksah, while allegedly not quite as good as a working Vim or Emacs setup, is robust and generally works out of the box, and I eventually settled on it as my primary development environment. FPComplete’s online FP Haskell Center is also pretty sweet, especially when I was away from my Ubuntu laptop (my library choices mean that the project does not work on Windows; sorry).

Choosing tools

We need to parse JSON, so we’ll make the obvious choice and try to figure Aeson out as we go along.

Aeson!

I needed to convert JSON data into Haskell records. Luckily, Aeson tutorials are a dime-a-dozen, and some of them are even helpful. But let’s get that out of the way; did you know about Data.Aeson.TH? It’s pretty shweet. But first…

Aside: the Haskell records’ problem

To use Data.Aeson.TH we need to do something like this:

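{-# LANGUAGE TemplateHaskell #-}

import Data.Aeson.TH (defaultOptions, deriveJSON)
import Data.Text (Text)

-- A plain Haskell record; every field name becomes a top-level function.
data RecordDeal = RecordDeal {
    artiste :: Text,
    album :: Text,
    wackness :: Int
}

-- Generate the ToJSON and FromJSON instances at compile time.
$(deriveJSON defaultOptions ''RecordDeal)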

Now, imagine doing this for a REST API. There would be a dozen fields all called id and status. Haskell “records” are actually just normal data-types with a little syntactic sugar, and in particular the “field names” are actually functions in global scope. So we can’t have two records with the same field names, and we probably shouldn’t have a field called id as that would clash with the identity function. The ugly “solution”, for now, is to prefix the record field names with some unique prefix such as the name of the record. Data.Aeson.TH acknowledges this, and thus allows us to modify field names when generating Aeson instances (notice the drop 11).

data RecordDeal = RecordDeal {
    recordDeal'artiste :: Text,
    recordDeal'album :: Text,
    recordDeal'wackness :: Int
}

$(deriveJSON defaultOptions{fieldLabelModifier = drop 11} ''RecordDeal)
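The field-name rewriting is ordinary string manipulation, so it can be poked at without any Template Haskell. Here is a minimal sketch using only base. Note that stripRecordPrefix is a hypothetical helper of my own (it is not part of Aeson); it drops everything up to and including the first apostrophe, so nobody has to count the 11 characters of recordDeal' by hand.

```haskell
-- Hypothetical helper, not part of Aeson: drop the record-name prefix
-- (everything up to and including the first apostrophe) from a field name.
stripRecordPrefix :: String -> String
stripRecordPrefix = drop 1 . dropWhile (/= '\'')

-- The hand-counted version from the text: "recordDeal'" is 11 characters long.
dropEleven :: String -> String
dropEleven = drop 11

-- The prefixed field names of the RecordDeal record.
fieldNames :: [String]
fieldNames = ["recordDeal'artiste", "recordDeal'album", "recordDeal'wackness"]

-- The names as they should appear in the JSON.
strippedNames :: [String]
strippedNames = map stripRecordPrefix fieldNames
```

A field name without an apostrophe comes out empty, so a modifier like this only suits records that follow the prefix convention throughout.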

This works, but terribly inconveniences the end-user of our library. Luckily Nikita Volkov has heard the lamentations and gnashing of teeth of the Haskell community and decided to do something about this.

Volkov records

Volkov’s Record library solves “the record problem” (field names clashing) using some DataKinds trickery. Note that it doesn’t go all the way and provide extensible records &c, but it’s perfect for when we just want to use records sanely. So, we

  1. Send a request to BitX,
  2. Get a JSON response,
  3. Parse the JSON data into a temporary Haskell record, and then
  4. Return a Volkov record back to the user.

HTTP requests

Surprisingly, it’s not straightforward in Haskell to contact a REST endpoint hiding behind TLS. Simple http is pretty straightforward, but as soon as you throw SSL in there you are in trouble. Ultimately, I found three choices:

  • Wreq is the new cool kid on the block, but it felt a bit too big to me, requiring a tonne of transitive dependencies (looking at you, lens).
  • The BitX API reference presents examples using curl, so is it too much to ask for a simple curl-like library in Haskell? The curl package provides just that, a bit too literally, and I wasn’t satisfied with the fact that it is a simple set of bindings to the 3rd party libcurl.
  • Eventually I settled on HTTP-Conduit. Maybe that’s ironic since I decided that wreq had too much baggage, but when I hit upon this package I was frustrated and just wanted to start writing some code.

I think that choosing HTTP-Conduit was probably a good choice, though unfortunately it means that my library refuses to build on Windows for some reason (I haven’t bothered to figure out the exact cause, but somewhere someone wants to install the HTTP package, which seems impossible to build on Windows, and that ain’t my fault…)

Decimal

We will be working with financial data, and the BitX API reference has an ominous warning:

Prices and volumes are always represented as a decimal strings e.g. “123.3432”. We use strings instead of floats to preserve the precision.

One should not use floating point for financial code. I needed some way of dealing with financial numbers without using potentially-error-prone floating point, and eventually settled on the Decimal package, which provides an easy-to-use data-type with all the numerical instances one would expect (and a few you shouldn’t; see note #4 here). BitX’s workaround (putting quotes around numbers) resulted in a minor complication I will go into later on.
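To make the floating point worry concrete, here is a small sketch using only base. It is not the Decimal package's API: parseDecimalString is a hypothetical helper of my own that reads an unsigned decimal string into an exact Rational, just to show that a quoted number like “123.3432” can be consumed without ever passing through a Double.

```haskell
import Data.Char (isDigit)
import Data.Ratio ((%))

-- Binary floating point cannot represent 0.1 or 0.2 exactly,
-- so this comparison evaluates to False.
floatSumIsExact :: Bool
floatSumIsExact = (0.1 + 0.2 :: Double) == 0.3

-- Hypothetical helper: parse an unsigned decimal string such as "123.3432"
-- into an exact Rational. Anything else yields Nothing.
parseDecimalString :: String -> Maybe Rational
parseDecimalString s =
  case break (== '.') s of
    (whole, "") | valid whole ->
      Just (read whole % 1)
    (whole, '.' : frac) | valid whole, valid frac ->
      Just (read (whole ++ frac) % (10 ^ length frac))
    _ -> Nothing
  where
    -- A non-empty run of digits.
    valid digits = not (null digits) && all isDigit digits
```

With exact rationals, adding the parsed values of “0.1” and “0.2” gives precisely the parsed value of “0.3”, which is the property you want when the numbers are somebody's money.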

Writing some code

After a few days of omphaloskepsis, it was time to write some code. For illustration, let us consider a single BitX endpoint which doesn’t require credentials (I refer to this as the “public API”): the public ticker. We will send a GET request to https://api.mybitx.com/api/1/ticker, with a single GET parameter, pair, set to the currency pair we wish to check, and it will return data as follows

{
  "ask": "1050.00",
  "timestamp": 1366224386716,
  "bid": "924.00",
  "rolling_24_hour_volume": "12.52",
  "last_trade": "950.00",
  "pair": "XBTZAR"
}

Note the last line ("pair"), which was left out of the BitX documentation. Eventually, we wish to give the user a Volkov record as follows

[record|
    {ask = 1050,
     timestamp = {the UTCTime for 1366224386716 ms},
     bid = 924,
     rolling24HourVolume = 12.52,
     lastTrade = 950,
     pair = XBTZAR} |] :: Ticker

I’ll talk about how I dealt with timestamps and money values in a little bit. The function to provide this looks a little like this

getTicker :: CcyPair -> {some kind of return type}

But there are a few complications.

Complication 1

Each BitX API endpoint can either return some data, or return an error record. So, rather than just having Ticker as our return type, we need to do something like

Either BitXError Ticker

where

type BitXError =
    [record|
        {error :: Text,
         errorCode :: Text} |]

Complication 2

The real world is scary. The http call could fail, or BitX might barf and return bad data. It looks like I actually need something more akin to this

Maybe (Either BitXError Ticker)

At least, that’s what I initially chose. Returning Nothing, instead of giving a bit more info about the error, is a bit of a cop out, though, so I had to come up with something a bit better. So let’s enumerate all the possible outcomes of a call:

  1. It works perfectly, and we can parse and return the Volkov record we were expecting,
  2. BitX returns an error record, and we can parse and return the error Volkov record we use for everything,
  3. The networking code could throw an exception (such as if your internet connectivity goes down), in which case all we have really is the Exception text, or
  4. The network code works, but BitX returns some unparseable data (not the type we anticipated, nor a BitXError), in which case all we can do is just give the user the full HTTP response (headers, body, &c) which BitX gave us.

These possibilities suggest the following algebraic data type:

data BitXAPIResponse rec =
      ExceptionResponse Text
    | ErrorResponse BitXError
    | ValidResponse rec
    | UnparseableResponse (Response ByteString)

Note that Response ByteString is precisely what HTTP Conduit returns when we make a web call, and this type can be queried for HTTP response code, response body, and so forth.
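From the library user's side, the four constructors are handled with a plain pattern match. The sketch below uses simplified stand-ins so that it compiles with base alone: String replaces Text, BitXError and the raw HTTP response, and describeResponse and toMaybe are hypothetical helpers, not functions from the library.

```haskell
-- Simplified stand-ins for the types in the text (assumptions for this sketch):
-- String replaces Text, BitXError and the raw HTTP response.
data BitXAPIResponse rec
  = ExceptionResponse String
  | ErrorResponse String
  | ValidResponse rec
  | UnparseableResponse String
  deriving (Eq, Show)

-- Turn any outcome into a one-line message for the user.
describeResponse :: Show rec => BitXAPIResponse rec -> String
describeResponse resp = case resp of
  ExceptionResponse msg   -> "network failure: " ++ msg
  ErrorResponse err       -> "BitX reported an error: " ++ err
  ValidResponse payload   -> "ok: " ++ show payload
  UnparseableResponse raw -> "could not parse: " ++ raw

-- Keep only the successful payload, discarding the reason for any failure.
toMaybe :: BitXAPIResponse rec -> Maybe rec
toMaybe (ValidResponse payload) = Just payload
toMaybe _                       = Nothing
```

The second helper is exactly the information-losing Maybe I started with, which is why the richer type is what gets returned and the caller decides how much detail to throw away.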

Complication 3

This is more common sense, really. Since we are changing the real world, and the answers we get are not pure, we need to wrap the whole thing in the IO monad.

IO (BitXAPIResponse rec) 

This is true in general of any properly-written library with an IO component. As an interesting exception, consider the OEIS library: it makes sense in this case to unwrap the IO actions with the notorious unsafePerformIO, since although the library needs to access a web service to fetch an integer sequence, the responses are predictable and “pure” as each sequence is uniquely identified.

Complication 4

This one was not straightforward to fix. BitX has interesting ideas regarding the format of timestamps and financial numbers. As noted above, they put numbers in quotation marks to “preserve the precision.” However, the FromJSON instances created by Aeson (rightfully) refuse to see numbers in quotes as numbers, even though this “wouldn’t” be a problem in Javascript. Moreover, they do not follow their own convention all the time, as evidenced by the examples for the two transactions endpoints. The timestamp format is milliseconds since the Unix epoch, so we can’t just rely on Aeson to automagically create FromJSON instances for Haskell records which have to parse BitX timestamps.

These two issues had me scratching my poor n00b head for a long time. Should I abandon $(deriveJSON) and just write a few dozen FromJSON instances by hand? Should I risk the ire of library users by creating orphan instances in the form of instance FromJSON Decimal? I knew there must be a better way to solve what seemed like a non-novel problem, so at one of our Lambda Luminaries meetups I asked around, and fellow member Theunis told me about what turned out to be a rather well-known trick: creating a newtype with a FromJSON instance, and a function to translate from the newtype to a Decimal. In Haskell, you can have your cake and eat it, too. The situation is similar for the timestamps, though in this case I also had to figure out how to work with time in Haskell. This blog post by Vincent Hanquez helped a lot.

-- | Wrapper around Decimal and FromJSON instance, to facilitate automatic JSON instances

newtype QuotedDecimal = QuotedDecimal Decimal deriving (Read, Show)

instance FromJSON QuotedDecimal where
  parseJSON (String x) = return . QuotedDecimal . read . Txt.unpack $ x
  parseJSON (Number x) = return . QuotedDecimal . read . show $ x
  parseJSON _ = mempty

qdToDecimal :: QuotedDecimal -> Decimal
qdToDecimal (QuotedDecimal dec) = dec

-- | Wrapper around a millisecond Unix timestamp, with a FromJSON instance, to facilitate automatic JSON instances

newtype TimestampMS = TimestampMS Integer deriving (Read, Show)

instance FromJSON TimestampMS where
  parseJSON (Number x) = return . TimestampMS . round $ x
  parseJSON _ = mempty

tsmsToUTCTime :: TimestampMS -> UTCTime
tsmsToUTCTime (TimestampMS ms) = timestampParse_ ms

timestampParse_ :: Integer -> UTCTime
timestampParse_ = posixSecondsToUTCTime
    . fromRational . toRational
    . ( / 1000)
    . (fromIntegral :: Integer -> Decimal)
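As an aside on the timestamp half of that snippet: the millisecond conversion can also be written without the detour through Decimal. This is a minimal sketch, assuming only the time package that ships with GHC; msToUTCTime, exampleTime and expectedTime are names I made up for illustration, and the timestamp is the one from the ticker example earlier.

```haskell
import Data.Time (UTCTime (..), fromGregorian)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)

-- Convert milliseconds since the Unix epoch to a UTCTime.
-- POSIXTime is a fixed-point number of seconds, so dividing by 1000 is exact.
msToUTCTime :: Integer -> UTCTime
msToUTCTime ms = posixSecondsToUTCTime (fromInteger ms / 1000)

-- The timestamp from the ticker example.
exampleTime :: UTCTime
exampleTime = msToUTCTime 1366224386716

-- The same instant written out by hand: 17 April 2013, 18:46:26.716 UTC,
-- given as a day plus the number of seconds since midnight.
expectedTime :: UTCTime
expectedTime = UTCTime (fromGregorian 2013 4 17) 67586.716
```

Comparing exampleTime with expectedTime is a cheap sanity check that the milliseconds are being divided rather than, say, treated as seconds.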

Notice how in the above code snippet I am converting a JSON String and a Number to a QuotedDecimal (because I could get either one from the BitX API), and how natural the parsing code is. I’m also importing Data.Text qualified as Txt. Then per record type, we do something like this:


type Ticker =
 [record|
  {ask :: Decimal,
   timestamp :: UTCTime,
   bid :: Decimal,
   rolling24HourVolume :: Decimal,
   lastTrade :: Decimal,
   pair :: CcyPair} |]

data Ticker_ = Ticker_
  { ticker'timestamp :: TimestampMS
  , ticker'bid :: QuotedDecimal
  , ticker'ask :: QuotedDecimal
  , ticker'last_trade :: QuotedDecimal
  , ticker'rolling_24_hour_volume :: QuotedDecimal
  , ticker'pair :: CcyPair
  }

$(AesTH.deriveFromJSON AesTH.defaultOptions{AesTH.fieldLabelModifier = last . splitOn "'"}
 ''Ticker_)

instance BitXAesRecordConvert Ticker Ticker_ where
  aesToRec (Ticker_ ticker''timestamp ticker''bid ticker''ask ticker''lastTrade
            ticker''rolling24HourVolume ticker''pair) =
    [record| {timestamp = tsmsToUTCTime ticker''timestamp,
      bid = qdToDecimal ticker''bid,
      ask = qdToDecimal ticker''ask,
      lastTrade = qdToDecimal ticker''lastTrade,
      rolling24HourVolume = qdToDecimal ticker''rolling24HourVolume,
      pair = ticker''pair} |]

Wait a second! What the hell is that instance BitXAesRecordConvert at the end of the snippet??!! Well, I’m glad you asked.

Type-level acrobatics

It’s probably worth pointing out that at this point I am already using a small list of GHC extensions: QuasiQuotes, TemplateHaskell, and DataKinds (this last one is only needed in GHC 7.10 and upwards because of the code generated by Record’s Template Haskell), all because of the Record package. We are going to double that number in just a bit. As mentioned before, and as should be obvious from thinking about the problem I was trying to solve (interfacing with a tonne of REST endpoints which each return their own special JSON data), I need to write code similar to what I did with Ticker and Ticker_ a lot (incidentally, I wish I knew enough Template Haskell to whittle all that boilerplate down to something better, a la the ridiculous source code for Record.Types). Although I could have invented a Haskell-record-to-Volkov-Record conversion function for every record type, this would have sacrificed reusability (see also the network code below). I needed some way to say

  • If I give you an A_, it can be converted to an A, and
  • It should happen automatically, and the conversion function should be polymorphic.

I viewed this as sort of a mapping between types, where rather than saying that 2 maps to 4, I can say that A_ maps to A, and define this mapping for each element of the domain. You know what? I think I might have seen something along those lines while half-heartedly reading tutorials on GHC extensions… Well, as it turns out, the answer to “How do I establish a relationship between types” is FunctionalDependencies, and MultiParamTypeClasses (since you need an extension to have a typeclass with more than one parameter in Haskell), and apparently also FlexibleInstances, according to the GHC error messages…

class (FromJSON aes) => BitXAesRecordConvert rec aes | rec -> aes where
    aesToRec :: aes -> rec

Just writing a typeclass without FunctionalDependencies “could” have worked, except that the extension gives us special super powers. In particular, the core API code does something like this (this is very crappy pseudocode, not Haskell):

doIt link = httpCall link --> Aeson_decode --> aesToRec --> return_result

getTicker ccy = doIt (https://api.mybitx.com/api/1/ticker?pair=ccy) :: Ticker 

In this case, although doing this without FunctionalDependencies would have “worked” (sort of), the compiler complains that “Yes, I see you want a Ticker as a result, but I have no idea what type you are expecting as the input of aesToRec, and so I can’t type-infer Aeson_decode.” By using FunctionalDependencies and stating “| rec -> aes”, I am saying “If I tell you I want a Ticker, and I have told you that BitXAesRecordConvert Ticker Ticker_, then that means that the input to aesToRec has to be a Ticker_.” That is, each result type determines exactly one input type. So doIt can be a fully polymorphic function, where I only have to worry about giving it the final Volkov record type I want, and it will figure out exactly what type of JSON conversion is needed. FlexibleInstances seems like it’s needed because some of the instances look more like

instance BitXAesRecordConvert [Balance] Balances_ where

where the BitX call returns a simple array wrapped in an object with one field (which Aeson parses to Balances_), and I would rather return a list of objects than a thin wrapper over a list. In this instance, we “can’t” use a specific list type in an instance declaration without enabling the FlexibleInstances extension.
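The functional dependency trick can be seen in miniature without Aeson or Record. This is a sketch under loud assumptions: Parseable is a stand-in I invented for FromJSON, Convert plays the role of BitXAesRecordConvert, and RawPrice and Price are toy types. The point is only that decodeAs never mentions the intermediate type after the "=>", and GHC still works it out from the result type.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Char (isDigit)

-- Stand-in for FromJSON (an assumption for this sketch): parse a raw String.
class Parseable raw where
  parseRaw :: String -> Maybe raw

-- "rec -> raw" says: once the result type is known, the raw type is fixed.
class Parseable raw => Convert rec raw | rec -> raw where
  fromRaw :: raw -> rec

-- The intermediate type that mirrors the wire format.
newtype RawPrice = RawPrice String

-- The type handed to the user.
newtype Price = Price Integer deriving (Eq, Show)

instance Parseable RawPrice where
  parseRaw s
    | not (null s) && all isDigit s = Just (RawPrice s)
    | otherwise                     = Nothing

instance Convert Price RawPrice where
  fromRaw (RawPrice s) = Price (read s)

-- Polymorphic in the result only. "raw" appears in the constraint but not
-- in the rest of the type, so the functional dependency is what lets GHC
-- pick it.
decodeAs :: Convert rec raw => String -> Maybe rec
decodeAs = fmap fromRaw . parseRaw
```

Asking for decodeAs "95000" :: Maybe Price is enough for GHC to select RawPrice as the intermediate type, which is the same job the dependency does for Ticker and Ticker_.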

Putting it all together

For “completeness'” sake (there is actually a whole lot I am leaving out; see below), here is the code used for connecting to the public API through GET:

bitXAPIRoot :: String
bitXAPIRoot = "https://api.mybitx.com/api/1/"

consumeResponseBody_ :: BitXAesRecordConvert rec aes => Either SomeException (NetCon.Response BL.ByteString)
    -> IO (BitXAPIResponse rec)
consumeResponseBody_ resp =
    case resp of
        Left ex -> return $ ExceptionResponse . Txt.pack . show $ ex
        Right k -> bitXErrorOrPayload k

bitXErrorOrPayload :: BitXAesRecordConvert rec aes => Response BL.ByteString -> IO (BitXAPIResponse rec)
bitXErrorOrPayload resp = do
    let respTE = Aeson.decode body -- is it a BitX error?
    case respTE of
        Just e  -> return . ErrorResponse . aesToRec $ e
        Nothing -> do
            let respTT = Aeson.decode body
            case respTT of
                Just t  -> return . ValidResponse . aesToRec $ t
                Nothing -> return . UnparseableResponse $ resp
    where
        body = NetCon.responseBody resp

simpleBitXGetAuth_ :: BitXAesRecordConvert rec aes => BitXAuth -> String -> IO (BitXAPIResponse rec)
simpleBitXGetAuth_ auth verb = withSocketsDo $ do
    response <- ...  -- [the rest of this function is missing from the page]

getBalances :: BitXAuth -> IO (BitXAPIResponse [Balance])
getBalances auth = simpleBitXGetAuth_ auth "balance"

The last two lines are in the private API since they require a BitXAuth Volkov record (which simply has a key ID and secret). NetCon is Network.HTTP.Conduit, and the body of simpleBitXGetAuth_ is a typical use of it to make a GET call with basic authentication (exactly the same as curl -u user:pass). The call is wrapped in try, which, coupled with the Either SomeException argument of consumeResponseBody_, means that we will catch all possible exceptions thrown by the connection code. consumeResponseBody_ then checks to make sure there was no exception, and passes the data on to bitXErrorOrPayload, which tries to parse a BitXError Volkov record, and failing that tries to parse the Volkov record type we are expecting, and failing that returns an UnparseableResponse (because BitX returned data in a funny format which was neither an error nor the data we expected). The FunctionalDependencies extension gives GHC enough clues so that it can type-infer this whole mess without a hitch. And that’s almost the entirety of the magic in the actual business logic.
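The try-plus-SomeException pattern is worth seeing on its own, away from the networking. Below is a minimal sketch using only base; safely is a hypothetical helper, not a function from the library, and it keeps just the exception text, in the same spirit as ExceptionResponse.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Run an action and turn any exception into a Left carrying its message.
-- Pinning the exception type to SomeException (via `describe`) is what
-- makes `try` catch everything rather than one specific exception type.
safely :: IO a -> IO (Either String a)
safely action = fmap (either (Left . describe) Right) (try action)
  where
    describe :: SomeException -> String
    describe = show

-- An action that throws when run: integer division by zero.
explodes :: IO Int
explodes = evaluate (1 `div` 0)

-- An action that succeeds.
behaves :: IO Int
behaves = return 5
```

Running safely explodes gives a Left with the exception text, while safely behaves gives Right 5. Catching every exception like this is a blunt instrument, so it is probably best kept at the outer edge of a library, which is where the real code uses it.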

Extra thoughts

There were a few more problems I had to solve, which I did not go into above, but which I still wish to note.

  1. The common saying that “if it compiles then it probably works” is obviously not true, though in Haskell it gets very close to being true. In particular, with this project I was parsing blobs of text from the Real World, and any form of misinterpretation would result in bugs in my code. The library was already compiling perfectly well, before I discovered the issue with quoted numbers, for example.
  2. Related to the above, I had to figure out unit testing. I chose HSpec, though I now wish I had started with Tasty instead.
  3. In the past, when I was working through Project Euler, I would install every package I needed (and a whole lot more I didn’t need) globally, and after a while I learned about Cabal Hell the hard way. It’s The Future™ now, and we should all be using minimal GHC installs and Cabal sandboxes. Personally I do not use Stackage yet since I just need a little sanity — I do not need enterprise-level stability.
  4. Making this library Hackage/Github-friendly meant a lot of work on stuff that wasn’t source code. For example, setting up Travis-ci and getting it to do parallel multi-GHC installs, as well as writing a bunch of Haddock documentation. For the love of J. F. Christ please include Haddocks in your library code, including usage examples. Remember how it used to feel when you couldn’t use seemingly amazing Haskell libraries because the authors thought that simply providing type declarations should be enough for anyone to figure everything out? Yeah, newbs like me still feel like that.

Conclusion

I dig Haskell like, a lot. I learned a bunch while writing this library which now seems almost trivial in retrospect. Here’s hoping that my thoughts will help someone else so they won’t have to discover some of this stuff themselves.
