Watch App Development Blog – Week 2

In Week 1, I got a (very) basic Haskell REST web service running that scraped the Transperth site for live train times. Now we’re up to:

Step 2: Build & Deployment

Like most developers working on side-projects, I don’t want to pay a bundle for hosting a service during development when it really doesn’t need many resources. However, when the product goes live and inevitably becomes a raging success, I need to be able to scale capacity quickly & easily. In the past I’ve used freemium PaaS providers like Heroku and AppHarbor, which are designed for exactly this scenario.

I started down the Heroku path using Joe Nelson’s buildpack, but immediately hit Heroku’s 15 minute build timeout. There are a variety of ways around this, but it got me thinking (as I’ve pondered in the past about AppHarbor) about why I need to build on my hosting provider at all. Heroku was originally designed for deploying apps written in Ruby that didn’t need compilation; pushing source & compiling on the server seems like a hack to me.

Docker is the new hotness in packaging and application deployment, and is better suited to building a compiled web application locally and deploying to a cloud host. I thought I’d give this a go.

Docker Development on OS X

The Docker host relies on specific features of the Linux kernel, which means that working with containers locally on OS X or Windows requires running them inside a Docker host in a Linux VM. This starts to get a bit onioney. My initial inclination was to do docker development using Vagrant – the same method I use for working on other web systems targeting a Linux host. After spending considerable time trying out different methods of running Docker through Vagrant, I ended up coming to the conclusion that it wasn’t worth the hassle for a simple deployment like this one. Instead, my model would be:

  1. While I’m developing locally, just run the service directly on OS X without using Docker.
  2. When I’m ready to deploy, spin up boot2docker and build the container.
  3. Commit & push the image to a remote docker repo.
  4. Deploy the image to the cloud host from the repo.

I strongly recommend getting started with Docker using Chris Jones’ “Missing Guide”. I installed using the downloadable installer rather than homebrew, but the only real config change I needed to make was to give the boot2docker VM more RAM – GHC struggles a bit unless it has plenty. Run the command boot2docker config > ~/.boot2docker/profile, then edit the ~/.boot2docker/profile file and change the ‘Memory’ setting (I gave it 4096). I didn’t configure any port-forwarding as I’m only using docker to build the image.

Building a Haskell Docker image

Dockerhub has an official Haskell image, which is a good starting point for development. I implemented a Dockerfile starting from the example at the end of the README. I needed to add an extra step to cater for my gps-1.2 requirement, which is still not available on Hackage at time of writing.

FROM haskell:7.8

RUN cabal update

# Add .cabal file
ADD ./perthtransport.cabal /opt/app/perthtransport.cabal

# Install gps-1.2 from source
ADD gps /opt/app/gps
RUN cd /opt/app/gps && cabal install

# Docker will cache this command as a layer, freeing us up to
# modify source code without re-installing dependencies
RUN cd /opt/app && cabal install --only-dependencies -j4

# Add and Install Application Code
ADD . /opt/app
RUN cd /opt/app && cabal install

# Add installed cabal executables to PATH
ENV PATH /root/.cabal/bin:$PATH

EXPOSE 3000

# Default Command for Container
WORKDIR /opt/app
CMD ["perthtransport"]

I also needed to create a .dockerignore to ensure the cabal sandbox was excluded from the context. Once this was done, my build process consisted of running:

boot2docker up
docker build -t <repo:tag> .
docker push <repo:tag>
boot2docker down

Container Hosting in the cloud

Unfortunately the container hosting landscape seems a bit immature at present – I’d love to have a Heroku-like service that lets me deploy scalable containers as simply as using a docker push. Also, while docker is standardised at the container level, most providers (ECS, Digital Ocean etc) seem to be inventing their own clustering layers on top. Maybe swarm will fix that – let’s wait and see.

I ended up going with Tutum – they have a good-looking, self-explanatory web interface, a web service API, and a CLI tool (brew install tutum). They don’t do the hosting themselves though – you need to register your own cloud host account (AWS, Azure, Digital Ocean) with them & they manage the nodes for you. They do give you a private repository, plus the service is ‘free forever’ if you sign up as a developer now. I’m using an AWS t2.micro instance under the free usage tier as the only node at present.

I set up the initial service definition via the web UI. To redeploy the latest image from the repo, I just need to run tutum service redeploy <imageid>.

Scripting the deployment

I used rake as a build scripting tool, for no other reason than that’s what I normally use for Xcode builds. The process is simple enough that you could probably just use a bash script though.

task :run do
  sh "cabal install --only-dependencies"
  sh "cabal build"
  sh "dist/build/perthtransport/perthtransport"
end

task :deploy do
  version = File.read("perthtransport.cabal").match(/^version:\s*([^\s]*)$/)[1]
  puts "Building version #{version}"
  begin
    sh "boot2docker up"
    sh "docker build -t #{DOCKER_REPO}:#{version} ."
    sh "docker push #{DOCKER_REPO}:#{version}"
    sh "tutum service redeploy #{TUTUM_SERVICE_ID}"
    sh "git tag -a #{version} -m 'Build #{version}' && git push origin tag #{version}"
  ensure
    sh "boot2docker down"
  end
end
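Since the service itself is Haskell, the version lookup in the deploy task could equally be done there. The following is only a sketch of that one step, not part of my Rakefile: cabalVersion is a hypothetical helper name, and it matches the field name case-insensitively and takes the first token after it, which is slightly looser than the regex above.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)

-- | Return the value of the first "version:" field found in the text of a
-- .cabal file, or Nothing if there is no such line.
-- (Hypothetical helper, shown for illustration only.)
cabalVersion :: String -> Maybe String
cabalVersion contents =
  listToMaybe
    [ firstToken (drop (length field) line)
    | line <- lines contents
    , field `isPrefixOf` map toLower line  -- "cabal-version:" does not match
    ]
  where
    field      = "version:"
    -- skip the padding after the colon, then stop at the next whitespace
    firstToken = takeWhile (not . isSpace) . dropWhile isSpace
```

Feeding it the result of readFile "perthtransport.cabal" gives a Maybe String to build the image tag from, so a missing version field can be reported rather than crashing on a failed match.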

So now I can build & run locally with a rake run and deploy to an AWS node with rake deploy. Next week we’ll start on the actual watch app functionality. In the interim, the source code is available on bitbucket.

Watch App Development Blog – Week 1

Okay, I’m trying something new – a weekly blog post talking about my progress developing an app. The theory is that the thought of my massive readership expectantly waiting for the next update will give me enough of an incentive to get something finished. Everyone practise their sad face to make me feel guilty if I don’t post an update.

I’m keen to get out an Apple Watch app. To be honest, I don’t think they’ll make much money, but most apps I build are for other people; it would be nice to put out something good, so I can say “I did that!”

The Concept

Over the last few months, I’ve tried to be aware of instances where I need some information off my phone, but pulling it out & launching an app seems like too much of a hassle. One scenario I noticed was when I was heading to the train station – I used to know the departure times off by heart, but now I’m not sure whether to run or dawdle. It would be great if there was an app on my watch that gave me live departure times for my nearest station – challenge accepted!

Step 1: The API

Transperth don’t have a public API, although there’s an unofficial third-party one that scrapes the website. Unfortunately some parts were broken with a recent site update, and the developer now lives in Melbourne. I also have a few ideas for some custom API behaviour, so I made a probably ill-advised decision to build my own scraping API.

I really wanted to try out a web project in F#, but I didn’t want to develop on Windows and I ended up running into significant problems with Xamarin – broken project templates, unimplemented parts of the aspnetwebstack, etc. ASP.NET vNext looks promising, but I had issues with it also.

So I thought I’d give Haskell another go – I’ve tried this in the past, but I’m much gooder at Haskell now. The state of web frameworks in Haskell has also improved significantly since 2010. I went with Scotty – I like the simplicity of the Sinatra/NancyFx model, and there’s a great walk-through by Aditya Bhargava.

Haskell Web Development on OS X

If you’re playing along at home, you’ll need to follow these steps to run the API:

  1. Install the Haskell Platform
  2. Run cabal sandbox init in your project directory. Cabal sandbox installs dependencies in a project scope, similar to Bundler in Ruby.
  3. Create a cabal file specifying your dependencies. This process I found a little odd – effectively you’re specifying your executable as a library, but it allows you to leverage cabal dependency resolution. Use Adit’s cabal file as a base.
  4. Create a Main.hs and add your Scotty routes (check out the examples).
  5. Run cabal install && .cabal-sandbox/bin/<executable name>

After an extended compile time, you should now have a web server running on localhost:<port>.

JSON Response Types

Returning JSON can be done by defining record types that implement the ToJSON type class (from Aeson):

{-# LANGUAGE DeriveGeneric #-}
module Types where

import Data.Aeson
import GHC.Generics

data Departure = Departure { time :: String, destination :: String, pattern :: String, status :: String } deriving (Generic, Show)
instance ToJSON Departure

HTML Parsing

Parsing the DNN-generated web page is done using tagsoup. This differs from most other HTML parsing libraries I’ve used in that it doesn’t define a query API or CSS-like selector syntax over a DOM, it just converts the HTML into a flat list of nodes that can be manipulated using regular list functions.

My scraping function, which is probably not brilliant Haskell, looks like the following (excluding some helpers):

getTrainTimes :: String -> IO [Departure]
getTrainTimes x = do tags <- fmap parseTags $  openURL $ "http://www.transperth.wa.gov.au/Timetables/Live-Train-Times?stationname=" ++ (urlEncode x)
                     let table = head $ tables tags -- first table in page
                     let rowArray = reverse . tail . reverse . tail $ rows table -- strip first & last rows
                     let times = map (f . cells) rowArray -- convert each row into a Departure
                     return times
                  where f cs = Departure (textFromCell $ cellAtColumn 0 cs) (destFromCell $ cellAtColumn 1 cs) (patternFromCell $ cellAtColumn 2 cs) (textFromCell $ cellAtColumn 3 cs)
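To give a feel for the "flat list of nodes" style, here is a rough sketch of how helpers like tables, rows and cells might be written with plain list functions. These are not the helpers from my project. The Tag type is a cut-down stand-in so the example compiles without tagsoup installed (the real TagOpen also carries an attribute list), and elementsNamed, cellText and bodyRows are hypothetical names. The splitting is deliberately naive and does not handle nested tables.

```haskell
-- Stand-in for tagsoup's Tag type, reduced to three constructors.
data Tag
  = TagOpen String
  | TagClose String
  | TagText String
  deriving (Eq, Show)

-- | Every run of tags sitting between an opening and a closing tag of the
-- given name. Not nesting-aware: it stops at the first matching close tag.
elementsNamed :: String -> [Tag] -> [[Tag]]
elementsNamed name tags =
  case dropWhile (/= TagOpen name) tags of
    []         -> []
    (_ : rest) ->
      let (body, after) = break (== TagClose name) rest
      in  body : elementsNamed name (drop 1 after)

tables, rows, cells :: [Tag] -> [[Tag]]
tables = elementsNamed "table"
rows   = elementsNamed "tr"
cells  = elementsNamed "td"

-- | All text inside a cell, with surrounding and repeated whitespace collapsed.
cellText :: [Tag] -> String
cellText tags = unwords (words (concat [t | TagText t <- tags]))

-- | Drop the first and last rows of a table, then read the text of each cell.
bodyRows :: [Tag] -> [[String]]
bodyRows table = map (map cellText . cells) (dropEnds (rows table))
  where
    -- total version of "strip first & last": safe on short or empty lists
    dropEnds xs = drop 1 (take (length xs - 1) xs)
```

Because everything is an ordinary list, the usual toolbox (dropWhile, break, map, list comprehensions) does all the work, at the cost of having to think about nesting yourself.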

‘Stations Near Me’

In addition to querying for live times, I also have a flat text file of station names & locations I lifted from Darcy’s project. This is used to respond to the ‘all stations’ API call, and also supports a geospatial query endpoint using the gps package. Initially I started getting build failures with this dependency – it was trying to compile GPX file support, which I don’t need. The latest version (1.2) of the gps code has removed this dependency, but it’s not on Hackage yet.

Happily, this is solvable:

  1. Pin the required version of the package in your cabal file: gps >=1.2
  2. Put the source code in your project directory (e.g. git submodule add git@github.com:TomMD/gps.git)
  3. Specify the new source directory with cabal sandbox add-source gps

Using the Geo.Computations module, it’s then fairly straightforward to filter the list of stations based on distance to a given point.
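For anyone who wants to see the shape of the distance filter without pulling in the gps package, here is a self-contained sketch. It does not use Geo.Computations at all: it swaps in a hand-rolled haversine distance on a spherical Earth. The Station record, its field names and stationsWithin are all assumptions made for the example, not the types from my project.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Hypothetical station record; the real project's type may differ.
data Station = Station
  { stationName :: String
  , latitude    :: Double  -- degrees
  , longitude   :: Double  -- degrees
  } deriving (Eq, Show)

-- | Great-circle distance in metres between two (latitude, longitude)
-- points given in degrees, using the haversine formula.
haversine :: (Double, Double) -> (Double, Double) -> Double
haversine (lat1, lon1) (lat2, lon2) =
  2 * earthRadius * asin (min 1 (sqrt a))  -- min 1 guards against rounding
  where
    earthRadius = 6371000                  -- mean Earth radius in metres
    rad d = d * pi / 180
    dLat  = rad (lat2 - lat1)
    dLon  = rad (lon2 - lon1)
    a     = sin (dLat / 2) ^ 2
          + cos (rad lat1) * cos (rad lat2) * sin (dLon / 2) ^ 2

-- | Stations within maxMetres of the given point, nearest first.
stationsWithin :: Double -> (Double, Double) -> [Station] -> [Station]
stationsWithin maxMetres here stations =
  map snd . sortBy (comparing fst) $
    [ (d, s)
    | s <- stations
    , let d = haversine here (latitude s, longitude s)
    , d <= maxMetres
    ]
```

A call such as stationsWithin 2000 (lat, long) allStations would return everything within two kilometres. A linear scan is fine for a list the size of a suburban rail network; a spatial index would only matter with far more points.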

Bringing it Together

Once the stations, live times, and geospatial filtering were done, it was just a case of defining the appropriate route functions in Scotty:

  scotty port $ do
    get "/train/" $ do
      list <- liftIO stations
      json list

    get "/train/near" $ do
      y <- param "lat"
      x <- param "long"
      list <- liftIO $ stationsNear y x
      json list

    get "/train/:station" $ do
      stationId <- param "station"
      station <- liftIO $ station stationId
      case station of
        Just s -> do
          times <- liftIO $ getTrainTimes $ name s
          json times
        Nothing ->
          Web.Scotty.status status404

I haven’t touched cache control or more advanced error handling, and it would be nice to fall back on timetables if live times aren’t available, but I now have enough of an API running to support the basic functions of the watch app. One of the things I liked about doing it in Haskell was that once it compiled, it generally worked. It’s a pretty nice feeling.

I’ve put the code up on bitbucket – feel free to have a look through it and send some feedback if you can’t stand my beginner Haskell.

Next Steps

Next is hosting – building and deploying my dinky API somewhere I can reach it. Tune in next week for another thrilling instalment!

Building a REST application server in Haskell Part 1

As mentioned previously, I’m boning up on Haskell. I prefer to learn by doing, so I’m attempting to build a functioning application as part of the process. I’ve struggled to find much entry level (real application) source code, so I’ll be posting my progress as I go in the hope that it will help others undertaking the same journey. Note the emphasis is on ‘entry level’ — I’m not pretending to be a Haskell guru; if you are one, pointing out what I’m doing wrong would be much appreciated.

This application is a hypothetical REST-based continuous integration status aggregator — CI servers publish build results to the server, then mobile clients view the results. It seemed a little bit more interesting than another blog app tutorial. I’ll be using the popular happstack web application framework, serving JSON over its built-in web server.

First, the setup. Assuming you have a GHC distribution and cabal installed on your computer (on OS X, the Haskell platform binary installer includes these), you’ll need to install happstack and JSON:

cabal update
cabal install happstack
cabal install json

If you get happstack build errors, see my previous post for a possible resolution.

First step is returning some JSON from the server. The following code is a minimal implementation that serialises some hard-coded test data and returns it in response to a web request:

{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Happstack.Server
import Text.JSON.Generic

data Project =
	Project {
		name :: String,
		lastStatus :: String,
		builds :: [Build]
	}
	deriving (Data, Typeable)

data Build =
	Build {
		buildName :: String,
		status :: String
	}
	deriving (Data, Typeable)

testProjectJSON = encodeJSON (Project "New Project" "Succeeded" [Build "2010-10-02 build 2" "Succeeded", Build "2010-10-01 build 1" "Failed"])

main = simpleHTTP nullConf $ ok $ toResponse testProjectJSON

I’m using Text.JSON.Generic and the DeriveDataTypeable language extension to automatically generate mappings from my custom algebraic data types (Project & Build) to JSON. Using the standard Text.JSON library would require implementing showJSON and readJSON functions manually, eg something like:

instance JSON Project where
	showJSON (Project name lastStatus builds) = makeObj [("name", showJSON name), ("lastStatus", showJSON lastStatus), ("builds", showJSONs builds)]

My gut feel, although this is probably the OO developer in me talking, is that automatic generation would be better suited to simple serialisation problems, and explicit serialisation more useful in instances where C#/Java devs would think ‘DTO’ (eg producing a flattened, version-tolerant data contract).
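To make the ‘DTO’ idea a bit more concrete, here is a sketch of what a flattened contract might look like before any JSON is involved. ProjectSummary, its fields and toSummary are hypothetical names invented for this example, and the "first build in the list is the latest" rule is an assumption based on the ordering of my test data. The Project and Build records are repeated (without the deriving clauses) so the snippet stands alone.

```haskell
import Data.Maybe (listToMaybe)

-- Domain types, mirroring the records used in the server code.
data Build = Build { buildName :: String, status :: String }
data Project = Project { name :: String, lastStatus :: String, builds :: [Build] }

-- Hypothetical flattened contract: no nested build list, just the
-- fields a client might need for a summary screen.
data ProjectSummary = ProjectSummary
  { summaryName   :: String
  , summaryStatus :: String
  , buildCount    :: Int
  , latestBuild   :: Maybe String  -- Nothing when there are no builds yet
  } deriving (Eq, Show)

-- | Map the domain type onto the contract explicitly, so the domain
-- records can change shape without changing what goes over the wire.
toSummary :: Project -> ProjectSummary
toSummary p = ProjectSummary
  { summaryName   = name p
  , summaryStatus = lastStatus p
  , buildCount    = length (builds p)
  , latestBuild   = listToMaybe (map buildName (builds p))  -- assumes newest first
  }
```

The summary type could then be serialised either way; the point is that the mapping lives in one explicit function rather than being implied by the shape of the domain records.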

Running the application should result in the following highly enlightening data being served from http://localhost:8000/:

{"name":"New Project","lastStatus":"Succeeded","builds":[{"buildName":"2010-10-02 build 2","status":"Succeeded"},{"buildName":"2010-10-01 build 1","status":"Failed"}]}

happstack-util build failure on OS X

Attempting to `cabal install happstack` on OS X resulted in numerous errors, including:

ghc: could not execute: /Library/Frameworks/GHC.framework/Versions/612/usr/lib/ghc-6.12.3/ghc-asm

There’s a bug with the Haskell platform installer for OS X — the shebang line in the file /Library/Frameworks/GHC.framework/Versions/612/usr/lib/ghc-6.12.3/ghc-asm is incorrect (it’s pointing at the MacPorts location for perl, rather than system perl).

Open the file in your text editor of choice and change the first line from

#!/opt/local/bin/perl

to

#!/usr/bin/perl

QOTW: “Row Farmers & Stateful Sinners”

The second YOW talk Dave Thomas delivered back on the 30th September was entitled “Why Real Developers Embrace Functional Programming and NoSQL: Data Confessions of an Object’holic and Stateful Sinner”. Needless to say, it contained a fair bit of controversial content that Dave was fairly unapologetic about, including the bold pronouncement that ‘C# and Java will be legacy platforms in 5 years’.

The gist of the argument was:

  • Objects are not terribly good abstractions for most real-world problems.
  • Good object-oriented design is hard, and Morts are still the bread & butter coders out there producing software.
  • Objects are implemented inefficiently in almost all runtime environments — “A good JIT can generate fast code, but it will generate a lot of it”. They also don’t translate well to parallel execution environments.
  • Serialisation/storage is still a problem — sending objects over the wire, or persisting to disk requires complicated, framework-heavy mapping.
  • KLOCs kill!

Dave’s proposed solution is a wholesale movement towards Functional Programming. This has been contemplated before (ie for the last forty years), but I think there’s more appetite in mainstream development communities today. Functional languages are available for both the Java & .NET runtimes (and have been for some time), but more importantly we’re seeing some FP paradigms pop up in imperative languages — eg Ruby, Python and even Linq in C#. I suspect the future will be much more heavily skewed towards multi-paradigm languages than Dave would prefer, but I can certainly see it happening.

So, count me in as a convert. I’m foolishly attempting to teach myself Haskell by following the excellent wikibook and building toy web apps with Happstack — expect to see some more posts on this subject soon as I muddle my way through.