In my previous post (Akka Clustering and Remoting: The Experiment), I defined a ping-pong application and deployed it into a local JVM cluster. In this post, I want to examine how we can scale this application into a vendor's cloud (e.g. Amazon or Rackspace).
However, when pushing into the cloud, it is wise to avoid reliance upon a single cloud vendor, so I'll also look at how that may be accomplished.
Nowadays, as organisations look to make their applications resilient, there are a number of competing technologies that can be used to deploy cloud compute nodes in a vendor-agnostic manner.
I have listed a few such technologies below:
The provisioning software space offers a slightly more restricted range of choices:
As our provisioning requirements are relatively simple:
- deploy JVM-targeted applications
- focus on application-based deployments (as opposed to provisioning complex OS environments)

I have chosen to use JClouds for deployment and Chef for provisioning.
Credential information will also need to be set up in resources/jclouds.conf (see resources/jclouds.conf.example for a template configuration file).
Application Modifications for Scaling
As one might hope, the actual modifications we need to make to our existing application code are quite minimal. The biggest change is related to how we provision worker nodes:
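As a rough sketch of the shape of that change (the trait and class names here are illustrative, not the actual code in the repository), worker provisioning can sit behind a small interface, so the client neither knows nor cares whether its workers are local JVMs or cloud instances:

```scala
// Hypothetical sketch: a common provisioning interface, so that swapping local
// JVM workers for cloud-deployed ones only touches the Provisioner implementation.
trait Provisioner {
  // Start `n` worker nodes and return their Akka actor-system addresses
  def startup(n: Int): List[String]
  def shutdown(): Unit
}

// The original (local JVM) experiment, for comparison
class LocalProvisioner extends Provisioner {
  def startup(n: Int): List[String] =
    (1 to n).map(i => s"akka.tcp://cluster@127.0.0.1:${2552 + i}").toList
  def shutdown(): Unit = ()
}

// A cloud-backed implementation would, in its startup method, ask JClouds to
// create compute nodes and Chef to provision them (omitted here).
val local: Provisioner = new LocalProvisioner
println(local.startup(2))
```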
Deployment with JClouds
In order to interact with JClouds, I use class-based inheritance so that the separate concerns of OS flavours, cloud-vendor seasoning and client compute-node salting can be addressed. The following class diagram illustrates this:
For the most part, the Image abstract class defines how to provision compute nodes (in a vendor cloud) and how to bootstrap those nodes (e.g. using a Chef client):
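Abstracting away the actual JClouds and Chef calls, the workflow that the Image class fixes looks roughly like this (a structural sketch only; the real class deals in JClouds templates and node metadata):

```scala
// Structural sketch of the Image abstract class: subclasses supply the
// vendor-specific provisioning, while the base class fixes the deploy workflow.
abstract class Image {
  // Vendor-specific: create a compute node and return an identifier (e.g. its IP)
  def provision(): String

  // Bootstrap the freshly provisioned node, e.g. install and run a Chef client
  def bootstrap(node: String): Unit =
    println(s"bootstrapping $node with chef-client")

  // Template method: provision, then bootstrap
  final def deploy(): String = {
    val node = provision()
    bootstrap(node)
    node
  }
}

// A stand-in subclass, for illustration only
class FakeImage extends Image {
  def provision(): String = "203.0.113.10"
}

println(new FakeImage().deploy())
```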
Provisioning with Chef
Provisioning a compute node boils down to using the bootstrap script to install and run a Chef client provisioner. To facilitate the use of Chef, I define mutable data structures that allow the actual Chef runlist and attributes to be redefined and modified in child classes:
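In spirit, those mutable structures look something like the following (a sketch using plain Scala collections with made-up recipe names; the real code hands these to the JClouds Chef API):

```scala
import scala.collection.mutable

// Sketch: mutable Chef configuration that child classes may extend or override
class ChefConfig {
  val runlist: mutable.ListBuffer[String] = mutable.ListBuffer("recipe[java]")
  val attributes: mutable.Map[String, Any] = mutable.Map.empty
}

// A child class redefines the provisioning simply by mutating the inherited state
class WorkerConfig extends ChefConfig {
  runlist += "recipe[cluster::worker]"
  attributes += ("cluster" -> Map("seed" -> "10.0.0.1:2552"))
}

println(new WorkerConfig().runlist)
```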
Here we see some OS flavouring occurring to our base Image abstract class:
and some vendor-related seasoning (specifically, Rackspace seasoning) being applied to our flavoured image:
Finally, our client node can salt the vendor class to taste:
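The three layers combine roughly as follows (again a structural sketch with illustrative names and attribute values; the real traits drive JClouds template builders and Chef runlists):

```scala
import scala.collection.mutable

// Base: common mutable Chef state (cf. the Image abstract class)
abstract class Image {
  val runlist: mutable.ListBuffer[String] = mutable.ListBuffer.empty
  val attributes: mutable.Map[String, String] = mutable.Map.empty
}

// OS flavouring: pick and configure the operating system
trait UbuntuFlavour extends Image {
  attributes += ("os" -> "ubuntu-12.04")
}

// Vendor seasoning: cloud-provider specific settings
trait RackspaceSeasoning extends Image {
  attributes += ("provider" -> "rackspace-cloudservers-uk")
}

// Client salting: the final node picks its runlist to taste
class WorkerNode extends Image with UbuntuFlavour with RackspaceSeasoning {
  runlist += "recipe[cluster::worker]"
}

println(new WorkerNode().attributes)
```

Because trait bodies run in linearisation order, each layer can freely mutate the state the previous layer set up, which is exactly the flavour/season/salt stacking described above.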
However, in order for the above to work, we first need to ensure that:
- an Enterprise Chef account has been configured
- and that, via the knife command line utility, our cookbooks have been uploaded to the Chef server.
The following shows how, given that ~/.chef has been configured appropriately, this may be achieved (N.B. I have shifted to using sbt-native-packager in this post):
A Failed Experiment?
Should you wish to play with this code, it is available on the experiment-2 branch of my Github repository at https://github.com/carlpulley/jclouds-cluster.
As the setup here is more complex than usual, here's a short video showing:
- the deployment of compute nodes in an Amazon cloud
- JClouds calling the Chef provisioner to configure our compute nodes
- the polling and/or scheduling behaviour of JClouds (a dip into the code is needed to discriminate correctly here)
- the failure to build an Akka cluster.
However, notice that this experiment fails: the client node successfully joins the cluster, but no worker nodes are observed joining it. Moreover, when I ssh into our worker node instances, I can see that the worker actors have also failed to launch (though the actor system is correctly listening on port 2552).
A look at the upstart logs (i.e. /var/log/upstart/cluster.log) shows that the worker actor system has attempted to contact its seed node at a private, non-routable IP address.
To better understand the issues here, we need to think about how a typical Akka message (be it a system or user message) is transmitted from the client actor to the (remote) worker actor:
So imagine that the green actor wishes to communicate a message to the red actor. Clearly, the message that the actor sends must contain information regarding the source and destination actors. When this message is sent, it will be encapsulated in some form of transport (e.g. TCP/IP) which will also have source and destination addresses defined (here we represent these using the colours grey and white). Now this presents a series of networking-related hurdles:
- what happens if the destination actor binds to an address other than the white address (e.g. it actually binds to a red address)?
- what happens if the source or destination nodes use NAT (thus decoupling actor addresses and transport addresses)?
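To make the two address layers concrete, here is a small illustration (the host names are made up):

```scala
// Actor-level address: embedded in the actor path that travels inside messages
val remoteWorker = "akka.tcp://cluster@worker.example.com:2552/user/worker"

// Extract the host that the *actor system* believes it lives on
val actorHost = remoteWorker.split("@")(1).split(":")(0)
println(actorHost)  // the "white" address the sender expects to reach

// Transport-level address: the TCP endpoints that actually carry the bytes.
// Behind NAT these can differ from the actor-level address above; a node that
// binds to a private ("red") address advertises that address inside its
// messages, so replies are routed to an unreachable endpoint.
```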
In short, what happens is that messages do not get correctly routed by the actor systems, which in our case means that clusters do not get built and remote actors fail to launch!
One additional complication here is that actors (and specifically actor systems) need to communicate in a bidirectional manner.
So, how can we resolve these issues? One way of ensuring that actors bind to the correct (public) NICs is to specify their addresses when they are provisioned. However, NATing (particularly at the client node end) appears to be a bit of a show-stopper here!
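For reference, the relevant knobs live in the Akka remoting configuration. The following fragment is illustrative (the addresses are made up): at the time of this experiment the advertised address and the bound address could not be decoupled, and it is precisely this NAT scenario that later Akka releases (2.4+) addressed with separate bind settings:

```hocon
akka.remote.netty.tcp {
  hostname = "203.0.113.10"   # public address advertised inside actor addresses
  port     = 2552
  # Akka 2.4+ only: decouple the advertised address from the bound one
  bind-hostname = "10.0.0.5"  # private address the socket actually binds to
  bind-port     = 2552
}
```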
So, we've seen how we can easily prepare our ping-pong application for scaling into the cloud. We've also seen how we can use JClouds and Chef to deploy and provision worker nodes within an arbitrary vendor's cloud (or at least one that JClouds supports!). However, we have also uncovered a number of issues:
- defining our deployment and provisioning configurations is code heavy
- if the JClouds API blocks, then our application will also block and so cease to be reactive (Image methods that return futures backed by thread-pool executors can avoid these issues)
- when NATing is used, clusters and remote Akka actor deployments need some extra TLC in order to function as we might wish or expect.