HornetQ Clusters – Not a Breakfast Cereal

In my last couple of blog posts, I showed you how to set up a cluster of JBoss EAP 6 servers, and how to use Apache to load balance a web application. We’re now going to look at a more exotic case of clustering: how do we cluster the HornetQ message broker?

Why?

There are a couple of reasons why you might want to cluster your messaging provider. They’re really the same reasons you’d cluster servers, namely scalability and high availability. In the case of message brokers, you may want to:

  • load balance the messages across multiple brokers;
  • provide resiliency in the event that a message broker goes down.

So, let’s first look at a simple case of setting up a HornetQ cluster, then we can test out each of the above use cases.

Nobody Does It Better

JBoss provides pretty much everything you need to create a HornetQ cluster out of the box, including the following core components:

  • a broadcast group – this is the multicast address on which a HornetQ broker advertises itself to other potential cluster members;
  • a discovery group – this is the multicast address on which a HornetQ server listens for broadcasts from other potential cluster members (usually the same address as the broadcast group);
  • a cluster connection – a connection to a messaging cluster, built from a combination of the discovery group, a pre-defined HornetQ connector, and an address (in practice an address prefix such as ‘jms’) on which to load-balance messages.

The above are already configured in the standalone-full-ha.xml file, or the full-ha profile in domain.xml, as follows:

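Roughly speaking (the exact element names and values vary a little between EAP 6 versions, so check your own standalone-full-ha.xml rather than taking this as gospel), the defaults look like this:

    <hornetq-server>
        ...
        <broadcast-groups>
            <broadcast-group name="bg-group1">
                <socket-binding>messaging-group</socket-binding>
                <broadcast-period>5000</broadcast-period>
                <connector-ref>netty</connector-ref>
            </broadcast-group>
        </broadcast-groups>

        <discovery-groups>
            <discovery-group name="dg-group1">
                <socket-binding>messaging-group</socket-binding>
                <refresh-timeout>10000</refresh-timeout>
            </discovery-group>
        </discovery-groups>

        <cluster-connections>
            <!-- load-balance any address starting with 'jms' over the netty connector -->
            <cluster-connection name="my-cluster">
                <address>jms</address>
                <connector-ref>netty</connector-ref>
                <discovery-group-ref discovery-group-name="dg-group1"/>
            </cluster-connection>
        </cluster-connections>
        ...
    </hornetq-server>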

Currently the address we’re balancing is any ‘jms’ address, which should mean any JMS queue or topic you create will be load-balanced.

As you can see, the broadcast and discovery groups above use the same socket binding (namely, ‘messaging-group’). This socket binding defines a multicast address and port that members of the cluster use to talk to each other, in much the same way as regular JBoss clustering. Also, if you check out this socket binding in the socket binding groups section, you’ll see that you can override the address and port at runtime using system properties (specifically, jboss.messaging.group.address and jboss.messaging.group.port).

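From memory, the binding looks something like this (the default multicast address and port may differ in your version):

    <socket-binding name="messaging-group" port="0"
                    multicast-address="${jboss.messaging.group.address:231.7.7.7}"
                    multicast-port="${jboss.messaging.group.port:9876}"/>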

Haven’t Got Time for the Pain

So, to create a HornetQ cluster, we don’t need to change anything? All we need to do is start up 2 JBoss servers, each using the full-ha profile, on a network that supports multicast, right? Well, sort of. There are a couple of gotchas:

  • you need to ensure your servers are bound to interfaces other than ‘localhost’, otherwise HornetQ can’t assign a proper broadcast address;
  • you need to set the ‘cluster password’ identically for both servers. The cluster won’t work if you leave the default password in place – you’ll get a HornetQ security exception.

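For the first gotcha, the simplest fix is to start each server bound to a real network interface, e.g. ./standalone.sh -c standalone-full-ha.xml -b <your-ip-address>. For the second, set the same cluster-password on every hornetq-server; the value below is obviously just an example, and depending on your EAP 6 version the default may already be wired to the jboss.messaging.cluster.password system property, in which case you can pass that on the command line instead:

    <hornetq-server>
        <cluster-password>mySuperSecretClusterPassword</cluster-password>
        ...
    </hornetq-server>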

Another minor pain is that, by default in EAP 6.4, the full-ha profile has no console logging, so you won’t see any messages (good or bad!) related to HornetQ in the server console. Believe me, if you’re doing anything remotely serious with HornetQ clustering, you’ll want to see those messages! I usually copy the logging subsystem configuration from standalone.xml into standalone-full-ha.xml; the alternative is to tail the server.log file (easy on Linux, not so easy on Windows).

Once you’ve dealt with these issues, you should get a cluster established between your 2 servers, and you should see some encouraging log output to that effect (hint: look at the hosts and port numbers in the ClusterConnectionBridge message).

Let the River Run

You can now go ahead and create a destination (I usually do this in the management console as that’s easiest) and start posting messages to queues. You can do this with any JMS-compliant application, either deployed in one of your JBoss servers or externally. Remember that if you’re using the netty (i.e. remoting) connector, you’ll need to make sure your queue’s JNDI name has the java:jboss/exported prefix.
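For example, a queue exposed for remote lookup might be defined in the messaging subsystem like this (the queue name and JNDI entry are purely illustrative):

    <jms-destinations>
        <jms-queue name="testQueue">
            <entry name="java:jboss/exported/jms/queue/testQueue"/>
        </jms-queue>
    </jms-destinations>

A remote client would then look the queue up as jms/queue/testQueue, without the exported prefix.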

You should find that any messages you post to your jms destination are evenly distributed across the cluster. To check this, you can look at the message counts of the queues on each server. You can see them in the Runtime tab of the management console, but I’ve found this to be a little unreliable, and instead tend to go for the CLI:

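The operation is along these lines, assuming the default hornetq-server name and a hypothetical queue called testQueue; point jboss-cli.sh at each instance’s management port in turn and compare the counts:

    /subsystem=messaging/hornetq-server=default/jms-queue=testQueue:read-attribute(name=message-count)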

As you can see in the above example, I sent 5 messages (ostensibly to the first node) but the cluster connection has load-balanced these messages across the cluster: hence I have 3 messages on the first node, and 2 messages on the second – magic!

I Get Along Without You Very Well

So how do we make our cluster resistant to failure? How do we ensure that if one of our servers becomes unavailable, we’re still able to access the messages it contained? Well, in HornetQ, this doesn’t happen automatically. High availability is achieved by creating a backup server, which is paired with a live server, and kicks into action as soon as the live server goes down.

Now, there are a couple of big decisions to make when considering an HA strategy in HornetQ. The first is about what kind of server topology you want to employ:

  • a dedicated topology is one in which you create a dedicated JBoss instance to house your backup HornetQ server. The JBoss instance itself is kind of a ‘shell’, acting purely as a container for HornetQ. This means you can’t deploy applications on it (which seems like a bit of a waste of a JBoss server IMHO!);
  • a collocated topology is one in which you create backup HornetQ servers in the same JBoss instances as your live servers. This is my preference, and there’s a nice symmetry to the creation of live-backup “pairs” of HornetQ servers.

Let’s look in more detail at a collocated topology. Clearly we don’t want to configure a live-backup pair in the same JBoss instance, as the minute we lose that JBoss server, we lose everything. Instead, the idea is to create a live-backup pair that straddles two separate machines, so that it can survive a hardware failure. To do this, we create a HornetQ server in one JBoss instance, whose backup is in another, and vice versa, as illustrated here:

(Diagram: JBoss Instance 1 hosts HornetQ live server A and backup server B; JBoss Instance 2 hosts HornetQ live server B and backup server A.)

So, how do we configure this? It’s actually fairly simple, and involves a fair amount of cutting and pasting! Let’s say we’re going to configure one JBoss server, i.e. one “side” of our cluster: in the standalone-full-ha.xml, you would copy the first HornetQ server’s XML, and paste it underneath. Of course, there are a few rules:

  • you’ll want to give your 2 HornetQ servers different names so that they’re easily distinguishable e.g. “live” and “backup”;
  • you’ll need dedicated data directories for each server;
  • you’ll need additional socket bindings to keep the ports of each server separate;
  • you’ll need to give the backup server a different server id to the live server;
  • the live and backup servers should share a cluster connection, which usually means having identical broadcast group, discovery group and cluster connection configuration;
  • the backup server should have no JMS destinations (connection factories, queues or topics) defined on it;
  • sounds obvious, but the backup server needs to be defined as a backup!

Here’s a sample of what a backup server might look like:

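Heavily abridged, and with element names recalled from the EAP 6 messaging schema rather than copied from a working config (so do double-check against your version), a backup server following the rules above might look like this:

    <hornetq-server name="backup">
        <backup>true</backup>
        <allow-failback>true</allow-failback>
        <cluster-password>mySuperSecretClusterPassword</cluster-password>

        <!-- dedicated data directories for the backup server -->
        <paging-directory path="hornetq-backup/paging"/>
        <bindings-directory path="hornetq-backup/bindings"/>
        <journal-directory path="hornetq-backup/journal"/>
        <large-messages-directory path="hornetq-backup/large-messages"/>

        <!-- separate socket bindings and a different in-vm server id from the live server -->
        <connectors>
            <netty-connector name="netty" socket-binding="messaging-backup"/>
            <in-vm-connector name="in-vm" server-id="1"/>
        </connectors>
        <acceptors>
            <netty-acceptor name="netty" socket-binding="messaging-backup"/>
            <in-vm-acceptor name="in-vm" server-id="1"/>
        </acceptors>

        <!-- same broadcast group, discovery group and cluster connection as the live server -->
        <broadcast-groups>...</broadcast-groups>
        <discovery-groups>...</discovery-groups>
        <cluster-connections>...</cluster-connections>

        <!-- note: no connection factories, queues or topics defined here -->
    </hornetq-server>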

But configuring backup servers is only half the story. How do we ensure that, when our backup server kicks in, it is “current”, i.e. that its JMS destinations have the latest message state? Again, you have a choice here:

  • shared store: in this scenario, both the live and backup servers share the same data directories where the HornetQ messages are persisted;
  • replication: in this scenario, the live and backup servers perform in-memory replication, and they each keep their own separate data directories.

When choosing between shared store and replication strategies, you need to weigh up some pros and cons, such as availability of a fast SAN, possible performance degradation with in-memory replication etc. Red Hat also has some recommendations regarding shared store configuration, which you should check out.

Fortunately, it’s not difficult to switch between them, so you can test them out and see which is best for you. I don’t have a particular preference, and I’m going to discuss both.

Coming Around Again

For shared store, as its name suggests, you need the live-backup pair to share the same file system. That means the same paging, bindings, journal and large-messages directories. In our collocated topology example, HornetQ live A on JBoss Instance 1 will need to share the same directories as HornetQ backup A on JBoss Instance 2. You also need to explicitly state that you’re using shared store.

So, on JBoss Instance 1 you might see a configuration like this:

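As a sketch (the shared mount point /mnt/hornetq is made up; the important parts are shared-store set to true and the shared, absolute paths):

    <!-- JBoss Instance 1: live server A -->
    <hornetq-server name="live">
        <shared-store>true</shared-store>
        <paging-directory path="/mnt/hornetq/server-a/paging"/>
        <bindings-directory path="/mnt/hornetq/server-a/bindings"/>
        <journal-directory path="/mnt/hornetq/server-a/journal"/>
        <large-messages-directory path="/mnt/hornetq/server-a/large-messages"/>
        ...
    </hornetq-server>

    <!-- JBoss Instance 1: backup server B -->
    <hornetq-server name="backup">
        <backup>true</backup>
        <shared-store>true</shared-store>
        <paging-directory path="/mnt/hornetq/server-b/paging"/>
        <bindings-directory path="/mnt/hornetq/server-b/bindings"/>
        <journal-directory path="/mnt/hornetq/server-b/journal"/>
        <large-messages-directory path="/mnt/hornetq/server-b/large-messages"/>
        ...
    </hornetq-server>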

And, on JBoss Instance 2, like this:

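Again just a sketch, mirroring the first instance:

    <!-- JBoss Instance 2: live server B (same shared server-b directories as backup B above) -->
    <hornetq-server name="live">
        <shared-store>true</shared-store>
        <paging-directory path="/mnt/hornetq/server-b/paging"/>
        ...
    </hornetq-server>

    <!-- JBoss Instance 2: backup server A (same shared server-a directories as live A above) -->
    <hornetq-server name="backup">
        <backup>true</backup>
        <shared-store>true</shared-store>
        <paging-directory path="/mnt/hornetq/server-a/paging"/>
        ...
    </hornetq-server>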

Mockingbird

For replication, it’s the exact same principle, but using what’s called a ‘backup group’ instead of a shared store. HornetQ servers that are members of the same backup group are paired, and will have their messages replicated from the live member of the group to the backup member.

A couple of other config settings which I’ve found useful for replication:

  • check-for-live-server: once the live server comes back up, a re-sync occurs and the backup server shuts back down again;
  • failover-on-shutdown: triggers immediate failover to the backup server on shutdown of live (useful for testing).

Here’s the sample config for JBoss Instance 1:

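Sketched out, with backup group names of my own invention (‘group-a’ pairs live A with backup A, ‘group-b’ pairs live B with backup B) and relative data directories:

    <!-- JBoss Instance 1: live server A -->
    <hornetq-server name="live">
        <backup>false</backup>
        <shared-store>false</shared-store>
        <backup-group-name>group-a</backup-group-name>
        <check-for-live-server>true</check-for-live-server>
        <failover-on-shutdown>true</failover-on-shutdown>
        <paging-directory path="hornetq-live/paging"/>
        <bindings-directory path="hornetq-live/bindings"/>
        <journal-directory path="hornetq-live/journal"/>
        <large-messages-directory path="hornetq-live/large-messages"/>
        ...
    </hornetq-server>

    <!-- JBoss Instance 1: backup server B (replicates from live B on Instance 2) -->
    <hornetq-server name="backup">
        <backup>true</backup>
        <shared-store>false</shared-store>
        <backup-group-name>group-b</backup-group-name>
        <paging-directory path="hornetq-backup/paging"/>
        <bindings-directory path="hornetq-backup/bindings"/>
        <journal-directory path="hornetq-backup/journal"/>
        <large-messages-directory path="hornetq-backup/large-messages"/>
        ...
    </hornetq-server>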

And for JBoss Instance 2:

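And the mirror image:

    <!-- JBoss Instance 2: live server B -->
    <hornetq-server name="live">
        <backup>false</backup>
        <shared-store>false</shared-store>
        <backup-group-name>group-b</backup-group-name>
        <check-for-live-server>true</check-for-live-server>
        <failover-on-shutdown>true</failover-on-shutdown>
        <paging-directory path="hornetq-live/paging"/>
        ...
    </hornetq-server>

    <!-- JBoss Instance 2: backup server A (replicates from live A on Instance 1) -->
    <hornetq-server name="backup">
        <backup>true</backup>
        <shared-store>false</shared-store>
        <backup-group-name>group-a</backup-group-name>
        <paging-directory path="hornetq-backup/paging"/>
        ...
    </hornetq-server>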

Notice in the above configurations that the data directories differ from shared store in that they are relative, i.e. there are actually 4 separate sets of data directories in play here (one for each HornetQ server). That’s not the case with shared store!

The Stuff That Dreams Are Made Of

Right, let’s test this out! My example uses replication rather than shared store, but the principle is the same.

Start up the 2 JBoss instances, and we should have a total of 4 HornetQ servers: a live and a backup on each JBoss instance. We’ll be using the CLI throughout to check:

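For example, connecting the CLI to each instance in turn, you can list the hornetq-server resources with something like:

    /subsystem=messaging:read-children-names(child-type=hornetq-server)

On each instance you’d expect to see both the live and the backup server listed.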

If you see messages like “backup announced” and “synchronized with live-server” in the consoles, then you should be good to go.

However, unlike in our previous example, we can’t do a straightforward message count on the queue, as this won’t show up on backup servers. Instead, we need to look at the runtime queues, using the following CLI operation:

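Something along these lines should do it; the server name ‘live’ and queue ‘testQueue’ are just my examples, and note that the core queue behind a JMS queue carries the jms.queue. prefix:

    /subsystem=messaging/hornetq-server=live/runtime-queue=jms.queue.testQueue:read-resource(include-runtime=true)

The message count should appear among the runtime attributes in the output.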

Armed with this, post some messages to your cluster, which should be load balanced across the 2 live HornetQ servers, as in our previous example:

(CLI output: 3 messages on live server A, 2 on live server B.)

The backup servers should not even have instantiated these queues yet:

(CLI output: no runtime queues yet on either backup server.)

Now, take down the first JBoss instance and see what happens! HornetQ live server A should now be gone, and backup server A (on the other JBoss instance) should have fired up. What’s more, it should contain 3 messages!

(CLI output: backup server A is now active on JBoss Instance 2, with 3 messages on the queue.)

If you re-start the JBoss instance that you shut down, you should see some console output to the effect that the live server is re-syncing with the backup. And if you check the queues, you should see that the backup server has gone down, and the live server has its 3 messages back!

(CLI output: live server A is back with its 3 messages, and backup server A has stopped.)

Et voilà – high availability messaging in HornetQ!

You’re So Vain

HornetQ is a big topic, and we’ve really just scratched the surface. There’s a lot more you can configure in terms of message distribution, persistence and even the networking protocols used in clustering. However, hopefully I’ve given you a bit of an insight into some of the capabilities. See you on another blog!