MessageBus – Using a Message Bus to Decouple Our Services and Allow Scaling

Written by Martin Führlinger, Software Engineer Backend

Why decoupling our services matters

As I described in my last blog post about database migration, we apply the single-responsibility principle to our entities and services. This helps a lot in terms of data ownership and database access, but it does not resolve the coupling of services, as they still need to communicate directly. Let’s consider an example: On a beautiful day you go out for a run using the Runtastic app. Afterwards you look at your activity on runtastic.com, where it should count towards the leaderboard and show your newly achieved records.

As we have one service for one purpose, we have a service responsible for the running activity, one for the leaderboard and another one for the records. Let’s ignore all the others which are taking care of your user profile, some statistics, your routes, training plans, …

As a consequence of the separation, the leaderboard service and the records service need to be informed about the creation of the running session. The service for storing the running session could, for example, send HTTP requests to those two interested services, but this leads to a tighter coupling of the services.

That means the service responsible for the running session has to know which other services are interested in that entity. And what if one of these services is not available for some reason? We were not satisfied with that solution, so we decided to use a message bus for this type of communication instead.

How we use the message bus

In our case, we decided to use RabbitMQ®, but it does not really matter which message bus you use, as long as it supports the necessary publish/subscribe functionality. In RabbitMQ, each message is published with a routing key, which we call the topic. Each consumer defines which routing key pattern it listens to, and it also defines the name of its queue. For example, the leaderboard service could listen to the routing key run_session.# and name its queue leaderboard.run_sessions, so it would receive all messages with topics starting with “run_session.”. The topics and the queue names do not need to match, but we name the queues like this to keep a better overview. With that in mind, we decided to define our topics like this:

“<entity-type>.<crud operation>.<one other attribute>”

  • Entity-Type is for example a run_session, or a friendship, or a user.
  • CRUD operation is basically created, updated, deleted (we have not found a use case for sending messages for “read” yet).
  • The last part depends on the entity type. For run_sessions it may be useful to have the state there (completed, live, …), for friendships it may be accepted or denied, for other entities it might be something completely different.

So, for example, “run_session.created.completed” is the topic for a run_session entity which has been created and has the state completed. This lets RabbitMQ do the filtering for us, so fewer messages are delivered to each consumer: a service that wants to add a feed entry for accepted friendships is not interested in newly created friendship requests. In this example, the topic would be “friendship.updated.accepted” on the queue “newsfeed.friendship.updated.accepted”. RabbitMQ also supports wildcards in routing key patterns: * matches exactly one word and # matches zero or more words.
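The naming scheme and the wildcard rules can be tried out without a broker. The following is a minimal sketch in plain Python, not RabbitMQ's own implementation; the function names build_topic and topic_matches are made up for this illustration.

```python
def build_topic(entity_type: str, operation: str, attribute: str) -> str:
    """Join the three parts into a topic like 'run_session.created.completed'."""
    return ".".join((entity_type, operation, attribute))


def topic_matches(pattern: str, topic: str) -> bool:
    """Check a dot-separated topic against a pattern.

    '*' stands for exactly one word, '#' for zero or more words.
    This only imitates the wildcard rules for illustration purposes.
    """
    def match(p: list, t: list) -> bool:
        if not p:
            return not t  # pattern used up: topic must be used up too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' either absorbs nothing, or absorbs one word and tries again
            return match(rest, t) or (bool(t) and match(p, t[1:]))
        if not t:
            return False
        return (head == "*" or head == t[0]) and match(rest, t[1:])

    return match(pattern.split("."), topic.split("."))


topic = build_topic("run_session", "created", "completed")
leaderboard_gets_it = topic_matches("run_session.#", topic)
newsfeed_gets_it = topic_matches("friendship.updated.accepted", topic)
```

Here the leaderboard pattern matches the new run session, while the newsfeed pattern for accepted friendships does not, so that message would never reach the newsfeed queue.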

As you can see in the above picture, the service responsible for an entity publishes messages according to what happened with that entity. All interested services listen to that topic on a specific queue.
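To make the flow concrete, here is a small in-memory sketch of one publisher and several queues. It is a toy stand-in, not RabbitMQ: the class InMemoryBus is invented for this example, bindings are simplified to plain topic prefixes instead of wildcard patterns, and the queue name records.run_sessions is an assumption that follows our naming scheme.

```python
from collections import defaultdict, deque


class InMemoryBus:
    """Toy message bus: every bound queue gets its own copy of a message."""

    def __init__(self):
        self.bindings = []                # (topic prefix, queue name) pairs
        self.queues = defaultdict(deque)  # queue name -> pending messages

    def bind(self, queue_name, topic_prefix):
        self.bindings.append((topic_prefix, queue_name))
        self.queues[queue_name]           # make sure the queue exists

    def publish(self, topic, payload):
        # A set ensures a queue receives the message once, even if
        # several of its bindings match the topic.
        targets = {q for prefix, q in self.bindings if topic.startswith(prefix)}
        for queue_name in sorted(targets):
            self.queues[queue_name].append((topic, payload))
        return len(targets)


bus = InMemoryBus()
bus.bind("leaderboard.run_sessions", "run_session.")
bus.bind("records.run_sessions", "run_session.")
bus.bind("newsfeed.friendship.updated.accepted", "friendship.updated.accepted")

# The run session service publishes once and knows nothing about consumers.
delivered_to = bus.publish("run_session.created.completed", {"id": 1})
```

The publisher makes a single call, and both the leaderboard queue and the records queue end up with their own copy of the message, while the newsfeed queue stays empty.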

Some pitfalls and basic rules we follow

During development we also encountered some pitfalls and we found some basic rules we now apply to keep it simple:

  • “Avoid loops”: e.g. records service receives an update, and adds some information to the run_session, which causes another update message.
  • “Add changed attributes”: update messages should contain the changed attributes of that entity. This increases performance for the consumer because the receiver can ignore the message if irrelevant attributes have been changed. And it also helps to avoid loops, as irrelevant changes do not lead to another message if nothing is processed.
  • “Use topics for filtering”: as described above, we use topics for filtering, but don’t add too much information there, as it can get complicated fairly quickly. Our pattern with CRUD and one additional attribute ended up working pretty well.
  • “Send messages near to the action”: send the message as near as possible, in code and time, to the triggering action. So send an update message directly after an update. As we use the repository pattern, we only send messages with after hooks “inside” the repository using receptacle wrappers. As a result, the calling business logic does not know about messages at all.
  • “Add the service name to your queue name”: this prevents two different services from missing messages because they use the same queue name. When several consumers share one queue, RabbitMQ by design delivers each message to only one of them. See more on this in the RabbitMQ documentation.
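Several of these rules can be illustrated together. The sketch below is written in plain Python under stated assumptions and is not our production code or the receptacle API: RunSessionRepository, handle_update, the attribute names, and the message layout are all hypothetical. It shows a repository that publishes from an after hook, includes the changed attributes in update messages, and a records consumer that ignores irrelevant changes so that its own update does not start a loop.

```python
class RunSessionRepository:
    """Hypothetical repository: messages are sent only from in here."""

    def __init__(self, publish):
        self._rows = {}
        self._publish = publish  # callable(topic, message), e.g. a bus client

    def create(self, session_id, state, **attrs):
        row = {"id": session_id, "state": state, **attrs}
        self._rows[session_id] = row
        self._publish(f"run_session.created.{state}", {"id": session_id})
        return row

    def update(self, session_id, **attrs):
        row = self._rows[session_id]
        changed = sorted(k for k, v in attrs.items() if row.get(k) != v)
        row.update(attrs)
        if changed:  # after hook: publish only if something really changed
            topic = f"run_session.updated.{row['state']}"
            self._publish(topic, {"id": session_id, "changed": changed})
        return row


# Attributes the (hypothetical) records service cares about.
RECORD_RELEVANT = {"distance_m", "duration_s"}


def handle_update(repo, message):
    """Records consumer: skip messages that changed nothing relevant."""
    if RECORD_RELEVANT.isdisjoint(message["changed"]):
        return False
    repo.update(message["id"], has_record=True)  # triggers another message
    return True


log = []  # stands in for the message bus
repo = RunSessionRepository(lambda topic, message: log.append((topic, message)))
repo.create(1, state="completed", distance_m=5000)
repo.update(1, distance_m=5200)
repo.update(1, distance_m=5200)  # nothing changed, so no message is sent

first = handle_update(repo, log[1][1])   # distance changed: processed
second = handle_update(repo, log[2][1])  # only has_record changed: ignored
```

The consumer's own update produces one more message, but because that message lists only has_record as changed, the consumer drops it and the chain ends there.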

Pros and Cons

Nothing is without its benefits and drawbacks, and we encountered a number of pros and cons:

Pros

  • Debugging is easier, as you can test the publisher and the consumer in isolation and see where the problem occurs.
  • Decoupling publishers from your consumers. Basically your publishers do not need to know anything about the consumers any more.
  • You don’t block your publishers with HTTP requests to consumers.
  • Scaling only has to be done on the consumer side, where the expensive operation is executed, because the publisher does not have to wait for an answer as it would with a synchronous HTTP request. E.g. one publisher can feed thousands of consumers.
  • One sent message can be received by dozens of consumers, so this speeds up the publisher (one message publish vs. multiple HTTP requests).
  • Offline consumers do not block your publishers. This increases the resilience of your system.

Cons

  • Debugging can be harder too, as it adds another asynchronous component to your system.
  • You have to pay closer attention to race conditions, as it is more asynchronous than before (which you have to keep an eye on anyhow).
  • You don’t have the result of whatever happens on the consumer immediately (which is not necessary most of the time).
  • It is easy to introduce loops when updating an entity upon receiving an update message.

To sum it up, we are very happy using RabbitMQ, as it is proving itself to be very stable, reliable, and fast. It decouples our services, it speeds up our services and it keeps our system cleaner.

RabbitMQ is a trademark of Pivotal Software, Inc. in the U.S. and other countries.
