Thursday, March 21, 2013

How to Decide: Single Queue or Multiple Queues?


Introduction


Typically, a Service Oriented Architecture (SOA) system has multiple, distinct services that receive the same type of message. For example, a purchase order can trigger either the creation or the modification of an order, and the two may be handled by different business flows that connect to different back-end systems. Ideally, you expose to the outside world one coarse-grained service for order creation, another coarse-grained service for order modification, and so on, so that an edge application, such as an order capturing application, can call the appropriate service.

The challenge arises when the only interfaces available to the external edge application or applications are message queues. This immediately raises the question: should you use the same queue or different queues for these messages, which are of the same type but intended for different services? What are the pros and cons? What are the technical challenges and best practices for each approach?

This blog post tries to tackle these challenges. There is no single solution that fits all situations. Rather, we will examine the pros and cons of each solution from different angles.

Edge Applications

The first angle to look at is, of course, the edge application that initiates the messages.

Some edge applications can only communicate with a single queue for all types of business transactions. In this situation you have no choice but to opt for the single-queue approach. Fortunately, the message headers or content typically contain the routing information, so that the messages can be routed to the matching services downstream.

Message Routing

When you have multiple services, the first thing you want to make sure of is that each service receives only the messages that belong to it and that no cross-routing happens.

In the dedicated-queue approach, each service uses a message adapter to consume messages from its own queue, so there is no need to route messages from the queues to the services.

In contrast, the shared-queue approach requires the system to be able to route each message to the service it belongs to. The routing logic has to be based on the incoming messages themselves. If there is no routing information embedded in the messages, you cannot use the shared-queue approach.
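To make the routing requirement concrete, here is a minimal Python sketch of routing from a shared queue based on a message header. It is only an illustration, not any product's API: the "operation" header, the dictionary message shape, and the names make_router and MissingRoutingInfo are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Handler = Callable[[Message], None]


class MissingRoutingInfo(Exception):
    """Raised when a message carries no usable routing information."""


def make_router(routes: Dict[str, Handler], header: str = "operation"):
    """Build a routing function that dispatches on one message header.

    `routes` maps a header value to the service that should receive the
    message. The header name "operation" is a hypothetical choice.
    """

    def route(message: Message) -> str:
        key = message.get("headers", {}).get(header)
        if key not in routes:
            # Without routing information the shared queue cannot work.
            raise MissingRoutingInfo(f"no route for {header}={key!r}")
        routes[key](message)
        return key

    return route


# Two stand-in "services" that simply collect what they receive.
created: List[Message] = []
modified: List[Message] = []

route = make_router({"create": created.append, "modify": modified.append})
route({"headers": {"operation": "create"}, "body": "PO-1"})
route({"headers": {"operation": "modify"}, "body": "PO-1"})
```

A message without the header is rejected rather than guessed at, which mirrors the point above: no embedded routing information, no shared queue.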

Please note that a mid-process receive via a message adapter faces the same routing challenge as the initial receive. Contrary to a common misunderstanding, the correlation mechanism for a mid-process receive cannot help the adapter route messages to the correct composite.

Operation Management

From the perspective of operational management complexity, the shared-queue approach has an advantage over the dedicated-queue approach: you manage a single queue rather than multiple queues. This is usually the main reason why people choose a shared queue over dedicated queues.

To ease the complexity of managing dedicated queues, you may want to name each queue after the service it belongs to.
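One way to keep such a naming convention consistent is to derive the queue name from the service name in code instead of typing it by hand. The sketch below is only an illustration: the function name queue_name_for and the "jms/queue" prefix are assumptions, and your own naming standard may look quite different.

```python
import re


def queue_name_for(service_name: str, prefix: str = "jms/queue") -> str:
    """Derive a queue name from the name of the service that owns it.

    The "jms/queue" prefix is a hypothetical convention for this example.
    """
    # Collapse anything that is not a letter or digit into an underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", service_name).strip("_")
    if not slug:
        raise ValueError("service name must contain a letter or digit")
    return f"{prefix}/{slug}"


print(queue_name_for("Order Creation"))
print(queue_name_for("Order Modification"))
```

With a rule like this, anyone looking at a queue in the administration console can tell which service consumes from it.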

Both the shared-queue and dedicated-queue approaches require you to answer the same question: should you run the message queue or queues on the same servers as the services, or on their own servers? I would recommend running the queue or queues in a server cluster separate from the services, so that queues and services do not compete with each other for system resources. This separation also makes performance and high-availability issues easier to diagnose.

Separation of Concerns

From the perspective of separation of concerns, the dedicated queue approach has advantages over the shared queue approach.

The concerns we can separate by using dedicated queues include:
  • Performance profile and requirements
  • High-availability requirements
  • Service release cycles
Now let's look into each of them, to see how the dedicated-queue approach addresses the separation of concerns differently from the shared-queue approach.

Performance profiles and requirements

First is the performance profile and requirements. Some services may have much higher transaction volumes than others. Some services may involve much larger messages. Managing the performance of a message queue differs greatly between these profiles and requirements. By separating the queues, each queue can be managed individually with minimal impact on the others.

High-availability requirements

Different high-availability requirements also lead to different configurations of the message queues. Some services are more mission-critical than others and therefore need to be highly available, which requires whole server migration accompanied by a disaster recovery strategy. Others may be able to tolerate more downtime. The requirements for the HA infrastructure underlying the message queues for these services could therefore be very different. Separating these infrastructures can help you save resources: you only need to invest resources and capacity in some of the message queues, not all of them.

Service release cycles

When services share the same message queue but have different release cycles, you need to consider whether the software release cycles of services can interfere with each other.

In the dedicated-queue approach, adding a new service requires creating a new queue but does not interrupt the existing services. The same applies to updating or redeploying an existing service.

In the shared-queue approach, adding a new service or updating an existing service could interrupt the existing services. These interruptions could happen in the following ways:
  • If you use message selectors, the message selection criteria are built into each service independently. Adding a new service typically only involves defining the selection criteria within the scope of the new service, without impacting the others. However, there could be exceptional cases where you also need to update the selection criteria of the existing services, especially if the semantics of the messages change.
  • If you use a message routing service, adding a new service may also require updating the centralized routing service so that it can route messages to the new service. Updating the routing service always causes a temporary interruption to the existing services, because all services depend on it.
  • If you use a message routing service, updating an existing service could interrupt other services as well. For example, if the queue happens to hold a message intended for a service that has been brought down, the router service will have no destination to route to and will encounter errors. To mitigate this risk, you may want to set up an error queue to which the router service redirects such messages.
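
The error queue mitigation in the last bullet can be sketched in Python as follows. This is a simplified in-memory illustration, not a real messaging API: the Router class, the "operation" header, and the replay_errors step are all assumptions for this example. The original recommendation only covers redirecting the messages; replaying them afterwards is one possible follow-up.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict

Message = Dict[str, Any]
Handler = Callable[[Message], None]


class Router:
    """Hypothetical router that parks undeliverable messages."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}
        self.error_queue: Deque[Message] = deque()

    def register(self, key: str, handler: Handler) -> None:
        self.handlers[key] = handler

    def take_down(self, key: str) -> None:
        # Simulates a service being undeployed for an update.
        self.handlers.pop(key, None)

    def route(self, message: Message) -> bool:
        key = message.get("headers", {}).get("operation")
        handler = self.handlers.get(key)
        if handler is None:
            # No destination available: redirect instead of failing.
            self.error_queue.append(message)
            return False
        handler(message)
        return True

    def replay_errors(self) -> int:
        """Retry parked messages once; return how many were delivered."""
        pending = list(self.error_queue)
        self.error_queue.clear()
        return sum(1 for message in pending if self.route(message))


received = []
router = Router()
router.register("modify", received.append)

router.take_down("modify")                      # service is being updated
router.route({"headers": {"operation": "modify"}, "body": "PO-7"})
parked = len(router.error_queue)                # message waits here

router.register("modify", received.append)      # service is back
replayed = router.replay_errors()
```

The other services keep receiving their messages while the parked ones wait, so a redeployment of one service does not turn into router errors for everyone.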

Conclusion

By now, you should be equipped with a framework for evaluating the pros and cons of the dedicated-queue and shared-queue approaches, along with the best practices for each.