Basic Message Bus Concepts
Message passing has been around since ARPA created the first packet-switching computer networks. Since then, streaming protocols have become far more popular than the simpler messaging schemes and have all but eclipsed, and to some extent hidden, the benefits of message passing. With the growing popularity of virtualization, cloud computing and parallel processing, the resulting distributed systems pose architectural challenges which message passing, and particularly the message bus, has answered with great success in even the most demanding environments.
At its core, the message bus is a facility for exchanging data between components of a system in a decoupled manner. In this case, decoupled means the sender of the data need not know the location or even the number of receivers. All the sender need know is what data to place in the message and to which group the message is to be sent. The sender then passes the message to the message broker, and the broker routes it to those participants interested in messages in that group.
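The decoupling described above can be sketched in a few lines of Python. This is a toy, in-process illustration and not any vendor's API: the MessageBroker class, its join and send methods, and the group name are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class MessageBroker:
    """Toy in-process broker (hypothetical): routes data to handlers by group name."""

    def __init__(self) -> None:
        self._groups: DefaultDict[str, List[Handler]] = defaultdict(list)

    def join(self, group: str, handler: Handler) -> None:
        """Register a participant's interest in a message group."""
        self._groups[group].append(handler)

    def send(self, group: str, data: str) -> int:
        """Deliver data to every handler in the group; return how many received it."""
        handlers = self._groups.get(group, [])
        for handler in handlers:
            handler(data)
        return len(handlers)


broker = MessageBroker()
received: List[str] = []
broker.join("sales.order.create", lambda data: received.append("billing:" + data))
broker.join("sales.order.create", lambda data: received.append("shipping:" + data))

# The sender names only the group; it holds no address for any receiver.
delivered = broker.send("sales.order.create", "order 1001")
```

The sending code never references the billing or shipping handlers. Adding a third receiver would require no change on the sending side.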
Contrast the message bus with traditional communications schemes, which require every sender of a message or data stream to know the address of every component interested in receiving its data. Even where there is only one receiver of a message, which in itself can be a limiting factor, managing endpoint addresses makes “point-to-point” communications difficult, particularly in dynamic environments where the composition and number of endpoints change during a component's operational life, as is often the case in distributed systems.
The best analogy is the difference between the telephone network and two-way radios. With the public telephone network, one uses a telephone number to establish a connection to a particular recipient of a message. With a two-way radio, the sender broadcasts the message on a particular radio frequency, and anyone tuned to that frequency receives it.
Concepts
It is important from the outset to define several basic terms and concepts which are core to any discussion of messaging. Even someone who has worked with a particular implementation of a message bus will need to be aware of terms used in other message bus environments if for no other reason than to translate the concepts into the vernacular of their own messaging environment.
There are a variety of message bus technologies in operation today and while many terms are shared between these technologies their use varies slightly. In many environments, a single concept is given different names and the term used depends on the implementation or the vendor providing the technology. Working across different messaging implementations then causes confusion as multiple terms will be used to refer to a single concept.
This article seeks to homogenize common concepts of the message bus into a single set of terms to facilitate discussions regardless of the implementation or vendor technology being used.
Bus Participants
Any component that sends or receives messages on the bus is called a bus participant. A bus participant may send messages, receive messages, or do both. In any case, bus participants process messages and often represent the true business logic components of a system.
Bus participants are often sub-categorized by the directionality of the messages being processed. Participants which create messages are often called message producers while participants which receive messages are called consumers.
Bus participants are quite often both producers and consumers of messages, particularly in service-oriented systems where request messages are sent and the results of processing are returned to the sender of the request. In this case, a bus participant acting as a client will be both a producer of request messages and a consumer of response messages.
Message
The message is the primary data unit for the message bus. Bus participants marshal data primitives through an agreed-upon message construct which can take a variety of forms. Most commonly, this message construct is composed of a set of data elements grouped into two basic sections: the header and the body.
The message header contains data elements called message fields which are used in the processing of the message as a whole. Message headers normally contain addressing data which assists in message routing and delivery. Header data also includes timestamps, addresses and identifiers which allow the message to be tracked and correlated in its delivery and processing at the application layer.
While application logic may access header data, it is the data in the body of the message, again in separate and addressable message fields, which is of particular interest to the bus participant. There is normally no need for a message unless there is a body to be sent, at least from the perspective of the application layer. Messages without a body can be sent to satisfy some exchange protocol between participants, such as an acknowledgment of delivery or processing.
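As a rough illustration of the header and body split, the sketch below models a message as two dictionaries of fields. The Message class, the new_message helper and the field names (group, message_id, timestamp, correlation_id) are assumptions made for this example; real message bus products define their own header fields.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Message:
    """Hypothetical message construct: header fields plus body fields."""
    header: Dict[str, Any]
    body: Dict[str, Any] = field(default_factory=dict)


def new_message(group: str,
                body: Optional[Dict[str, Any]] = None,
                correlation_id: Optional[str] = None) -> Message:
    """Build a message whose header carries routing and tracking fields."""
    header: Dict[str, Any] = {
        "group": group,                    # addressing data used for routing
        "message_id": str(uuid.uuid4()),   # identifier used for tracking
        "timestamp": time.time(),          # creation time
    }
    if correlation_id is not None:
        header["correlation_id"] = correlation_id  # ties a reply to its request
    return Message(header=header, body=dict(body or {}))


# A message with a body, which is what the application layer cares about.
order = new_message("sales.order.create", {"order_id": 1001, "quantity": 3})

# A body-less message that only acknowledges processing of the first one.
ack = new_message("sales.order.ack", correlation_id=order.header["message_id"])
```

The acknowledgment carries no body at all. Everything the receiver needs is in the header, which is the exchange-protocol case described above.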
Messages are often marshaled between different formats, normally that of working memory and that of the communications protocol. The format of the message as it is passed across the communications network is called the wire format and is specific to the message passing protocol or vendor technology being used.
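To make the distinction between the in-memory format and the wire format concrete, here is a minimal sketch that uses UTF-8 encoded JSON as a stand-in wire format. That choice is purely an assumption for illustration, since each protocol or vendor defines its own format, and the to_wire and from_wire function names are hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def to_wire(header: Dict[str, Any], body: Dict[str, Any]) -> bytes:
    """Marshal the in-memory message into bytes suitable for the network."""
    envelope = {"header": header, "body": body}
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


def from_wire(data: bytes) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Unmarshal bytes received from the network back into header and body."""
    envelope = json.loads(data.decode("utf-8"))
    return envelope["header"], envelope["body"]


header = {"group": "sales.order.create", "message_id": "m-1"}
body = {"order_id": 1001, "quantity": 3}

wire = to_wire(header, body)                   # what travels across the network
decoded_header, decoded_body = from_wire(wire)  # what the receiver works with
```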
There are various message communications protocols and wire formats in wide use today. Because each protocol has its own unique combination of strengths and weaknesses, it is common to find a variety of different message passing protocols and wire formats used in any data processing environment.
Groups
The primary addressing mechanism for message delivery is the message group. Each message is assigned to one group by its sender. All messages which share this classification are handled in a similar fashion and are delivered to a similar set of participants. For the purposes of discussion, consider the message group simply a string of characters which acts as a message classifier.
A message group is an endpoint for a message. All that any message producer knows about the message bus is that there are groups of messages which share some logical correlation and are to be considered and processed similarly. The producer of the message need not know anything of the possible consumers of the message, how they are implemented, where they are operating or even how many consumers there are. The producer needs to know only the message group before sending a message, and can be sure the message will be routed to the appropriate bus participants.
Although this is a topic large enough for an article unto itself, it is useful now to mention the importance of naming governance. Because the naming of message groups is (usually) determined by the message producers, it is very easy for message group names to vary widely in formatting. This results in administration issues as use of the message bus grows, because group names have to be tracked along with the messages they contain and the classes of message consumers to be found on those groups.
This is complicated when group wildcards (also the subject of another article) are used to observe messages on multiple message groups. In this case the consumer joins a message group with a name such as “sales.order.>” and receives messages whose groups match the wildcard. This works fine if all sales order related message groups begin with “sales.order”, such as “sales.order.create” or “sales.order.update”. Development teams run into problems when new naming schemes are used: the wildcard will not match a group named “order.sales.delete”, so the consumer has to join additional groups with separate calls. This transfers work from the broker to the API and the consumer.
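The matching behavior can be sketched as follows. The exact wildcard semantics are broker specific and are not defined in this article, so the sketch assumes that group names are dot-separated elements and that a trailing “>” matches one or more remaining elements. The group_matches function is a hypothetical helper, not part of any broker API.

```python
def group_matches(pattern: str, group: str) -> bool:
    """Return True if a dot-separated group name matches the pattern.

    Assumed semantics: ">" is valid only as the last element of the
    pattern and stands for one or more remaining elements.
    """
    pattern_parts = pattern.split(".")
    group_parts = group.split(".")
    for index, part in enumerate(pattern_parts):
        if part == ">":
            is_last = index == len(pattern_parts) - 1
            has_remaining = len(group_parts) > index
            return is_last and has_remaining
        if index >= len(group_parts) or part != group_parts[index]:
            return False
    # No wildcard: the names must have the same number of elements.
    return len(pattern_parts) == len(group_parts)


consistent = group_matches("sales.order.>", "sales.order.create")
inconsistent = group_matches("sales.order.>", "order.sales.delete")
```

Here consistent is True and inconsistent is False. The second group follows a different naming scheme, so a consumer would need a second, separate subscription to see its messages.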
The naming of message groups is of strategic importance to any message bus implementation as group names are the primary address mechanism. If a message bus implementation is to be manageable as it grows, group names must be governed wisely.
Broker
Message brokers are software components responsible for accepting messages from their clients (i.e. producers) and performing the processing necessary to ensure messages are routed and delivered to the appropriate receivers. Brokers are essentially services which accept messages for delivery to the appropriate bus participants.
Message brokers are a form of Message Transfer Agent, very much like the electronic mail (email) servers on which so much of our society depends for communications these days. A mail client creates a message, opens a client connection to the mail server and sends the message to the server. Once the message is received, the mail server processes it and delivers it to the appropriate destination. The message broker operates in very much the same manner, except the broker can handle complex data types and near real-time message delivery, all while handling participant management, complex message routing and delivery, and in some cases persistence operations to guarantee message delivery.
Most brokers operate similarly to one another: they accept a connection from a bus participant and exchange messages with that participant over some protocol specific to that broker. The details differ from broker to broker, and while there are standard APIs to assist the developer in connecting to and interacting with the message broker, each broker will normally require its own libraries and API. This is true even where a standard messaging API is used to interact with the broker, as the standard API is often simply a set of interfaces which the broker API must implement. The underlying data exchange varies between the brokers of different vendors.
There are several efforts to standardize the communications protocols between message brokers and their clients. This would greatly simplify connecting to and interacting with message brokers, as a single API could be used for the different message broker implementations. Until then, developers will have to deal with supporting multiple APIs and communication protocols in their solutions or use one of a number of message gateway products.
Many message brokers are capable of operating in coordination with one another, creating a distributed framework of brokers which is highly resistant to failures and spreads processing across the entire broker network. In this configuration, the messaging infrastructure provides highly reliable message delivery to all bus participants.
Broadcast
Message brokers are responsible for the delivery of messages to bus participants which are interested in messages belonging to a particular message group. When the broker delivers a message to the interested participants, it makes copies of the message and sends one copy to each participant. This mode of message delivery is known as “broadcast” as all interested participants receive a copy of the single message.
Message broadcasting operates like a two-way radio. The sender of a message transmits on a frequency and everyone monitoring that frequency gets a copy of the message. Broadcasting is a simple and efficient way to transmit data to multiple parties at once.
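A minimal sketch of broadcast delivery follows, assuming an in-process broker that gives each participant its own inbox list. The BroadcastBroker class, its method names and the group name are hypothetical and exist only to illustrate the copy-per-participant behavior.

```python
import copy
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List

Inbox = List[Dict[str, Any]]


class BroadcastBroker:
    """Toy broker (hypothetical): every participant in a group gets its own copy."""

    def __init__(self) -> None:
        self._inboxes: DefaultDict[str, List[Inbox]] = defaultdict(list)

    def join(self, group: str) -> Inbox:
        """Join a group and return the inbox where copies will arrive."""
        inbox: Inbox = []
        self._inboxes[group].append(inbox)
        return inbox

    def send(self, group: str, message: Dict[str, Any]) -> None:
        """Place an independent copy of the message in every inbox of the group."""
        for inbox in self._inboxes.get(group, []):
            inbox.append(copy.deepcopy(message))


broker = BroadcastBroker()
dispatcher = broker.join("fleet.alerts")
recorder = broker.join("fleet.alerts")

broker.send("fleet.alerts", {"text": "road closed"})

# Each participant holds its own copy, so a local change affects no one else.
dispatcher[0]["text"] = "edited locally"
```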
Queued
Queued delivery of a message is very similar to broadcasting in that a message producer sends a message and the message broker makes the message available to all the participants interested in messages in that group. The difference is that the broker only lets one of the participants interested in the message retrieve a copy. If no one retrieves a message, the broker holds the message until it is retrieved by a message consumer. In effect, the broker places the message in a queue where the first messages to be placed in the queue are the first ones to be retrieved by participants.
It is the queued delivery mechanism which makes systems based on a message bus so resilient to faults and so horizontally scalable. When multiple message consumers operate on a message queue, the message sender need only send the message to a message group with queued delivery. Even if one of the message consumers is busy or has failed, the other consumers will have the chance to retrieve and process the message.
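Queued delivery can be sketched in the same toy style. The QueueBroker class, its send and retrieve methods and the group name are hypothetical. The sketch assumes a simple first-in, first-out queue per group and simulates one consumer failing after taking a single message.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class QueueBroker:
    """Toy broker (hypothetical): each message goes to exactly one consumer, in FIFO order."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = {}

    def send(self, group: str, message: str) -> None:
        """Hold the message until some consumer retrieves it."""
        self._queues.setdefault(group, deque()).append(message)

    def retrieve(self, group: str) -> Optional[str]:
        """Hand the oldest waiting message to the caller, or None if the queue is empty."""
        queue = self._queues.get(group)
        return queue.popleft() if queue else None


GROUP = "sales.order.process"
broker = QueueBroker()
for request in ("req-1", "req-2", "req-3"):
    broker.send(GROUP, request)

# Consumer A takes one message and then stops responding.
consumer_a: List[Optional[str]] = [broker.retrieve(GROUP)]

# Consumer B keeps working and picks up everything that is still queued.
consumer_b: List[str] = []
message = broker.retrieve(GROUP)
while message is not None:
    consumer_b.append(message)
    message = broker.retrieve(GROUP)
```

The sender did nothing differently when consumer A dropped out. The remaining requests stayed in the queue until consumer B retrieved them.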
In service-oriented architectures, services are implemented in message consumers listening to message queues which contain service requests. Each queue represents a service, and invoking a service is a matter of placing a request message in the appropriate queue.
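The idea of a queue standing in for a service can be sketched as below. The queue names, the reply_to field and the invoke and serve_one functions are hypothetical, and the service itself (upper-casing a string) is a placeholder for real business logic.

```python
from collections import deque
from typing import Any, Deque, Dict

# One queue represents the service; a second one receives this client's replies.
queues: Dict[str, Deque[Dict[str, Any]]] = {
    "service.shout.request": deque(),
    "client.42.reply": deque(),
}


def invoke(service_queue: str, reply_to: str, text: str) -> None:
    """Client side: invoking the service means placing a request in its queue."""
    queues[service_queue].append({"reply_to": reply_to, "text": text})


def serve_one(service_queue: str) -> bool:
    """Service side: take the oldest request, process it, and send back the result."""
    if not queues[service_queue]:
        return False
    request = queues[service_queue].popleft()
    result = request["text"].upper()  # placeholder for real business logic
    queues[request["reply_to"]].append({"result": result})
    return True


invoke("service.shout.request", "client.42.reply", "hello")
handled = serve_one("service.shout.request")
response = queues["client.42.reply"].popleft()
```

The client here acts as both a producer of request messages and a consumer of response messages.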
Summary
Many concepts are unique to message bus infrastructures, and they can seem unfamiliar because the industry is used to dealing with point-to-point streaming protocols like TCP and higher-level protocols like HTTP. Message- and packet-based protocols like UDP, and particularly those which implement message bus infrastructures, have compelling advantages over streaming protocols and are finding their way into more data processing environments.