Any links to Service Broker best practices? - sql-server-2008

I am looking for any authoritative articles on Service Broker best practices.
In particular, I am looking for the following topics (I know the answers, but have to find documents that support the knowledge):
queues in the same database
message size
systems where the message is just a pointer and data is retrieved from tables
instrumentation - auditing Service Broker applications
TIA

systems where the message is just a pointer and data is retrieved from tables
This is not a Service Broker application; it is just a queueing application. Service Broker was designed primarily for distributed applications, and communication (networking, security, routing, retries) is a major component. If you only send messages as pointers and the data is in tables, the distributed nature of SSB falls apart. The litmus test is "can I move my service onto another server and the application continues to work after I fix the routing?". If the answer is yes, then you're using SSB the way it was designed. If it is no, it means you're only interested in queues.
The problem with using SSB as a 'dumb queue' is that it is a very expensive queue (just think of the extra writes required on each message due to conversations and conversation groups). The RECEIVE statement is expensive and basically a black box from an optimization point of view. You could optimize a table used as a queue a lot better than you can an SSB service/queue. I recognize that SSB has an ace up its sleeve that makes it attractive even when used as a local queue, namely its internal activation capabilities. One may say that activation cannot be replaced with anything else (I agree, it cannot), but one must be aware of the cost and balance the pros and cons.
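To make the "table used as a queue" alternative concrete, here is a minimal sketch. It uses SQLite through Python's standard library purely as a stand-in for SQL Server, and the table name `work_queue` and the helper functions are hypothetical. It is single-connection only; with concurrent consumers the dequeue step would need proper locking.

```python
import sqlite3

def create_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS work_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL)"
    )

def enqueue(conn, payload):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO work_queue (payload) VALUES (?)", (payload,))

def dequeue(conn):
    """Remove and return the oldest payload, or None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM work_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM work_queue WHERE id = ?", (row[0],))
        return row[1]

conn = sqlite3.connect(":memory:")
create_queue(conn)
# Pointer-style messages: the payload only names a row that lives elsewhere.
enqueue(conn, "order:17")
enqueue(conn, "order:18")
```

Note that the payloads are exactly the "pointer" messages discussed above, which is why this design only works while producer and consumer share the same database.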

Related

Message queuing solution for millions of topics

I'm thinking about a system that will notify multiple consumers about events happening to a population of objects. Every subscriber should be able to subscribe to events happening to zero or more of the objects, and multiple subscribers should be able to receive information about events happening to a single object.
I think that some message queuing system will be appropriate in this case, but I'm not sure how to handle the fact that I'll have millions of objects. Using a separate topic for each object does not sound good [or is it just fine?].
Can you please suggest an approach I should take, and maybe even an open source message queuing system that would be reasonable?
Few more details:
there will be thousands of subscribers [meaning not plenty of them],
subscribers will subscribe to tens or hundreds of objects each,
there will be ~5-20 million of the objects,
events themselves don't have to carry any message; just the information that the object was changed is enough,
vast majority of objects will never be subscribed to,
events occur at the maximum rate of few hundreds per second,
ideally the server should run under linux, be able to integrate with the rest of the ecosystem via http long-poll [using node js? continuations under jetty?].
Thanks in advance for your feedback and sorry for somewhat vague question!
I can highly recommend RabbitMQ. I have used it in a couple of projects before and, from my experience, I think it is very reliable and offers a wide range of configurations. Basically, RabbitMQ is an open-source (Mozilla Public License (MPL)) message broker that implements the Advanced Message Queuing Protocol (AMQP) standard.
As documented on the RabbitMQ web-site:
RabbitMQ can potentially run on any platform that Erlang supports, from embedded systems to multi-core clusters and cloud-based servers.
... meaning that an operating system like Linux is supported.
There is a library for node.js here: https://github.com/squaremo/rabbit.js
It comes with an HTTP based API for management and monitoring of the RabbitMQ server - including a command-line tool and a browser-based user-interface as well - see: http://www.rabbitmq.com/management.html.
In the projects I have been working on, I have communicated with RabbitMQ using C# and two different wrappers, EasyNetQ and Burrow.NET. Both are excellent wrappers for RabbitMQ, but I ended up preferring Burrow.NET as it is easier and more obvious to work with (it doesn't do a lot of magic under the hood) and provides good flexibility to inject loggers, serializers, etc.
I have never worked with the number of objects that you are going to work with; I have worked with thousands (not millions). However, no matter how many objects I have been playing around with, RabbitMQ has always been really stable and has never been the source of errors in the system.
So to sum up - RabbitMQ is simple to use and setup, supports AMQP, can be managed via HTTP and what I like the most - it's rock solid.
Break up the topics so that each carries a specific event type, e.g. an "Object updated" topic, an "Object deleted" topic, and so on. Clients then only need to subscribe to the finite number of event-based topics they are interested in.
Inject headers into your messages when you publish them, and put intelligence into the clients to use these headers as message selectors. For example, the client knows the list of objects it is interested in. If you identify each object by an "id", the id can be the header, and the client will use the "id" header to determine whether it is interested in the message.
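The two points above (one topic per event type, object id carried as a header, filtering done by the client) can be sketched as follows. This is an in-memory illustration only, not any particular broker's API; `EventTopics`, `Client`, and the topic name `object.updated` are hypothetical names.

```python
from collections import defaultdict

class EventTopics:
    """In-memory stand-in for a broker with one topic per event type."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic name -> callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, headers):
        # Every subscriber of the topic gets every message on it.
        for callback in self._subscribers[topic]:
            callback(headers)

class Client:
    def __init__(self, interesting_ids):
        self.interesting_ids = set(interesting_ids)
        self.seen = []

    def on_message(self, headers):
        # Client-side selector: ignore objects we did not ask about.
        if headers["id"] in self.interesting_ids:
            self.seen.append(headers["id"])

broker = EventTopics()
alice = Client({1, 42})
bob = Client({42})
broker.subscribe("object.updated", alice.on_message)
broker.subscribe("object.updated", bob.on_message)
for object_id in (1, 7, 42):
    broker.publish("object.updated", {"id": object_id})
```

The trade-off of filtering on the client is that every client still receives every event on the topic, so with thousands of subscribers a broker-side selector may be worth evaluating instead.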
Depending on your requirements, you may also want to consider guaranteed delivery, to make sure that a client will receive the message even if it goes off-line and comes back later.
The options that I would recommend off the top of my head are ActiveMQ, RabbitMQ and Redis PUB SUB (I haven't really worked with Redis pub-sub, so please do your due diligence).
Finally here are some performance benchmarks for RabbitMQ and Redis
Just saw that you only have a few hundred messages getting pushed out per second. This is not a big deal for ActiveMQ; I have been using ActiveMQ on a system that processes 240 messages per second, and it works just fine. I use a thread pool of workers to asynchronously process the messages, though. Look at a framework like Akka if you are in Java land; if not, stick with Node.js and the cool ecosystem around it.
If it has to be open source, I'd go for ActiveMQ and an application server to provide the JMS functionality for topics. It has Ajax support, so you can access the topics from your client.
So, you would use the JMS infrastructure to publish the topics for the objects, and you can create topics as you need them.
Besides, by using a Java application server you may be able to take advantage of clustering, load balancing and other high availability features (obviously depending on the selected product).
Hope that helps!!!
Since your messages are very small, you might want to consider MQTT, which is designed for small devices, although it works fine on powerful devices as well. The key consideration is the low overhead: basically a 2 byte header for a small message. You probably can't use any simple or open source MQTT server, due to your volume. You probably need a heavy duty dedicated appliance like MessageSight to handle your volume.
Some more details on your application would certainly help. Also you don't mention security at all. I assume you must have some needs in this area.
I'm not sure about your work environment, but here are my thoughts. Can you identify each object with a unique ID in your system? If so, you can have one topic per event type, e.g. one for object deletion events, one for object update events, and so on. Each topic would be published with the ID of the object whenever the corresponding event happens to that object. This limits the number of topics you need.
The second part of your problem is that different subscribers want to subscribe to different objects, so not all subscribers are interested in the events of all objects. This part of the problem falls within the scope of the message selector (filtering) mechanism provided by the messaging framework. So basically you need to work out on what basis a subscriber is interested in a particular object, and use that basis as the message filter. It could be anything: object type, object state, etc. Ultimately your system would consist of one topic for each event type, with a publisher sending event messages carrying {object-type:object-id} information. Subscribers could subscribe to any topic with a filtering criterion.
If the above solution satisfies your needs, you can use any messaging solution: ActiveMQ, WMQ, RabbitMQ.

Messaging Confusion: Pub/Sub vs Multicast vs Fan Out

I've been evaluating messaging technologies for my company but I've become very confused by the conceptual differences between a few terms:
Pub/Sub vs Multicast vs Fan Out
I am working with the following definitions:
Pub/Sub has publishers delivering a separate copy of each message to each subscriber, which means that the opportunity to guarantee delivery exists.
Fan Out has a single queue pushing to all listening clients.
Multicast just spams out data and if someone is listening then fine, if not, it doesn't matter. No possibility to guarantee a client definitely gets a message.
Are these definitions right? Or is Pub/Sub the pattern and multicast, direct, fanout etc. ways to achieve the pattern?
I'm trying to work the out-of-the-box RabbitMQ definitions into our architecture but I'm just going around in circles at the moment trying to write the specs for our app.
Please could someone advise me whether I am right?
I'm confused by your choice of three terms to compare. Within RabbitMQ, Fanout and Direct are exchange types. Pub-Sub is a generic messaging pattern but not an exchange type. And you didn't even mention the third and most important exchange type, namely Topic. In fact, you can implement Fanout behavior on a Topic exchange by binding every queue with # as the wildcard binding key, so that each queue receives every message. And you can get Direct behavior on a Topic exchange by binding a queue with a key that contains no wildcards.
Pub-Sub is generally understood as a pattern in which an application publishes messages which are consumed by several subscribers.
With RabbitMQ/AMQP it is important to remember that messages are always published to exchanges. Then exchanges route to queues. And queues deliver messages to subscribers. The behavior of the exchange is important. In Topic exchanges, the routing key from the publisher is matched up to the binding key from the subscriber in order to make the routing decision. Binding keys can have wildcard patterns, which further influence the routing decision. More complicated routing can be done based on the content of message headers using a headers exchange type.
RabbitMQ doesn't guarantee delivery of messages by default, but you can get guaranteed delivery by choosing the right options (delivery mode = 2 for persistent messages) and by declaring exchanges and queues in advance of running your application so that messages are not discarded.
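As an illustration of how a topic exchange compares a routing key with a binding key, here is a small sketch of the AMQP wildcard rules: keys are dot-separated words, `*` stands for exactly one word, and `#` stands for zero or more words. The function name `topic_matches` is hypothetical, and this is not RabbitMQ's actual implementation.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches an AMQP-style topic binding_key."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p, w):
        if p == len(pattern):
            # Pattern exhausted: match only if all words were consumed too.
            return w == len(words)
        if pattern[p] == "#":
            # '#' absorbs zero words, or one word while staying active.
            return match(p + 1, w) or (w < len(words) and match(p, w + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)
```

For example, a queue bound with `object.*` would receive `object.updated` but not `object.updated.42`, while a queue bound with `#` would receive everything.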
Your definitions are pretty much correct. Note that guaranteed delivery is not limited to pub/sub only, and it can be done with fanout too. And yes, pub/sub is a very basic description which can be realized with specific methods like fanout, direct and so on.
There are more messaging patterns which you might find useful. Have a look at Enterprise Integration Patterns for more details.
From an electronic exchange point of view the term “Multicast” means “the message is placed on the wire once” and all client applications that are listening can read the message off the “wire”. Any solution that makes N copies of the message for the N clients is not multicast. In addition to examining the source code, one can also use a “sniffer” to determine how many copies of the message are sent over the wire from the messaging system. And yes, multicast messages are a form of UDP message. See: http://en.wikipedia.org/wiki/Multicast for a general description. About ten years ago, we used the messaging system from TIBCO that supported multicast. See: https://docs.tibco.com/pub/ems_openvms_c_client/8.0.0-june-2013/docs/html/tib_ems_users_guide/wwhelp/wwhimpl/common/html/wwhelp.htm#context=tib_ems_users_guide&file=EMS.5.091.htm

Choices of Message Queue?

We've been using SysV Message Queue for our distributed data processing system for over 15 years. For some reason, we want to replace it with a newer message queue mechanism. Are there any suggestions?
Requirements:
Fast response, minimizing message queue system overhead
Multiple client language library support, mainly c, c# and java
Can do some HA configuration to prevent SPOF
Have logging ability to check who sends message and who receives message
I've found Apache ActiveMQ and RabbitMQ, but it seems RabbitMQ lacks a stable C client library?
While I have not used it personally, the toolkit from 0MQ is quite impressive.
It seems to meet all of your criteria, although #4 you would have to implement yourself, but that seems straightforward.
My question back would be why you are moving away from SysV Message Queue? The "for some reason" is a disconcerting statement.
That said, there are many excellent messaging products out there, having a useful set of selection criteria is key.
I would suggest extending your requirements list a bit, then doing website bench-marking against that list. Take the top two or three only, and do some real-world project spikes (or a bake-off if you prefer the term) to give you some actual feedback on which to base your final decision.
Good Luck

Can someone explain an Enterprise Service Bus to me in non-buzzspeak?

Some of our partners are telling us that our software needs to interact with an Enterprise Service Bus. After researching this a bit, my instinct is to say that this is just buzz speak for saying that we need to have a platform-independent way to pass messages back and forth. I'm just trying to get a feel for what our partners are telling us. Am I correct in dismissing our partners' request as just trying to get our software to be more buzzword-compliant, or are they telling us something we should listen to (even if encoded in buzzspeak)?
Although ESB is based on messaging, it is not "just" messaging and not just a buzzword.
So if you start with plain old async messaging, the early networks tended to be very point-to-point. You had to wire up (i.e. configure through some admin interface) each connection and each pair of destinations and if you dared to move anything around invariably something broke. Because the connection points were wired by hand these networks never achieved high connection density. The incremental cost was too high and did not scale. There was also a lot of access control and policy embedded in the topology. The lack of connection density actually favors this approach to security, even though it inhibits flexibility.
The ESB attempts to address these issues with...
Run-time resolution of destinations/services/resources
Location transparency
Any-to-any connectivity and maximum connection density
Architected for redundancy, horizontal scalability, failover
Policy, access control, rules externalized from topology
Logical messaging network layer implemented atop the physical messaging network layer
Common namespace
So when your customer asks for ESB compatibility, they want things like the above. From an application standpoint, this also implies...
Avoiding message affinities such as requirements to process in strict sequence or to address requests only to specific nodes instead of to a generic network destination
Ability to resolve destinations dynamically at run time (i.e. add another instance of a queue and it automatically starts getting traffic, delete one and traffic routes to the remaining nodes)
Requestor and provider apps decoupled from knowing where each other "lives". Requestor makes one connection, regardless of how many services it might need to call
Authorize by policy rather than by topology
Service provider apps able to recognize and handle dupes (as per JMS spec, see "functional duplicate" due to session handling)
Ability to run multiple active instances of a service provider application
Instrument the service provider applications so you can inquire on the status of the network or perform a test without sending an actual transaction
On the other hand, if your client cannot articulate these things then they may just want to be able to check a box that says "works with the ESB."
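One of the application-level implications listed above, a service provider that recognizes and handles duplicate messages, could look roughly like this. It is a minimal sketch under the assumption that every message carries a unique id; `IdempotentHandler` is a hypothetical name, and a real service would persist the set of processed ids rather than keep it in memory.

```python
class IdempotentHandler:
    """Wraps a handler so a redelivered message is processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._processed_ids = set()  # a real service would persist this

    def handle(self, message_id, body):
        if message_id in self._processed_ids:
            return False  # duplicate: safe to acknowledge and drop
        self._handler(body)
        self._processed_ids.add(message_id)
        return True

ledger = []
service = IdempotentHandler(ledger.append)
results = [
    service.handle("msg-1", "debit 10"),
    service.handle("msg-1", "debit 10"),  # redelivery of the same message
    service.handle("msg-2", "credit 5"),
]
```

Handling duplicates this way is also what makes it safer to run multiple active instances of a provider, provided the record of processed ids is shared between them.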
I'll try & keep it buzzword free (but a buzz acronym may creep in).
When services/applications/mainframes/etc... want to integrate (so send messages to each other) you can end up with quite a mess. An ESB hides that mess inside of itself (or itselves) so that an organisation can pretend that there isn't a mess and that it has something manageable. It then wraps a whole load of features around this to make this box even more enticing to the senior people in an organisation who'll make the decision to buy such an expensive product. These people typically will want to introduce a large initiative which costs a lot of money to prove that they are 'doing something' and know how to spend large amounts of money. If this is an SOA initiative then various vendors will have told them that an ESB is required to make the vendor's vision of SOA work (typically once the number of services which they might want passes a trivial number).
So an ESB is:
A vehicle for vendors to make lots of money;
A vehicle for consultants to make lots of money;
A way for senior executives (IT Directors & the like) to show they can spend lots of money;
A box to hide a mess in;
A total PITA for a technical team to work with.
After researching this a bit, my instinct is to say that this is just buzz speak for saying that we need to have a platform-independent way to pass messages back and forth
You are correct, partly because ESB is a term that fits nicely with another buzzword, legitimate or not: governance (i.e. it helps you manage who is accessing your endpoints and report metrics; metrics, by the way, are what all the suits like to see, so that may be a contributor).
Another reason they might want a platform-neutral device is so that any services they consume are always exposed as endpoints from a central location, instead of from a specific machine resource. The ESB makes the actual physical endpoints of your services irrelevant to them, which they shouldn't care much about anyway. That lets you move services around, while they only ever consume the ESB endpoint.
Apart from a centralized repository for Discovery, an ESB also makes side by side versioning of services easier. If I had a choice and my company had the budget, we would have purchased IBM's x150 appliance :(
Thirdly, a lot of the more advanced buses, like SoftwareAG's product if I recall, are natively able to expose legacy data, such as data sitting on mainframes, as services via adapters without the need for coding.
I don't know if their intent is to leverage all the benefits an ESB provides, or as you said, make it buzzword compliant.
After researching this a bit, my instinct is to say that this is just buzz speak for saying that we need to have a platform-independent way to pass messages back and forth
That's about right. Sometimes an ESB will go a little bit further and include additional features like message delivery guarantees, confirmation/acknowledgement messages, and so on. The presence of an ESB also usually explicitly or implicitly creates a new protocol where none previously existed, which is another important consideration. (That is, some sort of standard or interface has to be set regarding the format of the messages.)
Am I correct in dismissing our partners' request as just trying to get our software to be more buzzword-compliant, or are they telling us something we should listen to (even if encoded in buzzspeak)?
You should always listen to your customers, even if it initially sounds silly. It's usually worth at least spending the effort to decide what's going on. Reading between the lines, what your partners probably mean is that they want a way for your service to integrate more easily with their own services and products.
An enterprise service bus handles the messaging between systems in a standard way. This allows you to communicate with the bus in exactly the same way across all your platforms, while the bus handles the actual translation to the individual communication mechanism needed for each specific endpoint. In other words, you write all your code to talk to the bus using a common messaging scheme, and the bus translates that common scheme into something the endpoint understands.
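A toy sketch of that idea, with callers using one common message shape and the bus translating per endpoint, might look like the following. It does not model any real ESB product; `ServiceBus`, the endpoint names, and the fixed-width "legacy" format are all hypothetical.

```python
import json

class ServiceBus:
    """Toy bus: callers send one common message shape, adapters translate."""

    def __init__(self):
        self._adapters = {}  # logical endpoint name -> translator function

    def register(self, endpoint, translator):
        self._adapters[endpoint] = translator

    def send(self, endpoint, message):
        # The caller never sees the endpoint-specific wire format.
        return self._adapters[endpoint](message)

def to_json(message):
    return json.dumps(message, sort_keys=True)

def to_fixed_width(message):
    # Hypothetical legacy format: 10-char name field, 6-digit amount.
    return f"{message['name']:<10}{message['amount']:06d}"

bus = ServiceBus()
bus.register("billing", to_json)
bus.register("mainframe", to_fixed_width)
payment = {"name": "alice", "amount": 250}
```

Here `bus.send("billing", payment)` and `bus.send("mainframe", payment)` take the same input, and only the registered adapter decides what the endpoint actually receives.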
The simplest explanation is to explain what it provides:
For many years companies acquired different platforms and technologies to achieve specific functions in their business, from Finance to HR. These systems needed to talk to each other to share data, so middleware became the glue that allowed them to connect. Before the business knew it, they were paying for support and maintenance on each of these systems and the middleware.
As needs in the business changed, departments decided to create their own custom solutions to address special needs rather than try to make the aging solutions flexible enough to meet their needs. Before they knew it, they were paying to support and maintain the legacy systems, middleware, and custom solutions.
With new laws like Sarbanes Oxley, companies need to have better information available for reporting purposes. A single view requires that they capture data from all of the systems. In addition, CIOs are now being pressured to lower costs and increase customer service. One obvious solution is to eliminate redundant systems, expensive support and maintenance contracts, and high cost legacy solutions which require specialists to support. Moving to a new platform allows for this, but there needs to be a transition. There are no turnkey solutions that can replicate what the business does.
To address the need for moving information around they go with SOA, because it allows for information access through a generic entity. If I ask for AllEmployees from the service bus, it gets them whether they come from 15 HR systems or 1. When the 15 HR systems become 1 system, the call and result do not change, just how it was done behind the scenes. The Service Bus concept standardizes the flow of information and allows IT managers to conduct transitions behind the bus with no long term effect on upstream users.

Does RabbitMq do round-robin from the exchange to the queues

I am currently evaluating message queue systems and RabbitMq seems like a good candidate, so I'm digging a little more into it.
To give a little context I'm looking to have something like one exchange load balancing the message publishing to multiple queues. I don't want to replicate the messages, so a fanout exchange is not an option.
Also the reason I'm thinking of having multiple queues vs one queue handling the round-robin w/ the consumers, is that I don't want our single point of failure to be at the queue level.
Sounds like I could add some logic on the publisher side to simulate that behavior by editing the routing key and having the appropriate bindings in place. But that's kind of a passive approach that wouldn't take the pace of the message consumption on each queue into account, potentially leading to fill up one queue if the consumer applications for that queue are dead.
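For reference, the publisher-side approach described above (rotating the routing key so that bindings spread messages over several queues) could be sketched like this. The `publish` callable stands in for whatever client library call actually sends the message, and the routing keys are hypothetical.

```python
from itertools import cycle

class RoundRobinPublisher:
    """Rotates through routing keys, one per bound queue."""

    def __init__(self, publish, routing_keys):
        self._publish = publish          # callable(routing_key, body)
        self._keys = cycle(routing_keys)

    def send(self, body):
        key = next(self._keys)
        self._publish(key, body)
        return key

sent = []
publisher = RoundRobinPublisher(
    lambda key, body: sent.append((key, body)),
    ["work.q1", "work.q2", "work.q3"],
)
for n in range(4):
    publisher.send(f"job-{n}")
```

As noted, this rotation is blind: it keeps sending to a queue even if that queue's consumers are dead.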
I was looking for a more pro-active way from the exchange entity side, that would decide where to send the next message based on each queue size or something of that nature.
I read about Alice and the available RESTful APIs but that seems kind of a heavy duty solution to implement fast routing decisions.
Does anyone know if round-robin between the exchange and the queues is feasible w/ RabbitMQ then? Thanks.
Exchanges are generally stateless in the AMQP model, though there have been some recent experiments in stateful exchanges now that there's both a system for managing RabbitMQ plugins and for providing new experimental exchange types.
There's nothing that does quite what you want, I don't think, though I'm not completely sure I understand the requirement. Aside from the single-point-of-failure point, would having a single queue with workers reading from it solve your problem? If so, then your problem reduces to configuring RabbitMQ in an HA configuration that permits you to use that solution. There are a couple of approaches to doing that: either use HALinux and a shared store to get active/passive HA with quick failover, or set up more than one parallel broker and deduplicate on the client, perhaps using redis or similar to do so.
I suggest asking your question again on the rabbitmq-discuss mailing list, where more people will be able to offer suggestions, and where the discussion can be archived for posterity.
Agree with Tony on the approach.
Here is a 'mashup' of RabbitMQ, Redis that you could use instead of rolling your own -
http://xing.github.com/beetle/
One built-in way to distribute messages from an exchange across queues, though not exactly round robin, is consistent hashing: rabbitmq_consistent_hash_exchange
How-to:
https://medium.com/#eranda/rabbitmq-x-consistent-hashing-with-wso2-esb-27479b8d1d21
The paper below explains the idea: queues are placed on a circle with a weighted distribution, and a message sent with a random routing key goes to the closest queue.
http://www8.org/w8-papers/2a-webserver/caching/paper2.html
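To illustrate the consistent hashing idea, here is a minimal hash ring sketch. It is not the plugin's implementation: the `HashRing` class, the use of SHA-256, and the "first point clockwise" rule are assumptions made for the example. Each queue owns a number of points on the ring equal to its weight, and a routing key goes to the queue owning the next point after the key's own position.

```python
import bisect
import hashlib

def _point(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, weights):
        # weights: queue name -> number of points that queue owns on the ring
        self._ring = sorted(
            (_point(f"{queue}#{i}"), queue)
            for queue, weight in weights.items()
            for i in range(weight)
        )
        self._points = [point for point, _ in self._ring]

    def queue_for(self, routing_key):
        # First point clockwise from the key's position, wrapping around.
        index = bisect.bisect_right(self._points, _point(routing_key))
        return self._ring[index % len(self._ring)][1]

ring = HashRing({"q1": 300, "q2": 100})
```

With random routing keys, a queue with a higher weight should receive a proportionally larger share of messages, and the same routing key always maps to the same queue.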