[rabbitmq-discuss] [Minimum Air Induction] Introducing Shovel: An AMQP Relay

Sat Sep 20 19:17:52 BST 2008

Valentino,

On Sat, Sep 20, 2008 at 5:45 PM, Valentino Volonghi <dialtone at gmail.com> wrote:
> A transaction per message is indeed too expensive, I've tried this and it's
> too slow.

What difference does it make when you do multiple publishes within the same TX?

> One thing that I wasn't able to understand is what does a transaction
> give me in rabbitmq?

It provides an atomic barrier for sending messages: when you get the
tx.commit-ok back from the broker, you know that the messages you have
sent have been routed to the queues that the routing key matches.
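
To make the cost question concrete, here is a minimal Python sketch of
publishing several messages per transaction instead of one. It assumes a
channel object that exposes tx_select(), basic_publish() and tx_commit(),
modelled on the blocking channel of the pika client. The helper name
publish_in_batches, the default batch size, and the exchange and routing
key passed in are illustrative and not something from this thread.

```python
from itertools import islice


def publish_in_batches(channel, exchange, routing_key, bodies, batch_size=100):
    """Publish `bodies` using one commit per `batch_size` messages.

    `channel` is assumed to offer tx_select/basic_publish/tx_commit
    (as pika's BlockingChannel does). Returns the number of commits issued.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    channel.tx_select()  # switch the channel into transactional mode
    commits = 0
    pending = iter(bodies)
    while True:
        batch = list(islice(pending, batch_size))
        if not batch:
            break
        for body in batch:
            channel.basic_publish(
                exchange=exchange, routing_key=routing_key, body=body
            )
        # One round trip to the broker covers the whole batch.
        channel.tx_commit()
        commits += 1
    return commits
```

With batch_size=1 this degenerates to a transaction per message. Larger
batches spread the commit round trip over more publishes, at the price of
having to resend a whole batch if the commit never comes back.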

> In the event of rabbitmq crashing I would like the whole thing to crash so
> that I'm sure that there won't be lines generated without being also
> handled. This is the embedded rabbitmq of course.

IIRC you can have the OTP application do this for you, provided
Mochiweb is packaged as an OTP application. Rabbit and Shovel are both
OTP apps themselves.

>> And if you're using an embedded RabbitMQ instance, how is the Shovel
>> application supposed to failover to other Rabbit nodes?
>
> What do you mean? I suppose shovel would have a list of backup rabbitmq
> nodes and would use them in the event the main one dies.

AFAIK we haven't looked into a clustered scenario using the direct
driver for the Erlang client - I suppose that it is quite possible
though.

>> Maybe you also want to compress stuff if you're sending it over a WAN.
>
> Yes, one thing that I was thinking is to just gzip the body of the message
> myself before sending it, but I haven't looked into rabbitmq to see if it
> already supports this feature.

Not really. Rabbit treats the payload as an opaque object.
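
Since the broker never looks inside the body, compression has to happen in
the publisher and be undone in the consumer. A minimal sketch using
Python's standard gzip module follows. The function names pack and unpack
are made up for illustration, and how the consumer learns that a body is
compressed (for example via the AMQP content-encoding property) is left as
an assumption.

```python
import gzip


def pack(body: bytes) -> bytes:
    """Compress a message body before handing it to the publisher."""
    return gzip.compress(body)


def unpack(payload: bytes) -> bytes:
    """Restore the original body on the consumer side."""
    return gzip.decompress(payload)


if __name__ == "__main__":
    # A repetitive log line, the kind of payload that compresses well.
    line = b"GET /index.html HTTP/1.1 200 " * 50
    wire = pack(line)
    print(len(line), "bytes before,", len(wire), "bytes on the wire")
    assert unpack(wire) == line
```

Very small bodies can come out larger after gzip because of the header
overhead, so it is worth measuring on real messages first.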

> I suppose a solution that would avoid part of this problem is to remove
> the middleman rabbitmq. There is one question I have about AMQP though: a
> durable queue and exchange don't persist messages by themselves, right?
> What happens if a persistent message enters a durable exchange without
> queues?

Rabbit will nuke it. Logging the message to disk is done by the queue
process, so if nothing gets routed, nothing gets persisted.

> I suppose my tests weren't too accurate then... is a persistent message
> much slower than a non-persistent one?

Yep. Don't know what the exact factor is though.

> Because I obtained wonderful numbers from messages not explicitly marked
> as being persistent, like 8000 messages per second, with the bottleneck
> being in the saturated network, on the write side of the connection and
> about 3-4K messages per second on the read side with the bottleneck being
> the python client most probably. So would these numbers confirm themselves
> pretty much or are they simply completely wrong?

It really depends how you set things up, but those numbers do look OK.

I think that you need to work out what ingress requirements you have
(this will be determined by the capacity of the http server) and what
egress you need (so as to avoid stuff queuing up too much).

Remember that ATM Rabbit does not implement QoS, so your egress will
be bound by the slowest consumer.
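
As a rough way to reason about ingress versus egress, the backlog in a
queue grows by the difference between the two rates. The sketch below
assumes constant rates, and the figures in the usage example are
illustrative rather than measured. The function name backlog_after is
hypothetical.

```python
def backlog_after(seconds, ingress_per_s, egress_per_s, start=0):
    """Messages sitting in the queue after `seconds` at constant rates.

    The queue cannot go below empty, so the result is clamped at zero.
    """
    growth = (ingress_per_s - egress_per_s) * seconds
    return max(0, start + growth)


if __name__ == "__main__":
    # 3000 msg/s coming in, slowest consumer draining 2500 msg/s:
    # the queue gains 500 messages every second.
    print(backlog_after(60, 3000, 2500))  # 30000
```

With memory-bound queues, that growth is what eventually exhausts the
broker, which is why the consumer side has to keep up with the producer.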

> At least I need about 2500-3000 requests per second because, given the
> constraint with memory bound queues, the component should be as fast as
> the webserver otherwise the messages start to pile up.

Sounds sensible. At some stage we will get around to landing QoS and
queue overflowing.

Ben