[rabbitmq-discuss] Rabbit slowing down on accepting messages

Wed Mar 19 17:51:14 GMT 2014

The queues are durable/mirrored; there are ~150 of them involved per
message.

The "140K msgs/hr" value is the rate at which a producer places messages on
the inbound queue. The application (consumers of this queue) processes the
messages using an asynchronous state machine that uses AMQP as a means of
pushing data to the next state of the processor (publishes/consumes). For
each state there are 2-3 queues such as process, retry, and error; there
are many states per message type. There are 6 application instances
connecting to the Rabbit cluster to process these messages. As a message
flows through the various states its metadata will flow through these
various queues until the processing activity is completed. We use
MassTransit and the .NET C# Rabbit driver on the application side, which
implement this behavior. I describe this to be as complete as possible, as
we're not testing a single produce/consume type of scenario. It's more
complex: there are several processing states, each publishing/consuming
to many different exchanges/queues.
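
As a rough back-of-the-envelope sketch of what this fan-out could mean for
the broker: the figures below come from the description above, but the
one-publish-per-queue multiplier is an assumption, and most likely an upper
bound, since retry and error queues are presumably only touched on failure.

```python
# Back-of-the-envelope load estimate. Inputs come from the message above;
# the one-publish-per-queue multiplier is an assumption, not a measurement.
INBOUND_PER_HOUR = 140_000   # "140K msgs/hr" placed on the inbound queue
QUEUES_PER_MESSAGE = 150     # "~150 of them involved per message"


def per_second(per_hour):
    """Convert an hourly rate to a per-second rate."""
    return per_hour / 3600.0


inbound_rate = per_second(INBOUND_PER_HOUR)

# Assumption: every queue a message passes through costs one publish
# (and one matching consume) on the broker.
internal_rate = inbound_rate * QUEUES_PER_MESSAGE

print(f"inbound:  {inbound_rate:.1f} msg/s")
print(f"internal: {internal_rate:.0f} publishes/s (assumed upper bound)")
```

Under that assumption the inbound rate is about 39 msg/s, but the broker
could be handling on the order of 5,800 publishes/s internally, each of
them durable and mirrored.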

CPU looked fine and disk is fine, with no latency observed; we're on a SAN
capable of 20K IOPS peak and we're not anywhere close to that.

Publishing is done in the MassTransit implementation using the .NET C#
driver; it publishes, then spawns another thread to wait for the ACK. So
the main processing keeps going, and if there is an exceptional return it
is handled elsewhere, asynchronously.
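
To make the publish-then-confirm-later pattern concrete, here is a minimal
sketch in plain Python with no broker involved. It is not MassTransit's or
the .NET driver's implementation; ConfirmTracker and its methods are
hypothetical names used only for illustration. It relies on the AMQP
confirm semantics of per-channel delivery tags starting at 1 and the
"multiple" flag on acks/nacks.

```python
class ConfirmTracker:
    """Tracks publishes awaiting a broker confirm (hypothetical helper)."""

    def __init__(self, on_nack):
        self._next_tag = 1      # confirm delivery tags start at 1 per channel
        self._pending = {}      # delivery tag -> message still unconfirmed
        self._on_nack = on_nack

    def publish(self, message):
        """Record a message as in flight and return its tag without blocking."""
        tag = self._next_tag
        self._next_tag += 1
        self._pending[tag] = message
        return tag

    def _take(self, tag, multiple):
        """Remove and return the pending entries covered by an ack/nack."""
        tags = [t for t in self._pending if t <= tag] if multiple else [tag]
        return [(t, self._pending.pop(t))
                for t in sorted(tags) if t in self._pending]

    def ack(self, tag, multiple=False):
        """Broker confirmed one tag, or every tag up to it if multiple."""
        self._take(tag, multiple)

    def nack(self, tag, multiple=False):
        """Broker rejected the message(s); hand them to the error handler."""
        for t, message in self._take(tag, multiple):
            self._on_nack(t, message)

    @property
    def outstanding(self):
        return len(self._pending)


failed = []
tracker = ConfirmTracker(on_nack=lambda tag, msg: failed.append((tag, msg)))
for body in ["a", "b", "c", "d"]:
    tracker.publish(body)       # publisher keeps going, no wait per message
tracker.ack(2, multiple=True)   # broker confirms tags 1 and 2 in one frame
tracker.nack(3)                 # broker rejects tag 3
print(tracker.outstanding, failed)
```

The point of the sketch is that publish() never waits, so throughput is
not capped at one message per broker round trip; only the bookkeeping of
outstanding tags grows while confirms are pending.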

I did confirm that the entire Erlang process was stopped and restarted on
the Rabbit nodes; we will recreate the issue and try just stopping the app
and restarting. The VMs were not restarted, nor were they migrated between
test runs.

Cheers,

Ron

On Wed, Mar 19, 2014 at 10:24 AM, Simon MacMullen <simon at rabbitmq.com> wrote:

> On 19/03/14 16:27, Ron Cordell wrote:
>
>> Any suggestions on places to look to see what the underlying issue might
>> be?
>>
>
> The good news is that performance bottlenecks will get easier to diagnose
> in 3.3.0. The bad news is it's not out yet.
>
> 140kmsg/h is only 40msg/s - so you should not have any difficulty hitting
> that even on modest hardware. I assume the messages were persistent as well
> as mirrored, but unless the messages were both very large and persistent
> you should not have a problem there.
>
> Was anything on the broker looking busy (CPU, disk?) Did any of the
> connections show a status of "flow"?
>
> If the answer to those questions is "no" then could you be publishing
> (effectively) synchronously? Do you use mandatory publishing, publish
> inside transactions, or use confirms in a non-streaming way (i.e. publish,
> wait for confirm, repeat)?
>
> Cheers, Simon
>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>