[rabbitmq-discuss] Extremely uneven distribution of latencies under high load

Eugene Kirpichov ekirpichov at gmail.com
Thu Aug 4 11:14:28 BST 2011


Hi Matthias,

2011/8/4 Matthias Radestock <matthias at rabbitmq.com>:
> Eugene,
>
> On 01/08/11 12:26, Eugene Kirpichov wrote:
>>
>> I'm publishing about 6000 messages/s, each about 10 KB in size,
>> round-robin to these 4 nodes. I have publisher confirms turned on.
>> Autoack is turned off; I have a cycle of "get, process, publish
>> result, ack". Each message takes ~800ms to process.
>
> Where in the cycle do you wait for / process the publisher confirm?
>
> Also, if you have a publishing rate of 6000Hz, but a consuming rate of 1/0.8
> = 1.25Hz, then you'll need thousands of consumers to keep up. Otherwise the
> number of messages in / disk usage of rabbit will just keep growing.
>
I do have thousands of consumers :)
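
For reference, here is the back-of-envelope arithmetic behind that, as a
rough sketch. The function name is hypothetical, and it assumes each
consumer processes one message at a time with no idle gaps.

```python
import math

def consumers_needed(publish_rate_hz: float, processing_time_s: float) -> int:
    """Minimum number of one-at-a-time consumers needed to keep up.

    Each consumer handles 1 / processing_time_s messages per second, so
    the required count is publish_rate_hz * processing_time_s, rounded up.
    """
    # Round first so floating-point noise cannot push ceil() up by one.
    return math.ceil(round(publish_rate_hz * processing_time_s, 6))

# 6000 messages/s at ~800 ms per message
print(consumers_needed(6000, 0.8))  # 4800
```

Any fewer than that and the queues keep growing, as Matthias says.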

>> I'm experiencing things like "rabbitmq not giving a message to a
>> consumer for several minutes", i.e. a large portion of my cluster is
>> idle waiting for messages.
>
> I suggest you use the management plug-in to get a better insight into
> what rabbit is doing. Also look at the rabbit logs for anything unusual.
>
>> RabbitMQ is also rather frequently dropping connections but I'm
>> reconnecting.
>
> RabbitMQ will only drop connections for the following reasons:
>
> - an error occurred, in which case an error message is sent to the client
> (which you should see there and in the rabbit logs) and a connection
> shutdown handshake sequence is initiated
>
> - heartbeats are enabled and a client didn't send anything (including
> heartbeat frames) for a while; there should be a message in the logs for
> when that happens
>
> Any other connection drops will be due to the client, network,
> firewalls, etc.
>
Thanks, this is useful information.

>> I also graphed the publish confirmations from RabbitMQ (actually the
>> number of unack'd messages remaining)
>
> Those two are completely different things. Confirms relate to messages
> published by the client to the broker, acks relate to messages delivered
> by the broker to clients.
I did not express myself clearly enough: by "unack'd" I mean the
number of messages that I have sent but have not yet received a
publisher confirm for.
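
To make the quantity concrete, here is a minimal sketch of how such a
counter can be kept on the publishing side. The class and method names
are hypothetical and it is not tied to any particular client library.
It relies only on the confirm semantics: messages on a channel in
confirm mode are numbered from 1, and a confirm carries a delivery tag
plus a "multiple" flag that covers all tags up to and including it.

```python
class ConfirmTracker:
    """Tracks published messages that the broker has not confirmed yet."""

    def __init__(self) -> None:
        self._next_seq = 1      # confirm sequence numbers start at 1
        self._pending = set()   # sequence numbers awaiting a confirm

    def on_publish(self) -> int:
        """Call once per published message; returns its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending.add(seq)
        return seq

    def on_confirm(self, delivery_tag: int, multiple: bool = False) -> None:
        """Call for each confirm received from the broker."""
        if multiple:
            # Everything up to and including delivery_tag is confirmed.
            self._pending = {s for s in self._pending if s > delivery_tag}
        else:
            self._pending.discard(delivery_tag)

    @property
    def outstanding(self) -> int:
        """Number of messages sent but not yet confirmed."""
        return len(self._pending)
```

Sampling `outstanding` periodically gives the kind of graph I described.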

By the way, my consumers had logic that tried to connect to another
queue whenever the current one had no messages.
Once I disabled this logic, the behavior described above disappeared
almost immediately.

So it looks like rabbit was in fact having a hard time handling all the reconnects.

>
>
> Regards,
>
> Matthias.
>



-- 
Eugene Kirpichov
Principal Engineer, Mirantis Inc. http://www.mirantis.com/
Editor, http://fprog.ru/

