[rabbitmq-discuss] Confusing disk free space limit warning

Mon Sep 17 10:41:25 BST 2012

Mark,

On 17/09/12 08:57, Mark Hingston wrote:
> Below is the excerpt from the logs that shows what happened before the
> 'clear'.
>
> =INFO REPORT==== 16-Sep-2012::06:49:06 ===
> Disk free space limit now exceeded. Free bytes:888758272 Limit:1000000000
>
> =ERROR REPORT==== 16-Sep-2012::06:49:06 ===
> ** gen_event handler rabbit_alarm crashed.
> ** Was installed in alarm_handler
> ** Last event was: {set_alarm,{{resource_limit,disk,rabbit@mq1},[]}}
> ** When handler state == {alarms,
>                            {dict,6,16,16,8,80,48,
> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
>                             {{[],
>                               [[<0.1053.112>|
> {rabbit_reader,conserve_resources,[]}]],
>                               [],[],[],[],[],[],[],
>                               [[<0.1045.112>|
> {rabbit_reader,conserve_resources,[]}]],
>                               [[<0.106.101>|
> {rabbit_stomp_reader,conserve_resources,[]}]],
>                               [[<0.2522.112>|
> {rabbit_reader,conserve_resources,[]}]],
>                               [[<0.18285.113>|
> {rabbit_reader,conserve_resources,[]}]],
>                               [],[],
>                               [[<0.15164.198>|
> {rabbit_reader,conserve_resources,[]}]]}}},
>                            {dict,0,16,16,8,80,48,
> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
>                               []}}}}
> ** Reason == {'function not exported',
>                   [{rabbit_stomp_reader,conserve_resources,
>                        [<0.106.101>,disk,true]},
>                    {dict,fold_bucket,3},
>                    {dict,fold_seg,4},
>                    {dict,fold_segs,4},
>                    {rabbit_alarm,maybe_alert,4},
>                    {rabbit_alarm,handle_event,2},
>                    {gen_event,server_update,4},
>                    {gen_event,server_notify,4}]}

Ah. That's a bug in the stomp plug-in. It's been around since 2.8.3. 
Will fix.

The upshot is that from then onwards alarm handling is broken, which 
explains why the subsequent clearing of the disk alarm wasn't unblocking 
producers for you.
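
To make the failure mode concrete, here is a toy Python model of it. This is an analogy only, not RabbitMQ's Erlang code: the class names, callback names and connection ids are all made up for illustration. What it borrows from the log above is that the alarm handler invoked a per-connection callback with three arguments, one callback could not be called that way, and the handler crashed. In Erlang a crashed gen_event handler is removed from its event manager, which the sketch mimics with an exception.

```python
class EventManager:
    """Toy event manager: a handler that raises is dropped for good."""

    def __init__(self):
        self.handlers = []

    def notify(self, event):
        for handler in list(self.handlers):
            try:
                handler.handle_event(event)
            except Exception:
                # The crashed handler is removed; later events never reach it.
                self.handlers.remove(handler)


class AlarmHandler:
    """Toy alarm handler that tells each registered connection to (un)block."""

    def __init__(self):
        self.callbacks = []  # (connection id, callback) pairs

    def handle_event(self, event):
        kind, source = event
        alert = (kind == "set_alarm")
        for conn_id, callback in self.callbacks:
            # Every callback is invoked with three arguments.
            callback(conn_id, source, alert)


def simulate():
    blocked = {}

    def amqp_callback(conn_id, source, alert):
        blocked[conn_id] = alert

    def stomp_callback(conn_id, alert):
        # Hypothetical stale two-argument callback: calling it with three
        # arguments raises TypeError, standing in for 'function not exported'.
        blocked[conn_id] = alert

    alarms = AlarmHandler()
    alarms.callbacks.append(("amqp-1", amqp_callback))
    alarms.callbacks.append(("stomp-1", stomp_callback))

    manager = EventManager()
    manager.handlers.append(alarms)

    manager.notify(("set_alarm", "disk"))    # blocks amqp-1, then crashes
    manager.notify(("clear_alarm", "disk"))  # no handler left to unblock
    return blocked, len(manager.handlers)


blocked, remaining_handlers = simulate()
print(blocked, remaining_handlers)
```

In this model the AMQP connection is still marked as blocked after the alarm clears, because the only handler that could have unblocked it is gone.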

>> Throttling/blocking affects producers (only). Messages published by a
>> blocked producer will end up in various buffers at the client /
>> network / server and be lost when the server is restarted. That's just
>> normal TCP/IP behaviour.
>>
> Given what you've said, I'm trying to understand how to best handle this
> situation from the client. There's an existing bug relating to this
> issue here: https://github.com/celery/kombu/issues/136. However, I don't
> understand what (if any) indication the amqp client receives that it is
> being 'blocked' / rate limited by the rabbitmq server. Can you shed any
> light on that?

There is nothing unusual about "rate limiting" / throttling. Flow control 
is an integral part of TCP/IP - *all* TCP/IP connections are flow 
controlled. It's how TCP/IP deals with congested networks and slow servers.

The fact that the server is "slow" because it encountered an alarm 
condition is neither here nor there - as far as the client is concerned 
it's just talking to a slow server.

So your question really comes down to how would you expect a client to 
detect and deal with a slow server / congested network.
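
One generic way a client could notice a stalled peer is to put a timeout on its socket writes. The sketch below uses only Python's standard socket module and no AMQP client library; the function name publish_with_timeout, the 0.2 second timeout and the payload size are assumptions chosen for illustration, and the "slow server" is simulated by a local peer that never reads.

```python
import socket


def publish_with_timeout(sock, payload, timeout_s):
    """Write payload to sock; return (bytes_written, stalled).

    stalled is True if a single send() made no progress within timeout_s,
    which is what a client sees when the peer has stopped reading.
    """
    sock.settimeout(timeout_s)
    view = memoryview(payload)
    sent = 0
    while sent < len(payload):
        try:
            sent += sock.send(view[sent:])
        except socket.timeout:
            return sent, True
    return sent, False


# Simulated slow server: a connected peer that never calls recv().
client, server = socket.socketpair()
payload = b"x" * (32 * 1024 * 1024)  # much larger than typical socket buffers
sent, stalled = publish_with_timeout(client, payload, timeout_s=0.2)
print("bytes handed to the kernel:", sent, "stalled:", stalled)
client.close()
server.close()
```

Note that the bytes counted as written have only been accepted into local buffers; nothing here shows that the peer has processed them, which is why data sitting in those buffers can be lost on a restart.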

Regards,

Matthias.