[rabbitmq-discuss] hanging on message sending with py-amqplib

tsuraan tsuraan at gmail.com
Sat Oct 2 17:15:35 BST 2010


> The output of list_connections show many connections in the 'blocked' state,
> which means the broker is refusing to accept any more data.
>
> That should only happen if the memory alarm is currently active. What does
> the log file say about that? Perhaps you could post it.

Yeah, it is in an alarm state.  The alarm goes up and down like mad
for a minute or so, and then everything settles down with the clients
hung.  The logfile is attached.

> The output of list_queues shows a number of queues with messages waiting for
> acknowledgement. One queue in particular has a high count. What's the
> average message size of messages in that queue?

It's around a hundred bytes to a few hundred bytes per message.  Is it
strange that the largest of the queues is using ~12MB and the second
largest is using ~3MB, yet we're hitting our 200MB high water mark?  Am
I reading that wrong, or does there seem to be a lot of memory in use
that isn't in the queued messages?  Could it be one of the broker's
internal tracking structures, and solvable with the toke plugin?  The
highest unack'd count is 54,468, which seems high, but not insanely so.
The other strange thing is that the process slurping from that queue
should have a prefetch of 1,000 messages.  I'll have to see what went
wrong there, but I'm guessing it's running the wrong code.
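
For reference, setting the prefetch with py-amqplib should look roughly
like this (a sketch; the connection parameters, queue name and handle()
function are placeholders):

    from amqplib import client_0_8 as amqp

    conn = amqp.Connection(host='localhost:5672', userid='guest', password='guest')
    chan = conn.channel()

    # Bound the unacknowledged backlog: the broker stops delivering to this
    # channel once 1,000 messages are outstanding.  a_global=False scopes
    # the limit to this channel rather than the whole connection.
    chan.basic_qos(prefetch_size=0, prefetch_count=1000, a_global=False)

    def on_message(msg):
        handle(msg.body)  # placeholder for the real application handler
        # Acking frees a slot in the prefetch window.
        chan.basic_ack(msg.delivery_info['delivery_tag'])

    chan.basic_consume(queue='work_queue', callback=on_message)

    while True:
        chan.wait()  # blocks until a delivery arrives and runs the callback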

> The messages waiting for acks also show up in list_channels.
>
> Tracing from the queue through the consumers, channels and connections
> to find the connection which is consuming from the high-ack queue, we
> discover that connection to be blocked.
>
>
> So I think what is happening here is that you have a client that is both a
> consumer and producer. At some point the memory alarm is raised. Because the
> client has been identified as a producer the server blocks the inbound
> socket data stream, thus pausing the client in its tracks. But messages
> still get delivered to the client. And they all pile up in memory, in the
> server's channel process. That is preventing the server from ever
> recovering, clearing the memory alarm and unblocking the connections.

If the client isn't also a producer (at least on that channel), but is
in transactional mode, will the high water mark prevent tx.commit
messages from being handled?
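
For context, the transactional publishing path we use looks roughly like
this (a sketch; the connection parameters, exchange and routing key are
placeholders); my concern is that the tx.commit frame travels on the same
inbound socket that the broker blocks when the alarm fires:

    from amqplib import client_0_8 as amqp

    conn = amqp.Connection(host='localhost:5672', userid='guest', password='guest')
    chan = conn.channel()

    # Put the channel into transactional mode.
    chan.tx_select()

    msg = amqp.Message('payload bytes', delivery_mode=2)  # persistent message
    chan.basic_publish(msg, exchange='my_exchange', routing_key='my_key')

    # tx.commit goes out on the same TCP connection the broker blocks when
    # the memory alarm is raised, so presumably it would stall there too.
    chan.tx_commit()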

> I reckon one way to prevent this situation is to set a qos.prefetch_count,
> thus bounding the number of messages which are waiting for acknowledgement.
> Or, if you can consume in no-ack mode that would be even better.

The prefetch_count should be getting set, but it doesn't appear to be.
I'll look into that.  The original machine that was having this problem
was hitting it after only ~300 messages had been inserted into the
queues, and it had a high water mark of 800MB.  I really wish I still
had access to that machine for testing, but unfortunately that exact
situation doesn't seem to be repeatable.
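
If I can switch that consumer to no-ack mode as you suggest, it would
look something like this with py-amqplib (a sketch; the queue name and
process() function are placeholders):

    from amqplib import client_0_8 as amqp

    def on_message(msg):
        # In no-ack mode the broker treats the message as delivered the
        # moment it is sent, so nothing accumulates in the channel's
        # unacked list.
        process(msg.body)  # placeholder for the real application handler

    conn = amqp.Connection(host='localhost:5672', userid='guest', password='guest')
    chan = conn.channel()
    chan.basic_consume(queue='work_queue', callback=on_message, no_ack=True)

    while True:
        chan.wait()
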
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mqueue at master.0.log.gz
Type: application/x-gzip
Size: 4497 bytes
Desc: not available
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20101002/7efe6d33/attachment.bin>
