[rabbitmq-discuss] Weird Crash (91MB message over STOMP) [Reproducible]

Darien Kindlund darien at kindlund.com
Sat Aug 8 14:29:07 BST 2009

Hi Matthias,

Okay, I'll try to develop a compact version of the Perl test code
that I've got. If, for some reason, the message content actually
matters, then you'll have to let me know the best way for me to send
you a 91MB test file. If you guys have no preferred method, then I'll
use my company's default method (not email).

You'll have the test code by Monday at the latest.

-- Darien

On Aug 8, 2009, at 5:32 AM, Matthias Radestock <matthias at lshift.net> wrote:

> Darien,
> Darien Kindlund wrote:
>> Actually, this problem is a bit worse.... apparently, when RabbitMQ
>> restarts and recovers the persister -- even persistent messages marked
>> 'ready' on OTHER durable queues are NOT retrievable by other STOMP
>> clients.... I get the same type of error in the rabbit.log:
>> =ERROR REPORT==== 8-Aug-2009::04:21:16 ===
>> STOMP Reply command unhandled: {'basic.deliver',
>>                                   <<"Q_1.manager.workers">>,
>>                                   1,
>>                                   false,
>>                                   <<"events">>,
>>                                   <<"1.job.create.job.urls.job_alerts">>}
> Right. I think I know what the problem is, and it is indeed a bug in  
> the STOMP adapter which causes it to barf when attempting to deliver  
> any message that was recovered from the persister.
>> The unit test case for this would be:
>> 1) Create a durable exchange
>> 2) Create a durable queue
>> 3) Bind the queue to the exchange
>> 4) Make sure the queue has no consumers subscribed
>> 5) Send a small (normal) persistent message to the queue
>> 6) Crash RabbitMQ by sending a large message to a different,
>>    unrelated queue
>> 7) Kill epmd
>> 8) Restart RabbitMQ
>> 9) Verify the (normal) message still exists via rabbitmqctl
>> 10) Start STOMP consumer and attempt to subscribe to the queue
>> 11) STOMP consumer waits, RabbitMQ generates the log message, but no
>> persistent (normal) message gets delivered
> You should be able to skip steps 6 and 7, i.e. just bounce rabbit  
> normally, and still see the problem.
>> I see the 'unacknowledged messages' after I start up the STOMP
>> clients.
> *phew*. That is much more plausible, and means there is unlikely to  
> be a bug in the rabbit core.
>> So, I'm thinking the order of operations is:
>> 1) Unacknowledged messages exist on the queue
>> 2) RabbitMQ dies
>> 3) RabbitMQ starts up
>> 4) Recovery mode starts, marks all un-ack'd messages as 'ready'
>> 5) STOMP clients connect
>> 6) RabbitMQ generates the STOMP error
>> 7) I check the rabbitmqctl output, and see that there are un-ack'd
>>    messages
>> To be honest, I can't seem to replicate the issue where the STOMP
>> clients disconnect and the messages remain 'un-ack'd' -- I'm thinking
>> this error may be transient or somehow a weird corner case.  If I ever
>> encounter that scenario again, I'll be sure to save the mnesia
>> directory at that point.
> Makes sense. There may well be a delay between the error being  
> generated and the messages being moved back into the 'ready' state,  
> particularly, say, when rabbit is busy dumping a large error message  
> to a log file.
>> I don't have a pure AMQP test client, but I'm curious if this error
>> condition exists if the large message were sent over AMQP instead of
>> STOMP...
> That we have definitely tested. And no, it doesn't cause an error.
> So, in summary, I think you have managed to uncover three bugs in  
> the STOMP adapter:
> 1) attempting to deliver messages recovered from the persister via  
> STOMP causes an error
> 2) STOMP client disconnects can result in huge error messages being  
> logged
> 3) sending large messages via STOMP causes rabbit to die
> Thanks for your help in tracking down these problems.
> I have one last request: Would it be possible for you to construct a  
> simple test case for 3? Ideally I want something along the lines of  
> "1) start a clean (i.e. no existing db) rabbit with stomp enabled,  
> 2) run this program, 3) see rabbit die". Based on your investigation  
> so far, the program in question could perhaps be as simple as  
> creating a large message and then attempting to send it over STOMP.
> Regards,
> Matthias.
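
For test case 3 above, a minimal sketch of such a program might look like
the following. It is not the Perl test code mentioned in this thread; it is
a Python illustration that assumes the STOMP adapter listens on
localhost:61613 and accepts guest/guest credentials. The names stomp_frame,
large_send_frame and send_large_message, and the destination "test_queue",
are hypothetical. The body is filler bytes, on the untested assumption that
the message content does not matter.

```python
import socket

NUL = b"\x00"  # STOMP frames are terminated by a single NUL byte


def stomp_frame(command, headers, body=b""):
    """Serialize one STOMP 1.0 frame: command, headers, blank line, body, NUL."""
    lines = [command]
    lines += [f"{key}:{value}" for key, value in headers.items()]
    head = "\n".join(lines).encode("utf-8")
    return head + b"\n\n" + body + NUL


def large_send_frame(destination, size_bytes):
    """Build a SEND frame whose body is size_bytes of filler (assumed content)."""
    body = b"x" * size_bytes
    headers = {"destination": destination, "content-length": len(body)}
    return stomp_frame("SEND", headers, body)


def send_large_message(host="localhost", port=61613,
                       destination="test_queue",      # hypothetical queue name
                       size_bytes=91 * 1024 * 1024,   # roughly the 91MB case
                       login="guest", passcode="guest"):
    """Connect, send one large message, disconnect. Returns the broker's first reply."""
    with socket.create_connection((host, port), timeout=30) as sock:
        sock.sendall(stomp_frame("CONNECT", {"login": login, "passcode": passcode}))
        reply = sock.recv(4096)  # a CONNECTED frame is expected here
        sock.sendall(large_send_frame(destination, size_bytes))
        sock.sendall(stomp_frame("DISCONNECT", {}))
    return reply
```

Calling send_large_message() against a clean rabbit with STOMP enabled would
attempt the 91MB send; whether that alone reproduces the crash is not
confirmed here.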
