[rabbitmq-discuss] Weird Crash (91MB message over STOMP) [Reproducible]

Darien Kindlund darien at kindlund.com
Sat Aug 8 08:53:44 BST 2009


Hi Matthias,

>> Okay, so after enabling verbose logging, I was able to replicate the
>> error, reliably.
>
> including the "messages show up as unacknowledged after restart" problem?

YES.  I think this problem is also STOMP-specific!  After recovering
the persister from the last crash, I start up a single STOMP client
and attempt to subscribe and get the first message off the queue.  At
that point, rabbit.log shows this error:

=INFO REPORT==== 8-Aug-2009::03:40:01 ===
accepted TCP connection on 0.0.0.0:61613 from 127.0.0.1:50113

=INFO REPORT==== 8-Aug-2009::03:40:01 ===
starting STOMP connection <0.6224.0> from 127.0.0.1:50113

=ERROR REPORT==== 8-Aug-2009::03:40:01 ===
STOMP Reply command unhandled: {'basic.deliver',
                                   <<"Q_500.manager.workers">>,
                                   1,
                                   false,
                                   <<"events">>,
                                   <<"500.job.create.job.urls.job_alerts">>}
{content,60,
         none,
... followed by the entire message contents...

Then, after disconnecting the STOMP consumer, if I try to issue a
queue.purge command, the purge fails completely and the messages are
actually marked as 'ready' in the queue.  You should be able to
replicate this behavior (I think) with the mnesia copy in the
report.tar.gz I just sent you.

>> Specifically, I have a message that is approximately 91 MB in size.  A
>> perl process using Net::Stomp sends the persistent message to an
>> exchange which routes the message to a durable queue.  As soon as the
>> perl process finishes sending the message, RabbitMQ v1.6.0 completely
>> dies with _zero_ warning or logging.
>
> The fact that it dies doesn't surprise me. I can think of at least two
> possible causes for this:
>
> 1) The current persister doesn't cope with such large messages very well. If
> the message was transient you may well be ok. Did you try running the test
> with transient messages? (NB: the queues can still be durable)

Good idea, I'll re-test with transient messages and let you know if
that solves the problem.
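
For anyone following along, here is a minimal sketch of how the
persistent and transient variants of the test message differ at the
STOMP frame level. It only builds the frame bytes and sends nothing.
The destination "/queue/example" and the header name "persistent" are
assumptions for illustration; the header the v1.6.0 STOMP adapter
actually honours may differ, so check the adapter before relying on it.

```python
def build_send_frame(destination: str, body: bytes, persistent: bool,
                     persistent_header: str = "persistent") -> bytes:
    """Build a STOMP SEND frame as raw bytes.

    persistent_header is a hypothetical default: the header that marks a
    message persistent depends on the broker's STOMP adapter.
    """
    headers = {
        "destination": destination,
        # content-length lets the receiver read a body that may contain NULs
        "content-length": str(len(body)),
    }
    if persistent:
        headers[persistent_header] = "true"
    head = "SEND\n" + "".join(f"{k}:{v}\n" for k, v in headers.items()) + "\n"
    # A STOMP frame is: command, headers, blank line, body, NUL terminator
    return head.encode("utf-8") + body + b"\x00"


# The two variants of the test differ only in the extra header.
persistent_frame = build_send_frame("/queue/example", b"payload", True)
transient_frame = build_send_frame("/queue/example", b"payload", False)
```

The queue can stay durable in both cases; only the per-message header
changes between the two runs.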

> 2) I noticed yesterday that the disconnect of a STOMP client can cause the
> contents of a delivery attempt to be logged as part of an error message.
> Logging such a large message may well bring down the server. I have filed a
> bug to get rid of that verbose error message. Meanwhile though, can you tell
> me whether your STOMP clients are well-behaved and issue a DISCONNECT at the
> end of a session?

Yes. The STOMP clients issue a DISCONNECT at the end of the session.
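
As a rough illustration of what "well-behaved" means here, the sketch
below lists the frames a consumer session would write, ending with an
explicit DISCONNECT before the socket is closed. It builds bytes only
and opens no connection. The guest/guest credentials and the
"/queue/example" destination are assumptions, not values taken from my
setup.

```python
def frame(command, headers=None, body=b""):
    """Serialize one STOMP frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{k}:{v}" for k, v in (headers or {}).items()]
    return ("\n".join(lines) + "\n\n").encode("utf-8") + body + b"\x00"


def session_frames(destination, login="guest", passcode="guest"):
    """Frames a consumer writes, in order, over one session.

    login, passcode, and destination are hypothetical example values.
    """
    return [
        frame("CONNECT", {"login": login, "passcode": passcode}),
        frame("SUBSCRIBE", {"destination": destination, "ack": "client"}),
        # ... MESSAGE frames arrive from the server and are ACKed here ...
        frame("DISCONNECT"),  # sent before closing the TCP connection
    ]


frames = session_frames("/queue/example")
```

A client that simply closes the socket skips the last frame.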

> You say rabbit died with zero logging. That may well be true, but rabbit
>  *did* produce a crash dump, and that should allow us to establish the cause
> of death.

Okay, good to know.

>> I don't believe my message format matters -- just the size of the
>> message.  However, I will send the 91 MB example message in an email
>> directly to you, since I don't think everyone on the list wants to get
>> such a large attachment.  If you don't get the email containing the
>> attachment (perhaps your mailserver blocks large attachments), then
>> let me know what the best method is for sending you the attachment.
>
> I didn't get the second email, though I don't think I'll need the exact
> message - as you say, the content probably doesn't matter. So just send me
> the code & logs, and the erl_crash.dump (if it's not too large; if it is
> then please send me the first few k only).

Done.  I tried sending the 91 MB file through, but it was taking way
too long to upload (63 MB compressed).
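
If the erl_crash.dump turns out to be too large to send, one way to cut
out just the first few kilobytes is sketched below. The 64 KiB default
is my own guess at what "the first few k" means, and the in-memory
streams stand in for real files opened in binary mode.

```python
import io


def copy_head(src, dst, limit=64 * 1024, chunk_size=8192):
    """Copy at most `limit` bytes from binary stream src to dst.

    Returns the number of bytes copied. The default limit is an assumed
    value, not one requested in this thread.
    """
    remaining = limit
    copied = 0
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:  # source ended before the limit was reached
            break
        dst.write(chunk)
        copied += len(chunk)
        remaining -= len(chunk)
    return copied


# Stand-in for a large dump file; real use would pass open(path, "rb")
# and open(out_path, "wb") instead.
src = io.BytesIO(b"x" * 100_000)
dst = io.BytesIO()
copy_head(src, dst, limit=4096)
```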

-- Darien



