[rabbitmq-discuss] Having some issues with RabbitMQ

Sun Jul 18 10:43:25 BST 2010

Hi Christian,

On Sat, Jul 17, 2010 at 05:31:45PM -0700, Christian Legnitto wrote:
> I saw that and I'm not sure that is what I want. There will be users
> creating queues with scripts running locally on laptops and such. With
> your suggestion I believe this would happen:
> 
>       1. User has a local script "foo.py" running on a laptop, which
>       connects ("connection1"), gets a queue with a server-defined
>       name (call it "queue1") and reads messages
>       2. User goes to a meeting and the laptop sleeps, causing
>       RabbitMQ to close connection1
>       3. User comes back, foo.py is running but has thrown an
>       exception (or perhaps is listening on a dead connection, not
>       sure what happens here)
>       4. User restarts foo.py, getting a new connection
>       ("connection2") to a new server-generated queue ("queue2")
>       5. All messages sent between the closing of connection1 and
>       creation of connection2 never make it into queue2, so foo.py
>       possibly missed a bunch of messages
> 
> If I am mistaken, please correct me..this is all new :-)

No, you are right. But what you are asking for is very nearly
impossible: you'd like queues to be deleted if they're no longer used,
but you'd like them to stay around if the consumer disconnects. As far
as Rabbit is concerned, the two scenarios are the same. Thus if you
really want the queues to remain present after the clients have
disconnected (and clients which are shut down "nicely" must remember to
delete any queues they created) then you'll have to use named queues and
resort to something like BQL to tidy up any mess left over.
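
To make the named-queue approach a bit more concrete, here is a rough
sketch in Python. It assumes a pika-style channel object (anything that
exposes queue_declare, queue_bind and queue_delete with these keyword
arguments). The function names, the "queue-" prefix and the naming
scheme are made up for illustration; Rabbit doesn't prescribe any of
them.

```python
import getpass
import socket


def stable_queue_name(script, user=None, host=None):
    """Build a queue name that is identical every time the script restarts.

    The user/host/script scheme is only an illustrative assumption.
    """
    user = user or getpass.getuser()
    host = host or socket.gethostname()
    return "queue-%s-%s-%s" % (user, host, script)


def declare_named_queue(channel, name, exchange, routing_key=""):
    """Declare (or re-attach to) a named queue that outlives the connection."""
    # Not exclusive and not auto-delete, so the queue stays on the broker
    # and keeps collecting messages while the laptop is asleep.
    channel.queue_declare(queue=name, durable=True,
                          exclusive=False, auto_delete=False)
    channel.queue_bind(queue=name, exchange=exchange,
                       routing_key=routing_key)
    return name


def shutdown_nicely(channel, name):
    """A client that exits cleanly removes the queue it created."""
    channel.queue_delete(queue=name)
```

With something like this, foo.py asks for the same queue name after the
laptop wakes up and carries on from whatever has accumulated. Queues
left behind by clients that never came back still need tidying up by
hand.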

> So, if the server takes care of creating the queue, there is no way
> for the client to tell it to reconnect when it comes back (and no
> queue will be there anyway as the server has cleared it). Creating a
> named queue takes write permission, correct?

No, configure permission. See
http://www.rabbitmq.com/admin-guide.html#access-control

Creating a binding to that queue requires write permission on the queue.
You could partition your namespace so that the exchanges all start
"exchange-" and the queues all start "queue-" and then you only need to
grant write permissions to "^queue-.*". That would allow your public
users to connect, create queues, bind them to the exchanges, but *not*
publish to the exchanges. (They'd also need configure permission on
"^queue-.*".)

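To see what that pattern does and doesn't cover, here is a small Python
sketch. It assumes the broker treats each permission as a regular
expression tested against the resource name, as the admin guide
describes; Python's re module is only standing in for the broker's own
regex engine, and the helper name and sample names are hypothetical.

```python
import re

# The pattern granted for both configure and write in the scheme above.
QUEUE_PATTERN = "^queue-.*"


def permitted(pattern, resource_name):
    """Return True if the permission regex matches the resource name."""
    return re.search(pattern, resource_name) is not None


for name in ("queue-foo", "exchange-logs", "my-queue-1"):
    verdict = "allowed" if permitted(QUEUE_PATTERN, name) else "denied"
    print("%-15s %s" % (name, verdict))
```

The leading "^" matters: without it, any name that merely contains
"queue-" somewhere would match as well.
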
> It doesn't actually crash (bad choice of words on my part). The whole
> server is hung from the standpoint of any client I can use (python
> mainly). The BQL client won't even connect or give me a prompt at that
> point, it just hangs. amqp-utils (ruby, but may use the same lib as
> python) hangs and doesn't let me do anything. I end up stopping the
> server, clearing the data directory, and restarting it (which clearly
> wouldn't work in production). FWIW I got the same behavior with the
> old persister, which is why I thought I perhaps wasn't turning the new
> one on even though I am using the branch.

When this happens, is the disk thrashing, or is CPU use really high?
Also what type of machine / OS are you running this on?

> Yeah, I saw that flow control is generally not supported by the python
> libs (though I see http://gist.github.com/399282), but I'm not sure I
> would have hit it with ~30 msgs per second going to 10 queues.

Well, if the queues are backing up, with the old persister you would
eventually hit it. With the new persister, unless your disks are
catastrophically slow or your messages are massive, you certainly
shouldn't. How big are your messages?
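
For a feel of why message size matters here, a back-of-envelope sketch
in Python. The assumptions are deliberately crude and are mine, not
measurements: nothing is being consumed, all 10 queues receive 30
msgs/sec each, every queued message costs exactly its body size in
memory, and the limit is the 810MB figure from your status output. The
message sizes tried are pure guesses.

```python
def seconds_until_limit(limit_bytes, msgs_per_sec, queues, msg_bytes):
    """Time for an unconsumed backlog to grow to limit_bytes."""
    bytes_per_sec = msgs_per_sec * queues * msg_bytes
    return limit_bytes / float(bytes_per_sec)


LIMIT = 810 * 1024 * 1024  # the "available" figure quoted from the status plugin

# Hypothetical message sizes; the real size is not known yet.
for size in (1024, 10 * 1024, 100 * 1024):
    minutes = seconds_until_limit(LIMIT, 30, 10, size) / 60
    print("%7d bytes/msg -> about %.0f minutes" % (size, minutes))
```

The real numbers will differ (there is per-message overhead, and
consumers presumably drain some of it), but it shows how quickly a
backlog of large messages could get there.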

> The fact that the BQL plugin stopped working was suspect. I'm not sure
> how that's written though, but I assumed it used the erlang client and
> would allow me to clear queues and get everything unblocked.

Indeed, it should, and yes, it uses the erlang client.

> So even though it isn't a crash, the behavior is the same....I can't
> read anything, publish anything, or clear queues to unblock, via
> carrot (python), amqp-utils (ruby), and the BQL plugin (erlang I
> guess). Perhaps all the libs I am using to interact with the broker
> barf with flow control?

You're right - it is pretty much indistinguishable from a crash, though
I'd love to know exactly what it's doing when it gets into this state.
The erlang client certainly does understand flow control, and I believe
the ruby client does as well.

> I also notice the status plugin says "memory (used/available) = 1498MB
> / 810MB" with the new persister...is that expected? I thought that it
> would always stay under the max and just flush to disk. Is my VM too
> wimpy?

Ahh! So that's a first clue. Can you check the rabbit logs please? There
should be entries in there about hitting the high memory watermark. If
there are, then flow control has certainly been invoked. It's possible
that, at least with the new persister, it's then flushing to disk.
Bizarrely enough, if you keep something like the status plugin
installed, enabled and open in a web browser, it can actually *prevent*
GC from occurring, so even when everything has been flushed to disk it
can take a huge amount of time before the memory usage drops. For now I
would suggest not using the status plugin, relying on rabbitmqctl and
the logs instead, and not running rabbitmqctl more than once every 30
seconds. Yes, this is a fairly severe issue with Rabbit and I'd
certainly like to fix it.
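
If you script the polling, something along these lines would keep it to
at most one rabbitmqctl call per 30 seconds. It's only a sketch: the
class name is made up, and the run and clock parameters exist just so
it can be tried without a broker. The command itself is the ordinary
"rabbitmqctl list_queues name messages".

```python
import subprocess
import time


class ThrottledCtl(object):
    """Run rabbitmqctl at most once per min_interval seconds."""

    def __init__(self, min_interval=30.0,
                 run=subprocess.check_output, clock=time.monotonic):
        self._min_interval = min_interval
        self._run = run
        self._clock = clock
        self._last = None      # time of the last real invocation
        self._cached = None    # output of the last real invocation

    def list_queues(self):
        """Return queue names and depths, reusing recent output if polled too soon."""
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval:
            self._cached = self._run(
                ["rabbitmqctl", "list_queues", "name", "messages"])
            self._last = now
        return self._cached
```

Calling list_queues() in a tight loop then costs the broker nothing
extra; callers just see the cached output until the interval has
passed.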

Matthew