[rabbitmq-discuss] Problems with rabbit and hoping I can get some help
Jerry Kuch
jerryk at vmware.com
Mon Feb 13 15:47:08 GMT 2012
Hi, Dathan:
If you're not seeing the attendant memory alarms in your Rabbit logs,
then the state you're describing isn't the TCP back-pressure we
speculated about earlier.
When you get into one of the states where you say Rabbit is "blocked,"
what happens if you:
1. Try to create a new connection (perhaps using another client other
than the Python one you're using)
2. Just try telnetting to port 5672 on the broker, and banging out
some junk. Does the broker answer the incoming telnet connection?
Does it puke an AMQP error promptly to your console?
3. Does the broker respond when you use rabbitmqctl to ask it
questions about its state?
4. Are you running the management plugin? If so, is the broker
healthy enough to bring up the management web UI and let you poke
around?
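Checks 1 and 2 can also be scripted. The following is only a minimal sketch, not an official tool: probe_broker is a hypothetical helper, and the host, port and timeout values are assumptions. It opens a TCP connection, sends eight bytes of junk (the length of an AMQP protocol header), and reports whether the broker replies with something starting with "AMQP", which is how an AMQP 0-9-1 server is expected to reject an unrecognised header before closing the socket.

```python
import socket

AMQP_PREFIX = b"AMQP"


def probe_broker(host="localhost", port=5672, timeout=5.0):
    """Send junk to the AMQP port and report how the broker reacts.

    Hypothetical helper: the name and defaults are assumptions.
    """
    result = {"connected": False, "reply": b"", "amqp_header": False, "error": None}
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            result["connected"] = True
            # Exactly 8 bytes, so nothing is left unread when the broker closes.
            sock.sendall(b"junkjunk")
            while True:
                data = sock.recv(64)
                if not data:  # broker closed the connection
                    break
                result["reply"] += data
    except OSError as exc:  # refused, reset, or timed out
        result["error"] = repr(exc)
    result["amqp_header"] = result["reply"].startswith(AMQP_PREFIX)
    return result
```

If "connected" is False, or the call times out with no reply, the broker is not even accepting or answering raw TCP traffic. A prompt reply with "amqp_header" set to True suggests the listener is alive and the problem may lie elsewhere.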
Depending on the answer to these, I might wonder if we've got a
client-side problem, either in the library you're using, or perhaps
a quirk in how it's being used...
Best regards,
Jerry
----- Original Message -----
From: "Dathan Pattishall" <dathan at schoolfeed.com>
To: "Jerry Kuch" <jerryk at vmware.com>
Cc: rabbitmq-discuss at lists.rabbitmq.com
Sent: Tuesday, January 31, 2012 1:19:29 PM
Subject: Re: [rabbitmq-discuss] Problems with rabbit and hoping I can get some help
CPU is rather Idle,
6% CPU system on average
8% CPU User on average
No CPU wait IO
so plenty of CPU / plenty of Memory
On Tue, Jan 31, 2012 at 1:13 PM, Jerry Kuch < jerryk at vmware.com > wrote:
Hi, Dathan...
For dropping messages, you might consider setting message TTLs, but that
may not give you quite what you want in all cases.
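As a rough illustration of the TTL idea, here is a sketch that declares a queue with a per-queue message TTL through the "x-message-ttl" queue argument. It assumes a pika-style channel object exposing queue_declare(queue=..., auto_delete=..., arguments=...); the helper names, the queue name and the 60-second value are all made up for the example.

```python
def ttl_arguments(ttl_ms):
    """Build queue arguments for a per-queue message TTL in milliseconds."""
    if not isinstance(ttl_ms, int) or ttl_ms <= 0:
        raise ValueError("ttl_ms must be a positive integer (milliseconds)")
    return {"x-message-ttl": ttl_ms}


def declare_queue_with_ttl(channel, queue_name, ttl_ms):
    """Declare an auto-delete queue whose messages expire after ttl_ms.

    'channel' is assumed to behave like a pika channel; any object with a
    compatible queue_declare method will do.
    """
    return channel.queue_declare(
        queue=queue_name,
        auto_delete=True,
        arguments=ttl_arguments(ttl_ms),
    )


# Usage with pika (assumed to be installed; not executed here):
#   import pika
#   conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   declare_queue_with_ttl(conn.channel(), "events", 60000)
```

Note that redeclaring an existing queue with different arguments is rejected by the broker, so the TTL has to be chosen when the queue is first created.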
What does the CPU consumption of your Rabbit node look like when you're
seeing these pauses? If you wait, do they relent, with things getting
moving again?
Best regards,
Jerry
----- Original Message -----
From: "Dathan Pattishall" < dathan at schoolfeed.com >
To: "Jerry Kuch" < jerryk at vmware.com >
Cc: rabbitmq-discuss at lists.rabbitmq.com
Sent: Tuesday, January 31, 2012 1:05:39 PM
Subject: Re: [rabbitmq-discuss] Problems with rabbit and hoping I can get some help
Hi Jerry,
I neglected to mention that I am not hitting the memory-based flow control according to my memory alarm stat from
rabbitmqadmin.py list nodes
What threshold would the TCP back pressure logic hit? I assume when the memory limit is reached rabbit pushes back on the publishers? Rabbit did not use more than 2G of RAM out of the 5G allowed for it, if that is the case.
Also, is there a way to tell rabbit to drop the messages instead of blocking?
On Tue, Jan 31, 2012 at 12:56 PM, Jerry Kuch < jerryk at vmware.com > wrote:
Hi, Dathan...
What are your consumers doing with the published messages? If you use rabbitmqctl or
the management plugin to look at what's going on in your queues, do you see messages
accumulating but not being delivered? Or delivered but not ACKed? If messages are
building up (either undelivered or unACKed) faster than consumers are draining them,
you might be hitting memory-based flow control, which will use TCP back pressure to
stop the publishers.
See here for more information:
http://www.rabbitmq.com/memory.html
To get an idea whether this is happening to you, check out your queue contents as
suggested above, and see if memory alarms are being set in your rabbit logs...
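A sketch of the queue check, under stated assumptions: it relies on "rabbitmqctl list_queues name messages_ready messages_unacknowledged" printing one tab-separated row per queue, and the helper names and the threshold of 1000 messages are arbitrary choices for illustration.

```python
import subprocess

COLUMNS = ("name", "messages_ready", "messages_unacknowledged")


def parse_list_queues(text):
    """Parse tab-separated rabbitmqctl output into a list of dicts.

    Banner, header and trailer lines are skipped because they do not have
    two numeric columns.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        rows.append({"name": parts[0], "ready": int(parts[1]), "unacked": int(parts[2])})
    return rows


def backlogged(rows, threshold=1000):
    """Return queues whose ready plus unacked count reaches the threshold."""
    return [r for r in rows if r["ready"] + r["unacked"] >= threshold]


def fetch_queue_rows():
    """Run rabbitmqctl on the broker host (assumed to be on PATH)."""
    out = subprocess.run(
        ["rabbitmqctl", "list_queues", *COLUMNS],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_list_queues(out)
```

A large "ready" count points at consumers not keeping up with deliveries, while a large "unacked" count points at consumers receiving messages but not ACKing them.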
Best regards,
Jerry
----- Original Message -----
From: "Dathan Pattishall" < dathan at schoolfeed.com >
To: rabbitmq-discuss at lists.rabbitmq.com
Sent: Tuesday, January 31, 2012 12:52:55 PM
Subject: [rabbitmq-discuss] Problems with rabbit and hoping I can get some help
Let me first describe my setup.
[root at webnode1]# rabbitmqadmin.py show overview
+--------------------+-----------------+--------------------+------------------+
| management_version | node | statistics_db_node | statistics_level |
+--------------------+-----------------+--------------------+------------------+
| 2.7.1 | rabbit at webnode1 | rabbit at webnode1 | fine |
+--------------------+-----------------+--------------------+------------------+
RabbitMQ's producers are PHP 5.3.8 using http://www.php.net/manual/en/book.amqp.php . Each Apache process could produce a rabbit message; I am producing around 1000 messages a second on a c1.xlarge instance at EC2.
My erlang version is
/usr/local/bin/erl -v
Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]
The PROBLEM:
After about 40 mins of rabbit accepting messages, all connections block, causing a rather bad error on the front ends and killing traffic. Turning rabbit off and restarting the web servers forces a recovery.
Stats from Rabbit:
Roughly 5000 queues are made
Roughly 3600 exchanges are made
Each exchange can have at most 1200 queues bound to it.
Each queue is set up for auto-delete and so are the exchanges, with delivery type 1.
All data passed is JSON
The consumer is NODE and it's keeping up with the consumption
RabbitMQ memlimit is around 5.3G
RabbitMQ mem used hits around 1.9G when it freezes producers
RabbitMQ proc used hits around 220K
RabbitMQ fd_total is 50K
RabbitMQ socks_total is around 45K and Socks used is 4K
mem_ets hits 100M // not sure what this is
Any idea what is going on? What limit am I hitting? Why does RabbitMQ block? How can I detect that I am about to hit a block state? Any suggestions or requests for additional data would be great.
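One possible way to watch for an approaching limit, sketched under assumptions: the per-node statistics (for example from "rabbitmqadmin.py list nodes" or the management API) are taken to be available as a dict with used/total pairs named mem_used/mem_limit, fd_used/fd_total, sockets_used/sockets_total and proc_used/proc_total. Those field names, the helper names and the 80% warning level are assumptions for illustration, not a documented alarm mechanism.

```python
# Assumed field-name pairs: (used, total) for each resource.
RESOURCES = {
    "memory": ("mem_used", "mem_limit"),
    "file descriptors": ("fd_used", "fd_total"),
    "sockets": ("sockets_used", "sockets_total"),
    "Erlang processes": ("proc_used", "proc_total"),
}


def utilisation(node):
    """Return used/total ratios for every resource present in the node dict."""
    ratios = {}
    for label, (used_key, total_key) in RESOURCES.items():
        used, total = node.get(used_key), node.get(total_key)
        if used is None or not total:  # skip missing or zero totals
            continue
        ratios[label] = used / total
    return ratios


def near_limits(node, warn_at=0.8):
    """List resources whose utilisation is at or above warn_at."""
    return sorted(label for label, r in utilisation(node).items() if r >= warn_at)


# Figures taken from this post; pairs not given in full are left out.
node = {"mem_used": 1.9e9, "mem_limit": 5.3e9,
        "sockets_used": 4000, "sockets_total": 45000}
print(utilisation(node))
print(near_limits(node))
```

With the figures above, memory sits at roughly 36% of its limit and sockets at roughly 9%, so neither is close to its ceiling at the moment of the freeze.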