[rabbitmq-discuss] RabbitMQ failure under high load

Simon MacMullen simon at rabbitmq.com
Fri Jun 22 14:58:47 BST 2012


Hi Michał.

This is all quite vague - without the source of your test tool it's 
hard to tell what it's actually doing.

The server can use more memory than the high watermark; the watermark is 
just the point at which it stops accepting new messages from the network. 
That greatly limits how much further memory use can grow, but does not 
eliminate it.
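
For reference, the watermark is configured as a fraction of installed RAM 
via the rabbit application's vm_memory_high_watermark setting in 
rabbitmq.config; the absolute byte figure in your logs is derived from it. 
A minimal sketch (0.4 is the default):

    %% rabbitmq.config
    [
      {rabbit, [
        %% stop reading from client sockets once the VM uses ~40% of RAM
        {vm_memory_high_watermark, 0.4}
      ]}
    ].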

There is an existing issue where the processes used by connections do 
not go away when a connection is closed while memory use is above the 
watermark; once memory use drops, the processes are cleaned up. Could 
your test application be opening new connections?
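
One way to check (a sketch, run from a shell on a broker node; -q just 
suppresses the banner) is to watch the connection list over time - 
connections held open while the alarm is in effect will show state 
"blocked":

    rabbitmqctl -q list_connections name state   # one line per connection
    rabbitmqctl -q list_connections | wc -l      # total count; sample repeatedly

If the count keeps climbing after your tool believes it has closed its 
connections, that would point at the issue above.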

Also, you say:

> The readers have been disconnected by the server ahead of time.

does this mean that huge numbers of messages are building up in the 
server? Note that in the default configuration there is a per-message 
memory cost of a hundred bytes or so even when the message has been 
paged out to disc, so that might explain why so much memory is being used.
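
As a rough back-of-the-envelope illustration (the backlog figure here is 
invented): with 1000 publishers and the consumers gone, a backlog of 10 
million messages would cost on the order of

    10,000,000 msgs x ~100 bytes/msg = ~1 GB

of memory for per-message metadata alone, even with every 100kB body 
paged out to disc.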

I hope this helps explain what you are seeing. But I'm not exactly sure 
what you are doing...

Cheers, Simon

On 22/06/12 14:09, Michał Kiędyś wrote:
> Hi,
>
> Software version: 2.8.2
> The cluster has been stressed with 1000 writers and 100 readers. Message
> size is 100kB.
> Test configuration:
>
> _readers node #1_
> test.ConnectionPerWorker=true
> test.WritersCount=0
> test.ReadersCount=33
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.Host=arch-task-mq-7.atm
>
> _readers node #2_
> test.ConnectionPerWorker=true
> test.WritersCount=0
> test.ReadersCount=33
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.Host=arch-task-mq-8.atm
>
> _readers node #3_
> test.ConnectionPerWorker=true
> test.WritersCount=0
> test.ReadersCount=33
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.Host=arch-task-mq-8.atm
>
> _writers node #4_
> test.ConnectionPerWorker=true
> test.WritersCount=333
> test.ReadersCount=0
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.BodySize=102400
> # available units: s(seconds), m(minutes), h(hours), d(days)
> test.TestDuration=3h
> test.Host=arch-task-mq-8.atm
>
> writers node #5
> test.ConnectionPerWorker=true
> test.WritersCount=333
> test.ReadersCount=0
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.BodySize=102400
> # available units: s(seconds), m(minutes), h(hours), d(days)
> test.TestDuration=3h
> test.Host=arch-task-mq-7.atm
>
> writers node #6
> test.ConnectionPerWorker=true
> test.WritersCount=334
> test.ReadersCount=0
> test.Durable=true
> test.QueuesCount=1
> test.AutoAck=false
> test.ExchangeType=direct
> test.QueueNamePrefix=direct
> test.BodySize=102400
> # available units: s(seconds), m(minutes), h(hours), d(days)
> test.TestDuration=3h
> test.Host=arch-task-mq-8.atm
>
>
> _Actual tests state:_
> Running worker-1000w-100r-100kB
> Preparing tests on arch-task-mq-1
> Preparing tests on arch-task-mq-2
> Preparing tests on arch-task-mq-3
> Preparing tests on arch-task-mq-4
> Preparing tests on arch-task-mq-5
> Preparing tests on arch-task-mq-6
> Preparations done, starting testing procedure
> Start tests on arch-task-mq-1
> Start tests on arch-task-mq-2
> Start tests on arch-task-mq-3
> Start tests on arch-task-mq-4
> Start tests on arch-task-mq-5
> Start tests on arch-task-mq-6
> Waiting for tests to finish
> Tests done on arch-task-mq-5
> Tests done on arch-task-mq-6
> Tests done on arch-task-mq-4
>
>
> The readers have been disconnected by the server ahead of time.
>
>
> _Actual cluster state (data from Management Plugin view):_
>
> Name                   File descriptors  Socket descriptors  Erlang processes  Memory     Disk space  Uptime   Type
>                        (used / avail)    (used / avail)      (used / avail)
> rabbit@arch-task-mq-7  392 / 1024        334 / 829           2885 / 1048576    540.2MB    49.6GB      21h 14m  Disc Stats *
> rabbit@arch-task-mq-8  692 / 1024        668 / 829           5522 / 1048576    1.8GB (?)  46.1GB      21h 16m  RAM
>
> (both nodes: 1.6GB memory high watermark, 4.0GB disk space low watermark)
>
> The number of processes keeps growing even though no messages are
> being published or received.
> All publishers have been blocked. After some time I killed the
> publisher processes, but RabbitMQ still sees them as connected and
> blocked. :)
>
> Some logs:
>
> mkiedys@arch-task-mq-8:/var/log/rabbitmq$ cat rabbit@arch-task-mq-8.log | grep vm_memory_high | tail -n 20
> vm_memory_high_watermark clear. Memory used:1709148224 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2135174984 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1593121728 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2043534608 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1681947128 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2088225952 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1710494800 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2208875080 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1713902032 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2122564032 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1663616264 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2098909664 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1712666136 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2088814360 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1640273568 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2116966952 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1715305176 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2186572648 allowed:1717986918
> vm_memory_high_watermark clear. Memory used:1716620504 allowed:1717986918
> vm_memory_high_watermark set. Memory used:2180898440 allowed:1717986918
>
> mkiedys@arch-task-mq-8:/var/log/rabbitmq$ cat rabbit@arch-task-mq-8.log | grep vm_memory_high | wc -l
> 2935
>
> Why does the server consume more memory than the 1.6GB limit?
>
> Regards,
> MK


-- 
Simon MacMullen
RabbitMQ, VMware

