[rabbitmq-discuss] RabbitMQ failure under high load
Michał Kiędyś
michal at kiedys.net
Wed Jun 27 12:19:04 BST 2012
Dear Simon,
My tool uses company-internal libraries, so I cannot publish it.
Would you like more details of this test so that you can reproduce it on
your own?
Regards,
MK
2012/6/27 Simon MacMullen <simon at rabbitmq.com>
> Hi Michał - please can you keep rabbitmq-discuss on CC?
>
> So as I said, the limit is only the point at which Rabbit stops accepting
> new messages. In the general case this should be enough to stop further
> memory consumption - but in your case it looks like it isn't. If you were
> able to post your test tool in a way that would make it easy for us to run,
> then that might be the easiest way for us to help you. At the moment we
> just don't have enough information.
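>
> (For reference, the "allowed:1717986918" figure in your logs is 0.4 x
> 4GiB, i.e. the default vm_memory_high_watermark fraction of 0.4 applied
> to the 4GB of RAM your OOM dump reports. The fraction can be changed via
> {vm_memory_high_watermark, Fraction} in the rabbit section of
> rabbitmq.config if you want to experiment with a lower threshold.)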
>
> Cheers, Simon
>
>
> On 27/06/12 09:36, Michał Kiędyś wrote:
>
>> Simon,
>>
>> My question stems from the fact that Rabbit can consume even more than
>> 4GB when the limit is set to 1.6GB.
>> In this scenario it reports usage of 2.7GB, but real usage is more than 4GB.
>>
>> rabbit at arch-task-mq-8
>> <http://arch-task-mq-7:55672/#/nodes/rabbit%40arch-task-mq-8>
>>
>> File descriptors: 734 / 1024
>> Socket descriptors: 701 / 829
>> Erlang processes: 5795 / 1048576
>> Memory: 2.7GB (?) (1.6GB high watermark)
>> Disk space: 49.6GB (4.0GB low watermark)
>> Uptime: 12m 33s
>> Type: RAM
>>
>>
>>
>> After a while, the kernel kills the Rabbit process:
>>
>> Mem-info:
>> DMA per-cpu:
>> cpu 0 hot: high 186, batch 31 used:8
>> cpu 0 cold: high 62, batch 15 used:48
>> cpu 1 hot: high 186, batch 31 used:108
>> cpu 1 cold: high 62, batch 15 used:55
>> cpu 2 hot: high 186, batch 31 used:118
>> cpu 2 cold: high 62, batch 15 used:53
>> cpu 3 hot: high 186, batch 31 used:89
>> cpu 3 cold: high 62, batch 15 used:55
>> DMA32 per-cpu: empty
>> Normal per-cpu: empty
>> HighMem per-cpu: empty
>> Free pages: 12076kB (0kB HighMem)
>> Active:0 inactive:741324 dirty:0 writeback:9 unstable:0 free:3023
>> slab:101876 mapped:3649 pagetables:2586
>> DMA free:12092kB min:8196kB low:10244kB high:12292kB active:0kB
>> inactive:2965168kB present:4202496kB pages_scanned:32 all_unreclaimable?
>> no
>> lowmem_reserve[]: 0 0 0 0
>> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
>> present:0kB pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0 0
>> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
>> present:0kB pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0 0
>> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
>> present:0kB pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0 0
>> DMA: 172*4kB 533*8kB 170*16kB 41*32kB 11*64kB 1*128kB 1*256kB 1*512kB
>> 0*1024kB 1*2048kB 0*4096kB = 12632kB
>> DMA32: empty
>> Normal: empty
>> HighMem: empty
>> Swap cache: add 4358, delete 4243, find 0/0, race 0+0
>> Free swap = 1031136kB
>> Total swap = 1048568kB
>> Free swap: 1031136kB
>> 1050624 pages of RAM
>> 26588 reserved pages
>> 17300 pages shared
>> 83 pages swap cached
>> Out of Memory: Kill process 2213 (rabbitmq-server) score 14598295 and
>> children.
>> Out of memory: Killed process 2227 (beam.smp).
>>
>>
>>
>> Is this OK?
>>
>>
>> Regards,
>> MK
>>
>> 2012/6/22 Simon MacMullen <simon at rabbitmq.com>
>>
>>
>> Hi Michał.
>>
>> This is quite vague - if we can't see the source of your test tool
>> it's hard to see what it's actually doing.
>>
>> The server can use more memory than the high watermark; that's just
>> the point at which it stops accepting new messages from the network.
>> This should greatly cut the extent to which it can consume more
>> memory, but will not eliminate it.
>>
>> There is an existing issue where the processes used by connections
>> do not close when the connection is closed and memory use is above
>> the watermark. When the memory use drops the processes will go.
>> Could your test application be opening new connections?
>>
>> Also, you say:
>>
>>
>> The readers have been disconnected by the server ahead of time.
>>
>>
>> does this mean that huge numbers of messages are building up in the
>> server? Note that in the default configuration there is a
>> per-message cost in memory of a hundred bytes or so even when the
>> message has been paged out to disc, so that might explain why so
>> much memory is being used.
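>>
>> (For a rough sense of scale, with made-up numbers: a backlog of 10
>> million paged-out messages at ~100 bytes each would account for about
>> 1GB of RAM in per-message bookkeeping alone, before counting any bodies
>> still held in memory.)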
>>
>> I hope this helps explain what you are seeing. But I'm not exactly
>> sure what you are doing...
>>
>> Cheers, Simon
>>
>>
>> On 22/06/12 14:09, Michał Kiędyś wrote:
>>
>> Hi,
>>
>> Software version: 2.8.2
>> The cluster has been stressed with 1000 writers and 100 readers.
>> Message
>> size is 100kB.
>> Test configuration:
>>
>> _readers node #1_
>>
>> test.ConnectionPerWorker=true
>> test.WritersCount=0
>> test.ReadersCount=33
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.Host=arch-task-mq-7.atm
>>
>> _readers node #2_
>>
>> test.ConnectionPerWorker=true
>> test.WritersCount=0
>> test.ReadersCount=33
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.Host=arch-task-mq-8.atm
>>
>> _readers node #3_
>>
>> test.ConnectionPerWorker=true
>> test.WritersCount=0
>> test.ReadersCount=33
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.Host=arch-task-mq-8.atm
>>
>> _writers node #4_
>>
>> test.ConnectionPerWorker=true
>> test.WritersCount=333
>> test.ReadersCount=0
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.BodySize=102400
>> # available units: s(seconds), m(minutes), h(hours) d(days)
>> test.TestDuration=3h
>> test.Host=arch-task-mq-8.atm
>>
>> _writers node #5_
>> test.ConnectionPerWorker=true
>> test.WritersCount=333
>> test.ReadersCount=0
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.BodySize=102400
>> # available units: s(seconds), m(minutes), h(hours) d(days)
>> test.TestDuration=3h
>> test.Host=arch-task-mq-7.atm
>>
>> _writers node #6_
>> test.ConnectionPerWorker=true
>> test.WritersCount=334
>> test.ReadersCount=0
>> test.Durable=true
>> test.QueuesCount=1
>> test.AutoAck=false
>> test.ExchangeType=direct
>> test.QueueNamePrefix=direct
>> test.BodySize=102400
>> # available units: s(seconds), m(minutes), h(hours) d(days)
>> test.TestDuration=3h
>> test.Host=arch-task-mq-8.atm
>>
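>> (The tool itself is internal, but as a rough approximation: below is a
>> minimal sketch of what a single writer worker presumably does, written
>> against the standard RabbitMQ Java client. The exchange name, queue name
>> and routing key ("direct", "direct-0") are guesses derived from the
>> property names above and may well differ from the real tool.)
>>
>> import com.rabbitmq.client.*;
>>
>> public class Writer {
>>     public static void main(String[] args) throws Exception {
>>         ConnectionFactory factory = new ConnectionFactory();
>>         factory.setHost("arch-task-mq-8.atm");             // test.Host
>>         Connection conn = factory.newConnection();         // ConnectionPerWorker=true
>>         Channel ch = conn.createChannel();
>>         ch.exchangeDeclare("direct", "direct", true);      // ExchangeType=direct, Durable=true
>>         ch.queueDeclare("direct-0", true, false, false, null);
>>         ch.queueBind("direct-0", "direct", "direct-0");
>>         byte[] body = new byte[102400];                    // BodySize=102400
>>         long deadline = System.currentTimeMillis() + 3L * 3600 * 1000;  // TestDuration=3h
>>         while (System.currentTimeMillis() < deadline) {
>>             // durable (persistent) messages, published as fast as possible
>>             ch.basicPublish("direct", "direct-0",
>>                             MessageProperties.PERSISTENT_BASIC, body);
>>         }
>>         conn.close();
>>     }
>> }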
>>
>> _Actual tests state:_
>>
>> Running worker-1000w-100r-100kB
>> Preparing tests on arch-task-mq-1
>> Preparing tests on arch-task-mq-2
>> Preparing tests on arch-task-mq-3
>> Preparing tests on arch-task-mq-4
>> Preparing tests on arch-task-mq-5
>> Preparing tests on arch-task-mq-6
>> Preparations done, starting testing procedure
>> Start tests on arch-task-mq-1
>> Start tests on arch-task-mq-2
>> Start tests on arch-task-mq-3
>> Start tests on arch-task-mq-4
>> Start tests on arch-task-mq-5
>> Start tests on arch-task-mq-6
>> Waiting for tests to finish
>> Tests done on arch-task-mq-5
>> Tests done on arch-task-mq-6
>> Tests done on arch-task-mq-4
>>
>>
>> The readers have been disconnected by the server ahead of time.
>>
>>
>> _Actual cluster state (data from Management Plugin view):_
>>
>> rabbit at arch-task-mq-7 (type: Disc, Stats *)
>> File descriptors: 392 / 1024
>> Socket descriptors: 334 / 829
>> Erlang processes: 2885 / 1048576
>> Memory: 540.2MB (1.6GB high watermark)
>> Disk space: 49.6GB (4.0GB low watermark)
>> Uptime: 21h 14m
>>
>> rabbit at arch-task-mq-8 (type: RAM)
>> File descriptors: 692 / 1024
>> Socket descriptors: 668 / 829
>> Erlang processes: 5522 / 1048576
>> Memory: 1.8GB (?) (1.6GB high watermark)
>> Disk space: 46.1GB (4.0GB low watermark)
>> Uptime: 21h 16m
>>
>> The number of processes is growing all the time even though no
>> messages are being published or received.
>> All publishers have been blocked. After some time I killed the
>> publisher processes, but RabbitMQ still sees them as connected and
>> blocked. :)
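>>
>> (To double-check from the broker side, something along the lines of
>> "rabbitmqctl list_connections peer_address peer_port state" should still
>> list those connections, presumably in the blocked state.)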
>>
>> Some logs:
>>
>> mkiedys@arch-task-mq-8:/var/log/rabbitmq$ cat rabbit@arch-task-mq-8.log | grep vm_memory_high | tail -n 20
>> vm_memory_high_watermark clear. Memory used:1709148224 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2135174984 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1593121728 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2043534608 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1681947128 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2088225952 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1710494800 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2208875080 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1713902032 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2122564032 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1663616264 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2098909664 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1712666136 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2088814360 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1640273568 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2116966952 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1715305176 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2186572648 allowed:1717986918
>> vm_memory_high_watermark clear. Memory used:1716620504 allowed:1717986918
>> vm_memory_high_watermark set. Memory used:2180898440 allowed:1717986918
>>
>> mkiedys@arch-task-mq-8:/var/log/rabbitmq$ cat rabbit@arch-task-mq-8.log | grep vm_memory_high | wc -l
>> 2935
>>
>> Why does the server consume more memory than the 1.6GB limit?
>>
>> Regards,
>> MK
>>
>>
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>>
>>
>> --
>> Simon MacMullen
>> RabbitMQ, VMware
>>
>>
>>
>
> --
> Simon MacMullen
> RabbitMQ, VMware
>