[rabbitmq-discuss] RabbitMQ performance testing & troubles
Alexey Slynko
slynko at brandproduction.ru
Wed Apr 9 11:55:35 BST 2008
Hi,
we have run some stress tests against RabbitMQ. Our main goal was ~10K
messages per second at maximum CPU & RAM utilization. Two servers with the
identical configuration listed below were used for this task.
Intel SR1500AL:
Intel S5000PAL MB, 2x Quad Core Xeon 5345 2.33GHz, 4GB FB RAM:
$ /usr/sbin/prtdiag
System Configuration: Intel S5000PAL
BIOS Configuration: Intel Corporation S5000.86B.10.00.0085.112920071426
11/29/2007
BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)
==== Processor Sockets ====================================
Version Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU E5345 @ 2.33GHz CPU1
Intel(R) Xeon(R) CPU E5345 @ 2.33GHz CPU2
==== Memory Device Sockets ================================
Type Status Set Device Locator Bank Locator
------- ------ --- ------------------- --------------------
Unknown in use 1 ONBOARD DIMM_A1 Channel A
Unknown empty 2 ONBOARD DIMM_A2 Channel A
Unknown in use 1 ONBOARD DIMM_B1 Channel B
Unknown empty 2 ONBOARD DIMM_B2 Channel B
Unknown in use 5 ONBOARD DIMM_C1 Channel C
Unknown empty 6 ONBOARD DIMM_C2 Channel C
Unknown in use 5 ONBOARD DIMM_D1 Channel D
Unknown empty 6 ONBOARD DIMM_D2 Channel D
==== On-Board Devices =====================================
ATI Rage XL
Intel 82546EB Ethernet 1
Intel 82546EB Ethernet 2
ESB2 Integrated SATA Controller
NS PC87427 SIO3
==== Upgradeable Slots ====================================
ID Status Type Description
--- --------- ---------------- ----------------------------
1 available PCI Express FH Slot 1, PCI EXP x8
1 available PCI Express LP Slot 1, PCI EXP x8
10 available PCI Express I/O Module
The servers were connected by a 100Mb Ethernet link. The first
server (the AMQP broker) runs Debian GNU/Linux Etch. Versions of the
installed software are listed below:
# cat /etc/debian_version
4.0
# uname -sr
Linux 2.6.18-6-686-bigmem
# free
total used free shared buffers cached
Mem: 4139508 514148 3625360 0 210648 150388
-/+ buffers/cache: 153112 3986396
Swap: 4194296 0 4194296
Erlang version (from the stable Debian distribution):
# dpkg -l erlang\* | grep ii
ii erlang-base 11.b.2-4 Concurrent, real-time, distributed functiona
ii erlang-nox 11.b.2-4 Concurrent, real-time, distributed functiona
# erl
Erlang (BEAM) emulator version 5.5.2 [source] [async-threads:0] [kernel-poll:false]
Eshell V5.5.2 (abort with ^G)
RabbitMQ version:
# apt-cache show rabbitmq-server | grep Version
Version: 1.3.0-1
We modified the startup script to handle more than 1024 TCP connections:
--- rabbitmq-server.orig 2008-04-07 21:59:48.000000000 +0400
+++ /etc/init.d/rabbitmq-server 2008-04-04 18:05:25.000000000 +0400
@@ -24,8 +24,10 @@
set -e
cd /
+ulimit -n 65535
+
start_rabbitmq () {
set +e
su rabbitmq -s /bin/sh -c "$DAEMON start_all ${NODE_COUNT}" > /var/log/rabbitmq/startup.log 2> /var/log/rabbitmq/startup.err
case "$?" in
Further modifications were needed for Mnesia to work normally (the errors we hit are listed below):
ERROR REPORT==== 4-Apr-2008::18:29:14 ===
Error in process <0.20713.0> on node 'rabbit at altair' with exit value:
{{case_clause,{error,{system_limit,"Cannot create an ets table for the local transaction store",{system_limit,[{ets,new,[mnesia_trans_store,[bag,public]]},{mnesia_tm,doit_loop,1},{mnesia_sp,init_proc,4},{proc_lib...
and to increase overall throughput (raising the ETS table and port limits, the Erlang process limit, the async thread pool, and the TCP buffer sizes):
# diff -u rabbitmq-server.orig /usr/sbin/rabbitmq-server
--- rabbitmq-server.orig 2008-04-07 22:27:26.000000000 +0400
+++ /usr/sbin/rabbitmq-server 2008-04-07 22:28:40.000000000 +0400
@@ -28,7 +28,7 @@
[ "x" = "x$NODE_IP_ADDRESS" ] && NODE_IP_ADDRESS=0.0.0.0
[ "x" = "x$NODE_PORT" ] && NODE_PORT=5672
-ERL_ARGS="+K true +A30 -kernel inet_default_listen_options [{sndbuf,16384},{recbuf,4096}]"
+ERL_ARGS="-env ERL_MAX_ETS_TABLES 10240 -env ERL_MAX_PORTS 10240 +K true +A300 +P512000 -kernel inet_default_listen_options [{sndbuf,65535},{recbuf,65535},{reuseaddr,true}] -smp auto"
CLUSTER_CONFIG_FILE=/etc/default/rabbitmq_cluster.config
CONFIG_FILE=/etc/default/rabbitmq
Here is the system state after RabbitMQ startup:
top - 22:41:58 up 3 days, 5:10, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 116 total, 1 running, 115 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4139508k total, 572228k used, 3567280k free, 212784k buffers
Swap: 4194296k total, 0k used, 4194296k free, 188600k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31089 rabbitmq 15 0 2499m 15m 1620 S 0 0.4 0:00.29 beam.smp
30909 rabbitmq 15 0 1812 380 252 S 0 0.0 0:00.00 epmd
31085 rabbitmq 18 0 4176 1548 1080 S 0 0.0 0:00.01 rabbitmq-server
31410 rabbitmq 17 0 1808 452 360 S 0 0.0 0:00.00 inet_gethost
31411 rabbitmq 18 0 1852 580 468 S 0 0.0 0:00.00 inet_gethost
The second server (running the producer & consumer clients) is powered by
Solaris 10 07/. Versions of the installed software are listed below:
$ java -version
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Server VM (build 1.5.0_12-b04, mixed mode)
The client source code is attached to this message.
So, let's start 100 producers and
100 consumers. Each producer generates 10 msgs/sec
(1000 msgs/sec in total):
$ java -jar RabbitmqFastMessagingProducerTest.jar 100 > client.log
System snapshot:
top - 22:50:39 up 3 days, 5:19, 3 users, load average: 0.51, 0.11, 0.04
Tasks: 116 total, 1 running, 115 sleeping, 0 stopped, 0 zombie
Cpu(s): 30.1%us, 5.3%sy, 0.0%ni, 64.2%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 4139508k total, 590664k used, 3548844k free, 212844k buffers
Swap: 4194296k total, 0k used, 4194296k free, 188960k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31089 rabbitmq 15 0 2520m 33m 1620 S 281 0.8 0:45.94 beam.smp
30909 rabbitmq 15 0 1812 380 252 S 0 0.0 0:00.00 epmd
31085 rabbitmq 18 0 4176 1548 1080 S 0 0.0 0:00.01 rabbitmq-server
31410 rabbitmq 17 0 1808 452 360 S 0 0.0 0:00.00 inet_gethost
31411 rabbitmq 18 0 1852 580 468 S 0 0.0 0:00.00 inet_gethost
The client eats about 5-7% CPU.
The client log file is shown below:
> CLIENT_PRODUCER_THREAD_ID: 63 TIME: 1207594715030 MESSAGE: 1032 TIME: 1207594715030
> CLIENT_PRODUCER_THREAD_ID: 85 TIME: 1207594715030 MESSAGE: 1031 TIME: 1207594715030
> CLIENT_PRODUCER_THREAD_ID: 62 TIME: 1207594715030 MESSAGE: 1033 TIME: 1207594715030
> CLIENT_PRODUCER_THREAD_ID: 4 TIME: 1207594715030 MESSAGE: 1031 TIME: 1207594715030
> CLIENT_CONSUMER_THREAD_ID: 80 TIME: 1207594715051 MESSAGE: 1032 TIME: 1207594715030
> CLIENT_CONSUMER_THREAD_ID: 85 TIME: 1207594715051 MESSAGE: 1031 TIME: 1207594715030
> CLIENT_CONSUMER_THREAD_ID: 63 TIME: 1207594715052 MESSAGE: 1032 TIME: 1207594715030
OK, let's calculate the delay (receive timestamp minus send timestamp, i.e. field $4 minus field $8):
> $ tail -f client.log | awk '/CLIENT_CONSUMER_THREAD_ID/ {print "DELAY: " $4-$8 " " $0}'
> DELAY: 15 CLIENT_CONSUMER_THREAD_ID: 96 TIME: 1207594667285 MESSAGE: 596 TIME: 1207594667270
> DELAY: 15 CLIENT_CONSUMER_THREAD_ID: 76 TIME: 1207594667285 MESSAGE: 595 TIME: 1207594667270
> DELAY: 16 CLIENT_CONSUMER_THREAD_ID: 3 TIME: 1207594667286 MESSAGE: 596 TIME: 1207594667270
> DELAY: 2 CLIENT_CONSUMER_THREAD_ID: 49 TIME: 1207594667292 MESSAGE: 596 TIME: 1207594667290
> DELAY: 3 CLIENT_CONSUMER_THREAD_ID: 63 TIME: 1207594667293 MESSAGE: 596 TIME: 1207594667290
> DELAY: 3 CLIENT_CONSUMER_THREAD_ID: 0 TIME: 1207594667303 MESSAGE: 594 TIME: 1207594667300
> DELAY: 3 CLIENT_CONSUMER_THREAD_ID: 33 TIME: 1207594667303 MESSAGE: 596 TIME: 1207594667300
> DELAY: 43 CLIENT_CONSUMER_THREAD_ID: 12 TIME: 1207594667303 MESSAGE: 596 TIME: 1207594667260
> DELAY: 44 CLIENT_CONSUMER_THREAD_ID: 65 TIME: 1207594667304 MESSAGE: 593 TIME: 1207594667260
> DELAY: 4 CLIENT_CONSUMER_THREAD_ID: 42 TIME: 1207594667304 MESSAGE: 595 TIME: 1207594667300
> DELAY: 5 CLIENT_CONSUMER_THREAD_ID: 45 TIME: 1207594667305 MESSAGE: 596 TIME: 1207594667300
> DELAY: 45 CLIENT_CONSUMER_THREAD_ID: 29 TIME: 1207594667305 MESSAGE: 594 TIME: 1207594667260
> DELAY: 2 CLIENT_CONSUMER_THREAD_ID: 39 TIME: 1207594667312 MESSAGE: 596 TIME: 1207594667310
> DELAY: 6 CLIENT_CONSUMER_THREAD_ID: 54 TIME: 1207594667336 MESSAGE: 597 TIME: 1207594667330
> DELAY: 7 CLIENT_CONSUMER_THREAD_ID: 85 TIME: 1207594667337 MESSAGE: 595 TIME: 1207594667330
> DELAY: 8 CLIENT_CONSUMER_THREAD_ID: 22 TIME: 1207594667338 MESSAGE: 597 TIME: 1207594667330
> DELAY: 8 CLIENT_CONSUMER_THREAD_ID: 79 TIME: 1207594667338 MESSAGE: 596 TIME: 1207594667330
>
So, the latency is between 2-50 ms. That's an acceptable result.
At the next step we increased the thread count:
> $ java -jar RabbitmqFastMessagingProducerTest.jar 500 > client.log
> $ tail -f client.log | awk '/CLIENT_CONSUMER_THREAD_ID/ {print "DELAY: " $4-$8 " " $0}'
> DELAY: 23338 CLIENT_CONSUMER_THREAD_ID: 367 TIME: 1207595791068 MESSAGE: 9 TIME: 1207595767730
> DELAY: 23210 CLIENT_CONSUMER_THREAD_ID: 417 TIME: 1207595791070 MESSAGE: 9 TIME: 1207595767860
> DELAY: 23326 CLIENT_CONSUMER_THREAD_ID: 464 TIME: 1207595791117 MESSAGE: 9 TIME: 1207595767791
> DELAY: 23147 CLIENT_CONSUMER_THREAD_ID: 481 TIME: 1207595791117 MESSAGE: 9 TIME: 1207595767970
> DELAY: 23327 CLIENT_CONSUMER_THREAD_ID: 386 TIME: 1207595791117 MESSAGE: 9 TIME: 1207595767790
> DELAY: 23197 CLIENT_CONSUMER_THREAD_ID: 469 TIME: 1207595791117 MESSAGE: 9 TIME: 1207595767920
> DELAY: 23984 CLIENT_CONSUMER_THREAD_ID: 322 TIME: 1207595791155 MESSAGE: 11 TIME: 1207595767171
> DELAY: 23307 CLIENT_CONSUMER_THREAD_ID: 350 TIME: 1207595791157 MESSAGE: 9 TIME: 1207595767850
> DELAY: 23158 CLIENT_CONSUMER_THREAD_ID: 453 TIME: 1207595791158 MESSAGE: 9 TIME: 1207595768000
> DELAY: 23258 CLIENT_CONSUMER_THREAD_ID: 306 TIME: 1207595791158 MESSAGE: 9 TIME: 1207595767900
> DELAY: 23283 CLIENT_CONSUMER_THREAD_ID: 356 TIME: 1207595791213 MESSAGE: 9 TIME: 1207595767930
> DELAY: 23294 CLIENT_CONSUMER_THREAD_ID: 256 TIME: 1207595791214 MESSAGE: 9 TIME: 1207595767920
> DELAY: 23144 CLIENT_CONSUMER_THREAD_ID: 456 TIME: 1207595791214 MESSAGE: 9 TIME: 1207595768070
> DELAY: 24063 CLIENT_CONSUMER_THREAD_ID: 112 TIME: 1207595791214 MESSAGE: 11 TIME: 1207595767151
> DELAY: 23144 CLIENT_CONSUMER_THREAD_ID: 484 TIME: 1207595791214 MESSAGE: 9 TIME: 1207595768070
Latency increased to 20-25 seconds (not acceptable).
System state snapshot:
top - 23:22:10 up 3 days, 5:51, 3 users, load average: 4.91, 2.89, 1.56
Tasks: 116 total, 1 running, 115 sleeping, 0 stopped, 0 zombie
Cpu(s): 45.9%us, 5.9%sy, 0.0%ni, 47.3%id, 0.0%wa, 0.1%hi, 0.8%si, 0.0%st
Mem: 4139508k total, 1155792k used, 2983716k free, 213068k buffers
Swap: 4194296k total, 0k used, 4194296k free, 196008k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31566 rabbitmq 15 0 3062m 572m 1620 S 412 14.2 12:14.58 beam.smp
30909 rabbitmq 15 0 1812 380 252 S 0 0.0 0:00.00 epmd
31562 rabbitmq 18 0 4172 1548 1080 S 0 0.0 0:00.00 rabbitmq-server
31887 rabbitmq 18 0 1808 452 360 S 0 0.0 0:00.00 inet_gethost
31888 rabbitmq 19 0 1852 576 468 S 0 0.0 0:00.00 inet_gethost
Let's try increasing the thread count to 1000:
> $ java -jar RabbitmqFastMessagingProducerTest.jar 1000 > client.log
> $ tail -f client.log | awk '/CLIENT_CONSUMER_THREAD_ID/ {print "DELAY: " $4-$8 " " $0}'
> DELAY: 78805 CLIENT_CONSUMER_THREAD_ID: 562 TIME: 1207597161505 MESSAGE: 577 TIME: 1207597082700
> DELAY: 78843 CLIENT_CONSUMER_THREAD_ID: 602 TIME: 1207597161593 MESSAGE: 577 TIME: 1207597082750
> DELAY: 31985 CLIENT_CONSUMER_THREAD_ID: 483 TIME: 1207597161687 MESSAGE: 84 TIME: 1207597129702
> DELAY: 59055 CLIENT_CONSUMER_THREAD_ID: 694 TIME: 1207597161896 MESSAGE: 235 TIME: 1207597102841
> DELAY: 79510 CLIENT_CONSUMER_THREAD_ID: 562 TIME: 1207597162311 MESSAGE: 578 TIME: 1207597082801
> DELAY: 79512 CLIENT_CONSUMER_THREAD_ID: 602 TIME: 1207597162372 MESSAGE: 578 TIME: 1207597082860
> DELAY: 32710 CLIENT_CONSUMER_THREAD_ID: 483 TIME: 1207597162521 MESSAGE: 85 TIME: 1207597129811
> DELAY: 59680 CLIENT_CONSUMER_THREAD_ID: 694 TIME: 1207597162630 MESSAGE: 236 TIME: 1207597102950
> DELAY: 79967 CLIENT_CONSUMER_THREAD_ID: 562 TIME: 1207597162878 MESSAGE: 579 TIME: 1207597082911
> DELAY: 79953 CLIENT_CONSUMER_THREAD_ID: 602 TIME: 1207597162923 MESSAGE: 579 TIME: 1207597082970
> DELAY: 33231 CLIENT_CONSUMER_THREAD_ID: 483 TIME: 1207597163152 MESSAGE: 86 TIME: 1207597129921
Latency became completely unacceptable, and errors appeared in the client log:
> Apr 7, 2008 11:39:54 PM rabbitmqfastmessagingproducertest.FastMessagingProducer run
> SEVERE: null
> java.net.SocketException: Socket closed
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at com.rabbitmq.client.impl.SocketFrameHandler.writeFrame(SocketFrameHandler.java:156)
> at com.rabbitmq.client.impl.AMQConnection.writeFrame(AMQConnection.java:297)
> at com.rabbitmq.client.impl.AMQCommand.transmit(AMQCommand.java:175)
> at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:369)
> at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:350)
> at rabbitmqfastmessagingproducertest.FastMessagingProducer.run(FastMessagingProducer.java:83)
>
There are also errors in the RabbitMQ log, indicating that the mnesia_tm process cannot keep up with its message queue:
> =WARNING REPORT==== 7-Apr-2008::23:34:04 ===
> Mnesia(rabbit at altair): ** WARNING ** Mnesia is overloaded: {mnesia_tm, message_queue_len, [732,626]}
CPU and RAM on the AMQP broker server are only about half utilized.
Is there any way to increase overall system throughput to 10K msgs/sec?
Any ideas?
Thanks in advance
--
Alexey Slynko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rabbitmqfastmessagingproducertest.tar.gz
Type: application/x-compressed-tar
Size: 1951 bytes
Desc: not available
Url : http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20080409/366f4891/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RabbitmqFastMessagingProducerTest.jar
Type: application/x-java-archive
Size: 12199 bytes
Desc: not available
Url : http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20080409/366f4891/attachment-0001.bin