[rabbitmq-discuss] RabbitMQ - big delays when establishing Kombu connection

eugene eugene.doktorov at gmail.com
Thu Sep 13 15:04:16 BST 2012


Hi guys,

We're experiencing big delays (2-25 secs) when using celery's send_task
method

Our environment:
 - client: celery 3.0.7
 - server: celery 3.0.7, rabbitmq 2.8.6
 - typical ping 
	icmp_seq=1 ttl=64 time=0.390 ms
	icmp_seq=2 ttl=64 time=0.375 ms
	icmp_seq=3 ttl=64 time=54.6 ms
	icmp_seq=4 ttl=64 time=0.384 ms
  - no memory or CPU problems are observed on client or server nodes
  - delay occurs with or without using rabbitmq cluster

I investigated problem a bit:
 - enabled Kombu output on client
 - ran tcpdump on client & server
 - was not able to make Kombu log on the server side. Tried
KOMBU_LOG_DEBUG=1 celeryd -l debug

Looking at logs I noticed that:
 - delay ALWAYS happens between the same amqp messages while establishing
Kombu connection.(TCP connection seams to be established fast)
 - server side tcpdump reflects the same delay
 - publishing, processing & receiving results are fast.

Below is client output from Kombu & tcpdump. I redirected them into the same
file:

Where to go from this point? I'd really appreciate some advice.


Thank you!

Eugene



Logs:
 - prodback1.elevate.56418 - celery client
 - 10.0.1.60.amqp - celery server (using rabbitmq) 
 - delay is between 16:30:03.623661 and 16:30:06.848230


<<LOG BEGIN>>

[Kombu connection:0x2c75350] acquired
*[Kombu connection:0x2c75350] establishing connection...*
16:30:03.623330 IP (tos 0x0, ttl 64, id 56366, offset 0, flags [DF], proto
TCP (6), length 60)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [S], cksum 0x167e
(incorrect -> 0x6113), seq 2858003468, win 5840, options [mss 1460,sackOK,TS
val 12218402 ecr 0,nop,wscale 7], length 0
E..<.. at .@.H>
...
..<.b.(.Y...........~.........
..p"........

16:30:03.623661 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
    10.0.1.60.amqp > prodback1.elevate.56418: Flags [S.], cksum 0xcd0b
(correct), seq 1781605121, ack 2858003469, win 14480, options [mss
1460,sackOK,TS val 3557232892 ecr 12218402,nop,wscale 7], length 0
E..<.. at .@.$m
..<
....(.bj1#..Y.^M..8............
......p"....

16:30:03.623686 IP (tos 0x0, ttl 64, id 56367, offset 0, flags [DF], proto
TCP (6), length 52)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [.], cksum 0x1676
(incorrect -> 0x343a), seq 1, ack 1, win 46, options [nop,nop,TS val
12218402 ecr 3557232892], length 0
E..4./@. at .HE
...
..<.b.(.Y.^Mj1#......v.....
..p"....

16:30:03.624252 IP (tos 0x0, ttl 64, id 56368, offset 0, flags [DF], proto
TCP (6), length 60)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [P.], cksum 0x167e
(incorrect -> 0x988b), seq 1:9, ack 1, win 46, options [nop,nop,TS val
12218402 ecr 3557232892], length 8
E..<.0 at .@.H<
...
..<.b.(.Y.^Mj1#......~.....
..p"....AMQP....

*16:30:03.624547* IP (tos 0x0, ttl 64, id 13311, offset 0, flags [DF], proto
TCP (6), length 52)
    10.0.1.60.amqp > prodback1.elevate.56418: Flags [.], cksum 0x33ed
(correct), seq 1, ack 9, win 114, options [nop,nop,TS val 3557232893 ecr
12218402], length 0
E..43. at .@..u
..<
....(.bj1#..Y.....r3......
......p"

*<<my comments:
Here is the delay
always happens between these 2 amqp messages. Never in other places
In this case: 3.2 secs. Seen 25 secs.
Server side tcpdump has the same delay>>*

*16:30:06.848230* IP (tos 0x0, ttl 64, id 13312, offset 0, flags [DF], proto
TCP (6), length 299)
    10.0.1.60.amqp > prodback1.elevate.56418: Flags [P.], cksum 0x351f
(correct), seq 1:248, ack 9, win 114, options [nop,nop,TS val 3557236117 ecr
12218402], length 247
E..+4. at .@..}
..<
....(.bj1#..Y.....r5......
......p"........
.
.......capabilitiesF....	copyrightS...$Copyright (C) 2007-2012 VMware,
Inc..informationS...5Licensed under the MPL.  See
http://www.rabbitmq.com/.platformS...
Erlang/OTP.productS....RabbitMQ.versionS....2.8.6....PLAIN
AMQPLAIN....en_US.

16:30:06.848351 IP (tos 0x0, ttl 64, id 56369, offset 0, flags [DF], proto
TCP (6), length 52)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [.], cksum 0x1676
(incorrect -> 0x19ee), seq 9, ack 248, win 54, options [nop,nop,TS val
12221646 ecr 3557236117], length 0
E..4.1 at .@.HC
...
..<.b.(.Y..j1#....6.v.....
..|.....

16:30:06.849300 IP (tos 0x0, ttl 64, id 56370, offset 0, flags [DF], proto
TCP (6), length 175)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [P.], cksum 0x16f1
(incorrect -> 0xbb31), seq 9:132, ack 248, win 54, options [nop,nop,TS val
12221647 ecr 3557236117], length 123
E....2 at .@.G.
...
..<.b.(.Y..j1#....6.......
..|...........s.
.....5.libraryS....Python
amqplib.library_versionS....1.0.2.AMQPLAIN...#.LOGINS....guest.PASSWORDS....guest.en_US.

16:30:06.849577 IP (tos 0x0, ttl 64, id 13313, offset 0, flags [DF], proto
TCP (6), length 52)
    10.0.1.60.amqp > prodback1.elevate.56418: Flags [.], cksum 0x1935
(correct), seq 248, ack 132, win 114, options [nop,nop,TS val 3557236118 ecr
12221647], length 0
E..44. at .@..s
..<
....(.bj1#..Y.....r.5.....
......|.

16:30:06.849696 IP (tos 0x0, ttl 64, id 13314, offset 0, flags [DF], proto
TCP (6), length 72)
    10.0.1.60.amqp > prodback1.elevate.56418: Flags [P.], cksum 0xe14a
(correct), seq 248:268, ack 132, win 114, options [nop,nop,TS val 3557236118
ecr 12221647], length 20
E..H4. at .@..^
..<
....(.bj1#..Y.....r.J.....
......|.........
...........

16:30:06.849864 IP (tos 0x0, ttl 64, id 56371, offset 0, flags [DF], proto
TCP (6), length 72)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [P.], cksum 0x168a
(incorrect -> 0xe072), seq 132:152, ack 268, win 54, options [nop,nop,TS val
12221647 ecr 3557236118], length 20
E..H.3 at .@.H-
...
..<.b.(.Y..j1$^M...6.......
..|.............
...........

16:30:06.849942 IP (tos 0x0, ttl 64, id 56372, offset 0, flags [DF], proto 
TCP (6), length 68)
    prodback1.elevate.56418 > 10.0.1.60.amqp: Flags [P.], cksum 0x1686
(incorrect -> 0xae61), seq 152:168, ack 268, win 54, options [nop,nop,TS val
12221647 ecr 3557236118], length 16
E..D.4 at .@.H0
...
..<.b.(.Y..j1$^M...6.......
..|.............
.(./...

*[Kombu connection:0x2c75350] connection established:
<kombu.transport.amqplib.Connection object at 0x2d20a50>*
[Kombu connection:0x2c75350] create channel
[Kombu channel:1] exchange_declare(nowait=False, exchange='fetcher',
durable=True, arguments=None, type='direct', auto_delete=False)
[Kombu channel:1] queue_declare(passive=False, nowait=False,
exclusive=False, durable=True, queue='fetcher', arguments=None,
auto_delete=False)
[Kombu channel:1] queue_bind(queue='fetcher', arguments=None, nowait=False,
routing_key='fetcher_task', exchange='fetcher')
[Kombu channel:1]
prepare_message("(dp1\nS'retries'\np2\nI0\nsS'task'\np3\nS'fetcher.ping'\np4\nsS'kwargs'\np5\n(dp6\nsS'eta'\np7\nNsS'args'\np8\n(lp9\nsS'id'\np10\nS'610ba4f3-4fa4-46c4-a07e-2c71909a8c12'\np11\nsS'expires'\np12\nNsS'utc'\np13\nI00\ns.",
priority=None, headers={}, properties={'delivery_mode': 2},
content_type='application/x-python-serialize', content_encoding='binary')
[Kombu channel:1] basic_publish(<amqplib.client_0_8.basic_message.Message
object at 0x2d20f90>, mandatory=None, routing_key='fetcher_task',
immediate=None, exchange='fetcher')

<<LOG END>>

no delays from this point




--
View this message in context: http://rabbitmq.1065348.n5.nabble.com/RabbitMQ-big-delays-when-establishing-Kombu-connection-tp21989.html
Sent from the RabbitMQ mailing list archive at Nabble.com.


More information about the rabbitmq-discuss mailing list