[rabbitmq-discuss] Java Client Changes
rob at rabbitmq.com
Thu Oct 14 23:08:30 BST 2010
Thanks for taking the time to look at this and write up your thoughts. Your input is most appreciated.
On 14 Oct 2010, at 18:02, Holger Hoffstaette wrote:
> Hi Rob,
> thanks for taking the time to look into this. Sorry that I haven't
> commented earlier but I moved and had no internet/mail/list access etc.
> On Tue, 12 Oct 2010 14:39:27 +0100, Rob Harrop wrote:
>> * Dispatch to consumers now happens on a separate thread from the
>> Connection thread
> This is good. :)
>> * Each Channel has its own dispatch thread
> This is unfortunately not so good and pretty much what I was hoping to
> avoid. It can lead to excessive context switching (the #1 enemy), and with
> a large number of Channels to really large memory use.
> I've looked at the code and understand that ChannelDispatcher will
> not deadlock when a consumer calls back with Channel ops, since it runs in
> the same thread. However, this is "recursive" in the sense that call
> chains that are too deep could blow the stack -
I might be misunderstanding what you mean here, but the ConsumerDispatcher is not self-recursive. Importantly, no two dispatch calls are ever nested on the same stack. Consider what might happen if you initiate basicConsume on queue1 and the consumer gets back a handleConsumeOk. In that callback, the Consumer sends a message to queue1 and gets a handleDelivery callback. The stack is the same depth in both callbacks. I've posted an example here: http://pastebin.com/VVZaTb5T to demonstrate this.
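To illustrate the point about stack depth, here is a minimal sketch. It is not the client's ConsumerDispatcher and not the pastebin example; the names FlatDispatchSketch, Dispatcher, enqueue, drain and depths are all made up for illustration. It assumes only that a callback which triggers further work enqueues it rather than calling it directly, so each callback returns before the next one starts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class FlatDispatchSketch {

    /** Hypothetical dispatcher: runs queued callbacks one at a time on the calling thread. */
    static final class Dispatcher {
        private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();

        void enqueue(Runnable callback) {
            pending.add(callback);
        }

        /** Drains the queue until it is empty. */
        void drain() {
            Runnable next;
            while ((next = pending.poll()) != null) {
                next.run(); // returns before the next callback is started
            }
        }
    }

    /** Chains `count` callbacks, each enqueuing the next, and records the stack depth each one sees. */
    static List<Integer> depths(int count) {
        Dispatcher dispatcher = new Dispatcher();
        List<Integer> seen = new ArrayList<>();
        chain(dispatcher, seen, count);
        dispatcher.drain();
        return seen;
    }

    private static void chain(Dispatcher dispatcher, List<Integer> seen, int remaining) {
        if (remaining == 0) {
            return;
        }
        dispatcher.enqueue(() -> {
            seen.add(Thread.currentThread().getStackTrace().length);
            chain(dispatcher, seen, remaining - 1); // enqueues only; does not run the next callback
        });
    }

    public static void main(String[] args) {
        System.out.println(depths(5));
    }
}
```

Every callback in the chain is invoked from the same frame in drain(), so the recorded depths should all be equal however long the chain is.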
> unlikely though.
> I know that this is unavoidable since Java does not have real
> continuations, but if that's the acceptable tradeoff then we might as well
> just pull a Thread from an arbitrary pool, note that in the dispatcher and
> do any channel ops. Any callbacks from a consumer on that thread should
> now be able to re-enter the dispatcher. When all is done just clear the
> current thread from the dispatcher and pop out.
> That way we can have a freely configurable ExecutorService and less
> threads with no deadlock on callback.
I've managed to convince myself that, given the ExecutorService abstraction, it is not really feasible to allow an arbitrary ExecutorService to be injected. I wrote a bit about this in a previous post but, looking back, it wasn't all that clear or detailed. Hopefully you can forgive the detail that is about to spew forth :)
Most importantly we should agree on some constraints:
1. Dispatches to Consumers should happen in the order they arrive at the Connection
2. Completion of dispatch to a given Consumer should happen-before the start of subsequent dispatch to that Consumer (this implies an as-if-serial execution for dispatches to a Consumer)
It's also important to remember the limitations of the ExecutorService abstraction. In the general case, the size of the thread pool is neither fixed nor known to the outside world. As such, the usage model tends to be to submit one Runnable per discrete task, rather than one Runnable per thread (because we don't know how many threads there are).
This maps to submitting one task per dispatch to the ExecutorService (ES). There are two problems with this. Firstly, there is the matter of enforcing constraint 2. We must lock each Consumer so that we can only dispatch one callback to it at a time. If the ES has n tasks for Consumer a at the front of its work queue, and n is greater than the number of threads, then every thread ends up either running or waiting on a callback for a, so the ES will stall and other, potentially executable, callbacks will sit in the queue.
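The stall can be reproduced in a few lines. This is only a sketch under assumptions: a fixed pool of two threads, a ReentrantLock standing in for the per-Consumer lock, and a latch standing in for a slow callback. The names StallSketch, consumerALock and bStalledBehindA are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

class StallSketch {

    /**
     * Returns true if b's callback was still queued while a's callbacks tied up
     * both workers, and then ran once a's callbacks were released.
     */
    static boolean bStalledBehindA() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ReentrantLock consumerALock = new ReentrantLock(); // one callback at a time for a
        CountDownLatch releaseA = new CountDownLatch(1);
        AtomicBoolean bRan = new AtomicBoolean(false);

        Runnable callbackForA = () -> {
            consumerALock.lock();
            try {
                releaseA.await(); // stands in for a slow consumer callback
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                consumerALock.unlock();
            }
        };

        pool.execute(callbackForA);         // one worker takes a's lock
        pool.execute(callbackForA);         // the other worker blocks waiting for it
        pool.execute(() -> bRan.set(true)); // b is runnable, but no worker is free

        boolean stalled = false;
        try {
            // Wait until the second worker is parked on a's lock.
            while (!consumerALock.hasQueuedThreads()) {
                Thread.sleep(1);
            }
            stalled = !bRan.get();
            releaseA.countDown();
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stalled && bRan.get();
    }

    public static void main(String[] args) {
        System.out.println("b stalled behind a: " + bStalledBehindA());
    }
}
```

The callback for b needs no lock at all, yet it cannot start until one of a's callbacks finishes, because both workers are committed to a.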
Secondly, there is the issue of enforcing ordering. Consider 2 callbacks c1 and c2, both for the same Consumer. The ES queue is [c1,c2]. The ES has 2 threads. Both threads can execute, one picking c1 and the other c2. The ExecutorService alone _does not_ provide guarantees as to the execution order here. To do this, we could give each callback a sequence number and ensure that only the next callback in sequence can proceed. This is not all that simple. A typical wait/notify solution here can churn for some time when there are many callbacks all vying to execute.
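A rough sketch of the sequence-number idea, using plain wait/notifyAll. The names SequenceGateSketch, SequenceGate, runInTurn and runOrdered are hypothetical, and the sketch assumes the pool hands out tasks in submission order (as a fixed thread pool backed by a FIFO queue does), so the lowest outstanding sequence number is always held by some worker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SequenceGateSketch {

    /** Hypothetical gate for one Consumer: lets callbacks through strictly in sequence order. */
    static final class SequenceGate {
        private long next = 0;

        synchronized void runInTurn(long sequence, Runnable callback) throws InterruptedException {
            while (sequence != next) {
                wait(); // every waiter wakes on each notifyAll; all but one go back to waiting
            }
            try {
                callback.run();
            } finally {
                next++;
                notifyAll();
            }
        }
    }

    /** Submits `count` numbered callbacks to a pool of `threads` workers; returns the order they ran in. */
    static List<Integer> runOrdered(int count, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        SequenceGate gate = new SequenceGate();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < count; i++) {
            final int sequence = i;
            pool.execute(() -> {
                try {
                    gate.runInTurn(sequence, () -> order.add(sequence));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runOrdered(20, 4));
    }
}
```

The ordering holds, but every callback that is waiting for its turn occupies a pool thread while it waits, and each notifyAll wakes all of them so that one can proceed. That is the churn described above, and it does nothing for the stall.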
The real solution to ordering is to actually schedule a proxy task in the ES for each dispatcher, and maintain a queue of callbacks. When a proxy executes, it reserves the dispatcher, executes the head of the queue and unreserves the dispatcher. This works for ordering guarantees but doesn't prevent stalls. To prevent stalls, we need to introduce some kind of context stealing, and this is quite a meaty thing to do correctly.
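A minimal sketch of the proxy-task idea, along the lines of the SerialExecutor example in the java.util.concurrent.Executor documentation. It is a variation on the description above rather than a faithful implementation: the dispatcher is marked reserved when the proxy is scheduled, not when it starts running. The names ProxyDispatchSketch, SerialDispatcher, dispatch, runHead and runThrough are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ProxyDispatchSketch {

    /** Hypothetical dispatcher: queues callbacks and runs them one at a time on a shared pool. */
    static final class SerialDispatcher {
        private final Executor pool;
        private final Queue<Runnable> callbacks = new ArrayDeque<>();
        private boolean reserved = false; // true while a proxy task is scheduled or running

        SerialDispatcher(Executor pool) {
            this.pool = pool;
        }

        synchronized void dispatch(Runnable callback) {
            callbacks.add(callback);
            if (!reserved) {
                reserved = true;
                pool.execute(this::runHead);
            }
        }

        /** The proxy task: run the head of the queue, then reschedule or unreserve. */
        private void runHead() {
            Runnable head;
            synchronized (this) {
                head = callbacks.poll(); // non-null: a proxy is only scheduled for a non-empty queue
            }
            try {
                head.run();
            } finally {
                synchronized (this) {
                    if (callbacks.isEmpty()) {
                        reserved = false;
                    } else {
                        pool.execute(this::runHead); // at most one proxy per dispatcher in the pool
                    }
                }
            }
        }
    }

    /** Dispatches `count` numbered callbacks through one dispatcher; returns the order they ran in. */
    static List<Integer> runThrough(int count, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        SerialDispatcher dispatcher = new SerialDispatcher(pool);
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(count);
        for (int i = 0; i < count; i++) {
            final int n = i;
            dispatcher.dispatch(() -> {
                order.add(n);
                done.countDown();
            });
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runThrough(20, 4));
    }
}
```

Because each dispatcher has at most one proxy in the pool at a time, the callbacks for a dispatcher run in queue order and never overlap, without any thread blocking while it waits for its turn. A slow callback still ties up a pool thread for as long as it runs, so this does not by itself address the stall.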
None of this is insurmountable, but I'd love to see a real scenario where it would make a difference worth the effort and pain to implement.
I hope this makes sense. If you think I'm overthinking this, or that there is a glaringly simple solution that I'm missing, please do say. I'd love to be able to make the ExecutorService configurable in a safe and predictable fashion.
> I should probably add that I just made that up and haven't tested it yet,
> but maybe you find the idea interesting. :-)