[rabbitmq-discuss] Channel crashes after basic.cancel_ok.

Edwin Fine rabbitmq-discuss_efine at usa.net
Mon May 12 20:56:59 BST 2008


I was monitoring my running system and, unless I am mistaken, it looks as if
amqp_network_connection:reader_loop/4 makes non-tail recursive calls, which
is making it eat stack and heap like there's no tomorrow. There's a call to
gen_tcp:close(Sock) at the end of the function, so every recursive call to
reader_loop still has work pending after it returns and cannot be a tail
call. I suggest you move the close to the last line of start_reader, as shown:

start_reader(Sock, FramingPid) ->
    process_flag(trap_exit, true),
    put({channel, 0},{chpid, FramingPid}),
    {ok, Ref} = prim_inet:async_recv(Sock, 7, -1),
    reader_loop(Sock, undefined, undefined, undefined),
    gen_tcp:close(Sock). %% Moved here from the end of reader_loop

reader_loop(Sock, Type, Channel, Length) ->
    receive
        {inet_async, Sock, _, {ok, <<Payload:Length/binary,?FRAME_END>>} } ->
            case handle_frame(Type, Channel, Payload) of
                closed_ok ->
                _ ->
                    {ok, Ref} = prim_inet:async_recv(Sock, 7, -1),
                    reader_loop(Sock, undefined, undefined, undefined)
            end;
        {inet_async, Sock, _, {ok, <<_Type:8,_Channel:16,PayloadSize:32>>}} ->
            {ok, Ref} = prim_inet:async_recv(Sock, PayloadSize + 1, -1),
            reader_loop(Sock, _Type, _Channel, PayloadSize);
        {inet_async, Sock, Ref, {error, Reason}} ->
            io:format("Have a look into this one: ~p~n",[Reason]);
        {heartbeat, Heartbeat} ->
            rabbit_heartbeat:start_heartbeat(Sock, Heartbeat),
            reader_loop(Sock, Type, Channel, Length);
        {ChannelPid, ChannelNumber} ->
            start_framing_channel(ChannelPid, ChannelNumber),
            reader_loop(Sock, Type, Channel, Length);
        timeout ->
            io:format("Reader (~p) received timeout from heartbeat, exiting
        {'EXIT', Pid, Reason} ->
            [H|T] = get_keys({chpid,Pid}),
            reader_loop(Sock, Type, Channel, Length);
        Other ->
            io:format("Other ~p~n",[Other])
    end,
    gen_tcp:close(Sock). %% Makes the recursive calls above non-tail; this is the line to move
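
To illustrate the difference outside the client, here is a minimal standalone
sketch. The module tail_demo and all of its function names are made up for
this example and are not part of the RabbitMQ client; cleanup/0 is only a
stand-in for gen_tcp:close(Sock). It assumes that
erlang:process_info(self(), stack_size) reports the current stack size in
words, and it records that size at the deepest point of each loop.

```erlang
-module(tail_demo).
-export([run/1]).

%% Hypothetical demo module; none of these names exist in the RabbitMQ client.

%% Stack size of the calling process, in words.
stack_words() ->
    {stack_size, Words} = erlang:process_info(self(), stack_size),
    Words.

%% Stand-in for gen_tcp:close(Sock).
cleanup() ->
    erlang:yield(),
    ok.

%% Non-tail version: cleanup() runs after the recursive call returns, so
%% each iteration has to keep a stack frame alive until the loop ends.
bad_loop(0) ->
    stack_words();
bad_loop(N) ->
    Depth = bad_loop(N - 1),
    cleanup(),
    Depth.

%% Tail version: the recursive call is the last thing the function does,
%% so the frame is reused. The caller performs the cleanup once.
good_loop(0) ->
    stack_words();
good_loop(N) ->
    good_loop(N - 1).

%% Returns {StackAtBottomOfBadLoop, StackAtBottomOfGoodLoop}.
run(N) ->
    Bad = bad_loop(N),
    Good = good_loop(N),
    cleanup(),
    {Bad, Good}.
```

Calling tail_demo:run(10000) from the shell should return a pair whose first
element is much larger than the second, since only the non-tail loop grows
the stack with the number of iterations.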

On Mon, May 12, 2008 at 2:07 PM, Ben Hood <0x6e6562 at gmail.com> wrote:

> Ed,
> On 10 May 2008, at 01:02, Edwin Fine wrote:
> > Thanks, Ben, I will take a look and give you some feedback.
> > In the meantime, I have done the following:
> >    - Changed my consumer code (I use the term "consumer" loosely as
> >    "anything that eats the output of a producer") to use basic.get instead of
> >    basic.consume. Actually, it's set up so that I can select basic.get or
> >    basic.consume behavior at run-time. I didn't want to throw away working
> >    basic.consume code :)
> >    - Changed the process that creates consumers so that it now creates
> >    one channel per consumer. Previously, there was one channel only for all
> >    consumers. One-channel-per-consumer was the only way I could get the code to
> >    work with the network client; in the one-channel scenario I was getting back
> >    responses to messages destined for different consumers. I assume that with
> >    your changes I will be able to again use one channel for all consumers.
> >    - I tested with 50 queues (each with its own consumer and channel)
> >    and it seemed reasonably performant, even with the get. I need to try a
> >    full-blast test soon.
> I have now committed fix 2 of 3 to the mtn repo which addresses the issue
> of not being able to subscribe concurrently. So the 2 issues you mention
> here should be addressed.
> The outstanding issue is to close the writer down properly in the network
> case.
> HTH,
> Ben