[rabbitmq-discuss] MQTT SSL handshake failures causes server lockup
Stuart King
stuart.king at me.com
Mon Apr 14 18:00:19 BST 2014
Hi,
I’m running RabbitMQ with the MQTT adaptor enabled along with SSL. I’ve discovered that if a small number of connections fail the SSL handshake (e.g. the client rejects the server’s certificate because it doesn’t recognize the certificate authority), other SSL connections cannot be established to the server for a short period of time. If a ‘bad client’ (i.e. one that fails the SSL handshake) keeps trying to reconnect, it effectively locks up the server.
A simple way I found to demonstrate this is to create my own certificate authority, certificates, keys, etc. for RabbitMQ as per the instructions at https://www.rabbitmq.com/ssl.html, and to add the relevant SSL options to the RabbitMQ config file, but with the ssl_listeners element placed in the rabbitmq_mqtt tuple. Then implement a basic MQTT client in Java using the Paho client libraries. When the client is run, an exception is thrown, as expected, since the certificate isn’t trusted. If this client is put into a basic load test, where a new client is created and attempts to connect to the server every 2 seconds, it prevents other clients from connecting. This can be observed by simply using the "openssl s_client" command, which just hangs while the Java clients are trying to connect.
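For anyone wanting to reproduce the failing handshake without pulling in Paho, the following is a minimal sketch, not the Paho-based client described above. It only opens a TLS connection with the JVM's default trust store and never speaks MQTT. It assumes the privately created CA has not been added to that trust store, so the handshake is expected to be rejected on the client side. The class name BadTlsClient, the Outcome enum, the 5 second timeout, the count of 30 attempts and the default port 8883 are all hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical stand-in for the "bad client": TLS only, no MQTT, no Paho.
final class BadTlsClient {

    enum Outcome { HANDSHAKE_OK, TLS_REJECTED, NETWORK_ERROR }

    // Makes one connection attempt using the JVM's default trust store.
    static Outcome attempt(String host, int port, int timeoutMs) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            // Throws an SSLException subtype if the server certificate is not trusted.
            socket.startHandshake();
            return Outcome.HANDSHAKE_OK;
        } catch (SSLException e) {
            // TLS-level failure, e.g. an untrusted certificate authority.
            return Outcome.TLS_REJECTED;
        } catch (IOException e) {
            // Connection refused, reset, timed out, and similar.
            return Outcome.NETWORK_ERROR;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        // Assumption: the MQTT SSL listener is on 8883; pass the real port otherwise.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8883;
        // A new connection every 2 seconds, as in the load test described above.
        for (int i = 1; i <= 30; i++) {
            System.out.println("attempt " + i + ": " + attempt(host, port, 5000));
            Thread.sleep(2000);
        }
    }
}
```

Running it as `java BadTlsClient.java <host> <port>` while trying "openssl s_client" from another terminal should show whether the hang appears with a TLS-only client as well. That outcome is not something this sketch guarantees.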
A sample from the RabbitMQ log file when this occurs is:
=ERROR REPORT==== 10-Apr-2014::21:49:33 ===
SSL: certify: ssl_connection.erl:1724:Fatal error: certificate unknown
=ERROR REPORT==== 10-Apr-2014::21:49:38 ===
** Generic server <0.689.0> terminating
** Last message in was {inet_async,#Port<0.15200>,48667,{ok,#Port<0.16417>}}
** When Server state == {state,
{rabbit_mqtt_sup,start_ssl_client,
[[{cacertfile,"/home/sking/ssl/cacert.pem"},
{certfile,"/home/sking/ssl/cert.pem"},
{keyfile,"/home/sking/ssl/key.pem"},
{verify,verify_none},
{fail_if_no_peer_cert,false}]]},
#Port<0.15200>,48667}
** Reason for termination ==
** {timeout,{gen_server2,call,
[<0.718.0>,
{go,#Port<0.16417>,
#Fun<rabbit_networking.1.24135120>}]}}
=ERROR REPORT==== 10-Apr-2014::21:49:38 ===
** Generic server <0.718.0> terminating
** Last message in was {go,#Port<0.16417>,#Fun<rabbit_networking.1.24135120>}
** When Server state == undefined
** Reason for termination ==
** {{badmatch,{error,{ssl_upgrade_error,"certificate unknown"}}},
[{rabbit_mqtt_reader,handle_call,3,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
That particular example is from a server running RabbitMQ 3.1.0 / Erlang R15B03; however, I have also reproduced it on the more recent RabbitMQ 3.3.0 / Erlang R16B03-1.
I also ran a similar test using AMQP and could not reproduce the problem, so the issue seems to be specific to MQTT over SSL.
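To compare the two listeners side by side, a timed handshake probe along the following lines may help. This is a sketch under stated assumptions, not a tool used in the tests above. The ports 8883 (MQTT over SSL) and 5671 (AMQP over SSL) are assumed defaults and may differ from the actual configuration. The class name HandshakeProbe and the 5 second timeout are hypothetical. The probe deliberately trusts any server certificate so that it does not itself become another client failing the handshake; that is acceptable only for a diagnostic like this.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical diagnostic: reports whether a TLS handshake completes, fails, or hangs.
final class HandshakeProbe {

    enum Outcome { COMPLETED, TIMED_OUT, FAILED }

    // Diagnostic only: accepts any certificate chain. Never use this in production.
    static SSLSocketFactory trustAllFactory() throws GeneralSecurityException {
        TrustManager[] trustAll = { new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        } };
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, trustAll, null);
        return context.getSocketFactory();
    }

    static Outcome probe(String host, int port, int timeoutMs) {
        try (SSLSocket socket = (SSLSocket) trustAllFactory().createSocket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // bounds each read during the handshake
            socket.startHandshake();
            return Outcome.COMPLETED;
        } catch (IOException e) {
            // Some JDK versions wrap the timeout, so walk the cause chain.
            for (Throwable t = e; t != null; t = t.getCause()) {
                if (t instanceof SocketTimeoutException) {
                    return Outcome.TIMED_OUT;
                }
            }
            return Outcome.FAILED;
        } catch (GeneralSecurityException e) {
            return Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int[] ports = { 8883, 5671 }; // assumed MQTT-over-SSL and AMQP-over-SSL ports
        for (int port : ports) {
            long start = System.nanoTime();
            Outcome outcome = probe(host, port, 5000);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(host + ":" + port + " -> " + outcome + " after " + elapsedMs + " ms");
        }
    }
}
```

If the behaviour matches the description above, the MQTT listener would be expected to report TIMED_OUT while the bad clients are active, and the AMQP listener COMPLETED.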
Regards,
Stuart