[rabbitmq-discuss] RabbitMQ broker crashing under heavy load with mirrored queues

Venkat vveludan at gmail.com
Fri Jan 13 05:03:33 GMT 2012


> (You did start_app after the cluster command, didn't you???  :-))

Hi Steve, I did restart the app.
Following are the steps I have performed on both nodes:

Starting the second node t-4:
./rabbitmq-server -detached

Steps to join t-4 node to t-2:
 /usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl stop_app

 /usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl reset

 /usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl cluster rabbit@t-2 rabbit@t-4
 Clustering node 'rabbit@t-4' with ['rabbit@t-2',
                                     'rabbit@t-4'] ...
...done.

/usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl start_app
Starting node 'rabbit@t-4' ...
...done.

Running cluster_status on t-4 node:
[ecloud@t-4 sbin]$ /usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl cluster_status
Cluster status of node 'rabbit@t-4' ...
[{nodes,[{disc,['rabbit@t-4','rabbit@t-2']}]},
 {running_nodes,['rabbit@t-2','rabbit@t-4']}]
...done.

Running cluster_status on t-2 node (to which t-4 is joined):
[ecloud@t-2 vv]$ /usr/lib/rabbitmq/lib/rabbitmq_server-2.7.1/sbin/rabbitmqctl cluster_status
Cluster status of node 'rabbit@t-2' ...
[{nodes,[{disc,['rabbit@t-4','rabbit@t-2']}]},
 {running_nodes,['rabbit@t-4','rabbit@t-2']}]
...done.
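
When repeating this check after each cluster change, it can help to compare the running_nodes list mechanically rather than by eye. The following is a minimal sketch, assuming the cluster_status output has the shape shown above with node names as quoted atoms; ClusterStatus and runningNodes are hypothetical names for illustration and are not part of rabbitmqctl or any RabbitMQ library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that extracts running node names from cluster_status text. */
final class ClusterStatus {
    // Matches the {running_nodes,[...]} tuple and captures the list contents.
    private static final Pattern RUNNING =
            Pattern.compile("\\{running_nodes,\\s*\\[([^\\]]*)\\]\\}");
    // Matches one single-quoted atom such as 'rabbit@t-2'.
    private static final Pattern ATOM = Pattern.compile("'([^']*)'");

    /** Returns the quoted node names in running_nodes, or an empty list if absent. */
    static List<String> runningNodes(String output) {
        List<String> nodes = new ArrayList<>();
        Matcher tuple = RUNNING.matcher(output);
        if (tuple.find()) {
            Matcher atom = ATOM.matcher(tuple.group(1));
            while (atom.find()) {
                nodes.add(atom.group(1));
            }
        }
        return nodes;
    }
}
```

Both nodes should report the same set of names, although the order may differ as it does in the two outputs above. Unquoted atoms are not handled by this sketch.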

---------------------------------------------------------------------------------------------------------------
I have been testing the HA feature under a different scenario.
In my previous test the messages were pumped in through a SOAP service,
which pumped messages at a slow rate.
This time I used a test that pumps in messages by calling a plain Java
service. I also increased the number of messages from 20K to 40K.
I am finding that messages are lost while being pumped into the queue.
As you mentioned earlier, this could be due to connecting to a dead
broker.
I modified the producer code to wait 2 seconds and then set a fresh
ConnectionFactory before retrying, as follows:
	@Override
	public void convertAndSend(final Object message) throws AmqpException {
		MessageProperties props = null;
		try {
			props = new MessageProperties();
			// setting delivery mode as PERSISTENT
			props.setDeliveryMode(MessageDeliveryMode.PERSISTENT);
			send(getMessageConverter().toMessage(message, props));
		} catch (AmqpException amqpe) {
			System.out.println("Exception occurred while sending: " + amqpe.getMessage());
			try {
				Thread.sleep(2000);
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
			Properties props1 = FrameworkServiceLocator.getInstance()
					.getCommonsConfigurationService(ServiceConstants.DMB_COMMONS_CONFIG_SERVICE)
					.getProperties(CommonsConfigurationConstants.RABBIT_MQ_CONFIG_NAME);
			String rabbitMQUser = props1.getProperty(CommonsConfigurationConstants.RABBITMQ_USER);
			String rabbitMQPassword = props1.getProperty(CommonsConfigurationConstants.RABBITMQ_PASSWORD);
			String rabbitMQHost = props1.getProperty(CommonsConfigurationConstants.RABBITMQ_HOST);
			String rabbitMQChannelCacheSize =
					props1.getProperty(CommonsConfigurationConstants.RABBITMQ_CHANNEL_CACHE_SIZE);
			CachingConnectionFactory connectionFactory = new CachingConnectionFactory(rabbitMQHost);
			connectionFactory.setChannelCacheSize(Integer.parseInt(rabbitMQChannelCacheSize));
			connectionFactory.setUsername(rabbitMQUser);
			connectionFactory.setPassword(rabbitMQPassword);
			setConnectionFactory(connectionFactory);
			try {
				send(getMessageConverter().toMessage(message, props));
			} catch (AmqpException e1) {
				e1.printStackTrace();
			}
		}
	}
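
The retry above makes one extra attempt against the single configured host. A more general form of the same idea, sketched here under assumptions, tries each cluster node in turn and pauses between rounds. Transport and FailoverSender are hypothetical names introduced for illustration; they are not part of Spring AMQP or the RabbitMQ Java client, and a real Transport would wrap whatever actually publishes the message. The host names t-2 and t-4 in the usage comment are the two nodes from this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical transport: sends one message to one broker host. */
interface Transport {
    void send(String host, String message) throws Exception;
}

/** Retries a send across several broker hosts for a bounded number of rounds. */
final class FailoverSender {
    private final List<String> hosts;
    private final Transport transport;
    private final int maxRounds;
    private final long pauseMillis;

    FailoverSender(List<String> hosts, Transport transport, int maxRounds, long pauseMillis) {
        if (hosts.isEmpty() || maxRounds < 1) {
            throw new IllegalArgumentException("need at least one host and one round");
        }
        this.hosts = new ArrayList<>(hosts);
        this.transport = transport;
        this.maxRounds = maxRounds;
        this.pauseMillis = pauseMillis;
    }

    /** Returns the host that accepted the message; throws if every attempt failed. */
    String send(String message) throws Exception {
        Exception last = null;
        for (int round = 0; round < maxRounds; round++) {
            for (String host : hosts) {
                try {
                    transport.send(host, message);
                    return host;
                } catch (Exception e) {
                    last = e; // remember the failure and try the next host
                }
            }
            if (round < maxRounds - 1) {
                try {
                    Thread.sleep(pauseMillis); // pause before the next round
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw ie;
                }
            }
        }
        throw new Exception("all hosts failed after " + maxRounds + " round(s)", last);
    }
}

// Usage (hypothetical): new FailoverSender(Arrays.asList("t-2", "t-4"), transport, 3, 2000).send(body);
```

This only helps with failures that surface as exceptions. A publish call that returns normally has not necessarily been stored by the broker, so a retry loop alone cannot account for messages that are lost silently.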

After making this change, I saw the following exception once while
sending 40K messages:
java.net.SocketException: Broken pipe.
I have run the test 10-15 times. Each time 5K-6K messages were lost,
but this exception occurred only once.
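
To see where the missing messages fall within a run, one option is to number the messages at the producer and have the consumer record which numbers arrive. The sketch below is a diagnostic aid only, assuming each message carries an integer sequence number from 0 to total-1; LossTracker is a hypothetical helper and is not part of any RabbitMQ library.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Tracks which sequence numbers in [0, total) the consumer has seen. */
final class LossTracker {
    private final int total;
    private final BitSet received;

    LossTracker(int total) {
        this.total = total;
        this.received = new BitSet(total);
    }

    /** Records one consumed message; duplicates are harmless. */
    void markReceived(int seq) {
        if (seq < 0 || seq >= total) {
            throw new IllegalArgumentException("seq out of range: " + seq);
        }
        received.set(seq);
    }

    /** Number of sequence numbers never seen. */
    int lostCount() {
        return total - received.cardinality();
    }

    /** Missing sequence numbers collapsed into inclusive {start, end} ranges. */
    List<int[]> lostRanges() {
        List<int[]> ranges = new ArrayList<>();
        int start = received.nextClearBit(0);
        while (start < total) {
            int end = received.nextSetBit(start); // first received seq after the gap
            if (end < 0) {
                end = total; // the gap runs to the end of the run
            }
            ranges.add(new int[] {start, end - 1});
            start = received.nextClearBit(end);
        }
        return ranges;
    }
}
```

A single long range of missing numbers would point at one event such as a broken connection, while many scattered gaps would suggest something else is dropping messages.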

Thanks
Venkat


On Jan 11, 12:55 pm, Steve Powell <st... at rabbitmq.com> wrote:
> Hi Venkat,
>
> > This time there were no messages lost. All 20K messages were
> > processed.
>
> That's great.
>
> I'm trying to figure out what might be wrong with
> rabbitmqctl report; I'll get back to you.
>
> Meanwhile, running
>        rabbitmqctl -n rabbit@t-2 status
> ON NODE t-4 might be interesting.
>
> Also, can you tell us the output from
>        rabbitmqctl cluster_status
> on both nodes, please.
>
> It is not clear if you have issued the stop_app and start_app and
> reset/force_reset commands properly (you probably have), so could you follow
> the steps as described in the clustering guide, and issue
> rabbitmqctl cluster_status on both nodes after each cluster change?
> We should be able to see where things went wrong, then.
>
> (You did start_app after the cluster command, didn't you???  :-))
>
> Cheers,
>
> Steve Powell  (a hoppy bunny)
> ----------some more definitions from the SPD----------
> avoirdupois (phr.) 'Would you like peas with that?'
> distribute (v.) To denigrate an award ceremony.
> definite (phr.) 'It's hard of hearing, I think.'
> modest (n.) The most mod.
>
> On 11 Jan 2012, at 01:22, Venkat wrote:
> > ...
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-disc... at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss