[rabbitmq-discuss] best practice for "work queue" type application

Jerry Kuch jerryk at rbcon.com
Sun Nov 4 16:30:58 GMT 2012


Hi, John:

On Fri, Nov 2, 2012 at 7:19 PM, John Cartwright
<john.c.cartwright at noaa.gov> wrote:

>
> The example suggests that the worker wait until the processing is
> complete to send the ACK for a given message.  So if my processing
> takes many hours, is this a problem for RabbitMQ to have the worker
> process consume a message but wait for a very long time to acknowledge
> it?
>

The answer is that it depends.  :-)

While your message is in the "consumed but unACKed by the consumer" state,
the broker is holding on to a copy of the message to guard against the
possible failure of your consumer (which will be detected by seeing the
connection close unexpectedly) by redelivering in the future if necessary.

If the message was published with persistence requested it may be sitting
on disk during this period.  Even if it wasn't published persistently, but
the broker came under memory pressure, the message may still have been
swapped out to disk.  In both cases, keeping that message alive will eat
some disk space.  It will also eat some memory, both for the message
itself (assuming it hasn't been paged out to disk under memory pressure)
and for a tiny bit of bookkeeping overhead.

Whether that becomes a problem or not while your long running consumer is
chewing on the message depends on just what sort of resources are available
in your broker (RAM and disk particularly) counterbalanced against just how
many other messages are passing through the broker, and potentially also
languishing in this state for a long time.

Rabbit can queue and persist messages pending their delivery and
acknowledgment, but it's best not used deliberately as a long term store of
data.  A handy mantra is that "A happy Rabbit is an empty Rabbit."  In
other words, Rabbit (and other messaging systems) runs best when it's able
to deliver messages in a reasonably timely way, although if things block up
due to crashed or slow consumers or the like, mechanisms exist to keep
things going, as long as you don't exhaust the broker's RAM and disk.


> Is there a better way to handle this scenario?
>

Again, it depends.  If you have a good sense for how much of your traffic
is going to result in these long-working consumers, and you can bound that
to some reasonable extent to avoid having vast piles of messages in the
broker, delivered but waiting to be ACKed, then you might be just fine
doing things the way you describe.
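
One way to put a hard bound on the "delivered but waiting to be ACKed" pile
is a per-consumer prefetch limit.  The following is only a minimal sketch,
assuming a Python worker using a pika-style channel object (pika 1.x method
names basic_qos and basic_consume); the function name and the max_unacked
parameter are made up for illustration.

```python
def start_bounded_worker(channel, queue_name, on_message, max_unacked=1):
    """Register a manual-ACK consumer that holds at most max_unacked messages.

    `channel` is assumed to behave like a pika 1.x channel; this helper and
    its parameter names are illustrative, not part of any library.
    """
    if max_unacked < 1:
        raise ValueError("max_unacked must be at least 1")
    # Ask the broker not to push more than max_unacked unACKed messages
    # to this consumer at a time.
    channel.basic_qos(prefetch_count=max_unacked)
    # auto_ack=False: the broker keeps each message until we ACK it.
    return channel.basic_consume(
        queue=queue_name,
        on_message_callback=on_message,
        auto_ack=False,
    )
```

With one such channel per worker, the number of long-lived unACKed messages
in the broker stays at or below (number of workers) x max_unacked.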

If you expect that you'll have a ton of these slow-to-ACK messages, you may
want to consider a slightly more elaborate messaging structure, say where
the consumers ACK the message as soon as they've persisted a local copy to
work on, and use another queue to notify any upstream services or
publishers who care that the work in question has been finished.
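
The "notify via another queue" half of that idea might look roughly like
the sketch below.  It assumes a pika-style channel (basic_publish with
exchange, routing_key and body), and the queue name "work.done", the
task_id field and the JSON payload layout are all invented for
illustration; the queue would need to be declared elsewhere.

```python
import json


def publish_completion(channel, task_id, status="done", done_queue="work.done"):
    """Publish a small completion notice once the long-running work is over.

    The payload layout and the default queue name are hypothetical.
    """
    payload = json.dumps({"task_id": task_id, "status": status}).encode("utf-8")
    # The default exchange ("") routes straight to the queue whose name
    # matches routing_key.
    channel.basic_publish(exchange="", routing_key=done_queue, body=payload)
    return payload
```

Anyone upstream who cares can consume from that queue, while the original
work message has long since been ACKed and forgotten by the broker.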

Then again, perhaps the publisher side doesn't need to be directly informed
that consumer work is done, in which case your workflow on the consumer
side would be to consume the message, persist it to local storage so that
you have a safe copy of it you can resume work upon if your consumer dies,
and then ACK it, thereby accepting responsibility for the message's
contents from the broker and allowing the broker to forget about the
message and stop worrying about having to redeliver it.
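
As a rough illustration of that consume / persist / ACK ordering, here is a
sketch assuming a pika-style channel with basic_ack(delivery_tag=...).  The
function name, the spool directory and the file naming scheme are
hypothetical; the only point is that the ACK is sent after the local copy
is safely on disk, never before.

```python
import os
import tempfile
import uuid


def persist_then_ack(channel, delivery_tag, body, spool_dir):
    """Write body durably under spool_dir, then ACK; return the file path."""
    os.makedirs(spool_dir, exist_ok=True)
    # Delivery tags restart with each new channel, so use a random name
    # rather than the tag to avoid collisions across consumer restarts.
    final_path = os.path.join(spool_dir, f"task-{uuid.uuid4().hex}.msg")
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before ACKing
        os.replace(tmp_path, final_path)  # atomic rename within one directory
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise  # no ACK was sent, so the broker can still redeliver
    # Only now take responsibility for the message away from the broker.
    channel.basic_ack(delivery_tag=delivery_tag)
    return final_path
```

On restart, the worker would scan the spool directory for leftover .msg
files and resume those before consuming anything new.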

Make sense?

Jerry