[rabbitmq-discuss] Large Message Support / Stability

Tue Jul 30 22:17:22 BST 2013

We have taken your alternative approach and write large payloads to S3.
They go to encrypted private buckets and we generate signed URLs to allow
access by downstream components. We employ content distribution networks
for some payloads.
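
A minimal sketch of this pattern, not our production code: the payload is
written to S3 and only a small message carrying a signed URL is published.
The `s3` argument is assumed to be a boto3 S3 client; `put_object` and
`generate_presigned_url` are real boto3 client methods, while the key scheme,
the message layout, and `FakeS3` (an in-memory stand-in so the sketch runs
without AWS credentials) are hypothetical.

```python
import hashlib
import json


def publish_claim_check(s3, bucket, payload, expires_in=3600):
    """Store payload in S3 and return a small JSON message referencing it."""
    # Content-addressed key: a hypothetical naming scheme for illustration.
    key = "payloads/" + hashlib.sha256(payload).hexdigest()
    # ServerSideEncryption asks S3 to encrypt the object at rest.
    s3.put_object(Bucket=bucket, Key=key, Body=payload,
                  ServerSideEncryption="AES256")
    # Time-limited URL, so downstream consumers need no AWS credentials.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
    # This small body is what would travel through the broker.
    return json.dumps({"url": url, "size": len(payload)}).encode("utf-8")


class FakeS3:
    """In-memory stand-in exposing the two client methods used above."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body, **kwargs):
        self.objects[(Bucket, Key)] = Body

    def generate_presigned_url(self, ClientMethod, Params, ExpiresIn=3600):
        # Not a real signature; example.invalid is a placeholder host.
        return "https://%s.example.invalid/%s?expires=%d" % (
            Params["Bucket"], Params["Key"], ExpiresIn)
```

With real credentials, `publish_claim_check(boto3.client("s3"), bucket, data)`
would return the message body to publish; the broker then carries only a few
hundred bytes regardless of payload size.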

Because our messaging pipelines are general purpose, we have retooled them
to pass opaque messages, generally compressed binary. Any content needed
for filtering and routing is expected to be promoted to AMQP headers, where
it is easily accessible without reading the message body. This has worked
quite well in practice with our internal clients.
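
As a rough illustration of header promotion (a sketch under assumptions, not
our actual pipeline code): routing fields are copied into a headers dict and
the full record becomes a compressed, opaque body. The field names `source`
and `kind` are hypothetical, and `headers_match` only mimics the `x-match`
all/any check of an AMQP headers exchange so the idea can be run without a
broker.

```python
import json
import zlib


def build_message(record, promoted=("source", "kind")):
    """Return (headers, body): routing fields in headers, opaque body."""
    # Copy only the fields needed for filtering and routing.
    headers = {name: record[name] for name in promoted if name in record}
    # The body is compressed binary; routers never need to open it.
    body = zlib.compress(json.dumps(record).encode("utf-8"))
    return headers, body


def headers_match(binding, headers, x_match="all"):
    """Mimic a headers-exchange binding check using headers alone."""
    hits = [headers.get(key) == value for key, value in binding.items()]
    return all(hits) if x_match == "all" else any(hits)
```

With a client library such as pika, the headers dict would typically be
passed as `pika.BasicProperties(headers=headers)` when publishing, and the
broker's headers exchange would do the matching.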

Like you, we gather a diverse set of inputs. However, we distribute payloads
globally, sometimes to millions of endpoints, and that affects our approach.

Michael Laing
NYTimes

On Tue, Jul 30, 2013 at 4:06 PM, Alexander Schatten <
alexanderschatten at gmx.at> wrote:

> I am considering using RabbitMQ as the messaging platform for a data
> warehouse (today we would probably call it "Big Data") type application.
>
> The principal issue is that we receive data from a large variety of
> sources: partly batch updates, partly events from sensors and the like. The
> idea is to have service interfaces for different data sources to the
> "outside" that accept the data packages and events, enrich them with
> metadata etc., and then dump them on RabbitMQ queues and topics.
>
> Consumers are essentially processing units that do statistical analysis,
> error/warning triggers, and storage modules that write raw and aggregated
> data into databases (PostgreSQL and most likely MongoDB).
>
> Now, certain messages, e.g. sensor events, will be rather small in size.
> Others, like batch updates from ERP and CRM systems or messages containing
> documents, might be larger; I suppose several MB up to 100 MB. I have not
> used messaging in such a context, but what I have heard is that other
> message brokers tend to have problems with large messages or start to
> behave erratically.
>
> So my concrete question: does anyone have experience with such a use case?
>
> Alternatively, it would be possible to write the (large) payload
> immediately (at the service interface) into e.g. MongoDB and only put the
> reference/ID in the message. However, this would break the decoupling to a
> certain extent, as all consumers need access to MongoDB or to a REST
> interface that serves the payload. Also, message filtering and
> content-based routing are limited in the latter case.
>
> Would appreciate any comments on this issue.
>
> Kind regards,
>
>
> Alexander Schatten
>
>
> --
> ==========================================
> Dr. Alexander Schatten
> ==========================================
> http://www.schatten.info
> http://sichten.blogspot.com
> Follow me at
> https://alpha.app.net/alex_buzz
> http://twitter.com/alex_buzz
> ==========================================
> "Die Grenzen meiner Sprache
> bedeuten die Grenzen meiner Welt.", Wittgenstein
>