<div dir="ltr"><div>Graeme,<br><br></div>Note that it is possible to change the file size via the <span class="">msg_store_file_size_limit parameter (see <a href="http://www.rabbitmq.com/configure.html">http://www.rabbitmq.com/configure.html</a>). While I've not encountered any issues when changing this parameter, I would suggest testing thoroughly before relying on a non-default value.<br>
<br>Brett<br><br></span></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Nov 21, 2013 at 8:10 AM, Graeme N <span dir="ltr"><<a href="mailto:graeme@sudo.ca" target="_blank">graeme@sudo.ca</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="im"><div class="gmail_quote">On Wed, Nov 20, 2013 at 12:11 AM, Matthias Radestock <span dir="ltr"><<a href="mailto:matthias@rabbitmq.com" target="_blank">matthias@rabbitmq.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>On 20/11/13 04:38, Graeme N wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Rabbit at the moment doesn't make much effort into making its<br>
workload append only and contiguous by default<br>
</blockquote>
<br></div>
Actually all disk writes in rabbit *are* append-only (plus the occasional truncate). But when dealing with multiple disk-bound queues, there is inevitably a fair bit of seeking going on. And SSDs are much quicker at that.<span><font color="#888888"><br>
</font></span></blockquote><br><br></div></div><div class="gmail_quote">Well, I guess that multiple small files, even if each is individually written append-only, don't really qualify as an append-only workload in the sense I'd normally think of it. The reason to use an append-only workload is to ensure it performs well on spinning disks by avoiding seeks. Looking at the files under /var/lib/rabbitmq/mnesia/rabbit@hostname/msg_store_persistent on our busiest host, we see ~390 files, all <= 17 MiB. Compare that to MySQL or Riak, where we split the InnoDB or Bitcask append-only logs at 500 MiB, a size chosen so that each log fits inside our hardware RAID controllers' BBU caches.<br>
<br>Our big messages in RabbitMQ are 500 KiB to 4 MiB, so each of these 16 MiB chunks only stores ~10 of these messages, which is far smaller than our typical batch size of ~100 messages. Relatively speaking, this is a ton more overhead in many small files than we typically see in other datastores that bill themselves as spinning disk friendly / append only. When you consider that those other data stores use a single set of append-only logs for all queues/tables/buckets in the system, while rabbit seems to be segmenting them based on queue, it means that rabbit's workload contains at least an order of magnitude more seeks when busy than these other storage engines.<br>
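<br>As a rough back-of-the-envelope check of the arithmetic above, here is a minimal Python sketch. The helper name files_per_batch is hypothetical, and the model is a deliberate simplification rather than a description of how RabbitMQ actually rolls its store files: it assumes messages are packed back to back, ignores per-message overhead, and assumes a new file starts exactly at the size limit.<br>
<pre>
```python
KIB = 1024
MIB = 1024 * KIB


def files_per_batch(message_bytes, batch_size, file_limit_bytes):
    """Estimate how many store files one batch of messages spans.

    Simplifying assumptions (not taken from RabbitMQ's source): messages are
    packed back to back, per-message overhead is ignored, and a new file
    starts exactly when the size limit is reached.
    """
    total_bytes = message_bytes * batch_size
    # Integer ceiling division: round up to whole files.
    return -(-total_bytes // file_limit_bytes)


if __name__ == "__main__":
    # Message sizes and batch size are the figures quoted in this thread.
    for label, message_bytes in (("500 KiB", 500 * KIB), ("4 MiB", 4 * MIB)):
        for limit_mib in (16, 500):
            n = files_per_batch(message_bytes, 100, limit_mib * MIB)
            print(f"{label} messages, {limit_mib} MiB files: {n} file(s) per batch of 100")
```
</pre>
Under these assumptions, a batch of 100 messages at 4 MiB each spans 25 files with a 16 MiB limit, versus a single file with a 500 MiB limit.<br>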
<br></div><div class="gmail_quote">So, while rabbit's workload at the file chunk level may technically be append only, it doesn't actually seem to avoid seeks in its workload, and we tend to see pretty massive persistent queue performance improvements by using SSDs. Thus, I wouldn't really consider its workload in the same class as most append-only storage systems.<span class="HOEnZb"><font color="#888888"><br>
<br></font></span></div><span class="HOEnZb"><font color="#888888"><div class="gmail_quote">Graeme<br><br></div></font></span></div></div>
<br></blockquote></div><br></div>