[rabbitmq-discuss] Cannot send message with STOMP

Fri Apr 27 09:35:34 BST 2012

Thanks Lionel,
(*The very tardy reply is my fault -- sorry*)

>> the rather arbitrary notion of escaping colons in header values as well
>> as names (why?) [...]
> 
> Without backslash escaping, header names and values would have been
> limited. This quite simple escaping removed this restriction.

Here is the problem: you are forcing us to escape colons. Why? The
terminator for a header value is the newline. Yes, to allow newlines we
need an escape for newline (and therefore for backslash) but not for
colon. Never for colon.
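
To illustrate the point about colons, here is a minimal Python sketch. It
is not the RabbitMQ adapter's code: the function names are hypothetical,
and it assumes header names themselves contain no colon. Because the
parser splits on the first colon only and the value runs to the newline,
escaping newline and backslash is enough:

```python
from typing import Tuple


def escape_value(value: bytes) -> bytes:
    # Escape backslash first, so the backslashes introduced for
    # newlines are not themselves doubled.
    return value.replace(b"\\", b"\\\\").replace(b"\n", b"\\n")


def unescape_value(raw: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i:i + 1] == b"\\" and i + 1 < len(raw):
            nxt = raw[i + 1:i + 2]
            if nxt == b"n":
                out += b"\n"
            elif nxt == b"\\":
                out += b"\\"
            else:
                # Unknown escape: pass it through unchanged (an assumption).
                out += b"\\" + nxt
            i += 2
        else:
            out += raw[i:i + 1]
            i += 1
    return bytes(out)


def parse_header_line(line: bytes) -> Tuple[bytes, bytes]:
    # Split on the FIRST colon only; any later colon belongs to the value.
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("header line has no colon")
    return name, unescape_value(value)
```

A value such as host:5672 round-trips through this without any colon
escape.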

>> and the insistence of UTF-8 for all header names and values [...]
> 
> Here again, without this in the spec, the headers would have been
> limited to basically US-ASCII.

I really don't understand this. 1.1 limits headers to UTF-8 sequences.
There is no conceivable benefit in doing this -- all binary sequences
could have been used (with escapes for a few) but you chose to restrict
it. All this does is make the properly compliant server's job harder.
And, by the way, clients too.

One day, when two clients choose to represent the same text using
different UTF-8 byte sequences (with, say, combining marks in a different
order, or precomposed versus decomposed characters), the spec will cause
problems for implementations and clients alike.
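
To make that concern concrete, a short Python sketch using the standard
unicodedata module. The helper names are made up for illustration, and
nothing here is required by the STOMP spec. The same visible text yields
two different, equally valid UTF-8 byte sequences depending on the
normalization form, so a byte-wise comparison of header values fails:

```python
import unicodedata

text = "caf\u00e9"  # "cafe" with a precomposed e-acute

# Same text, two normalization forms, two different UTF-8 encodings.
nfc = unicodedata.normalize("NFC", text).encode("utf-8")
nfd = unicodedata.normalize("NFD", text).encode("utf-8")


def same_header_bytes(a: bytes, b: bytes) -> bool:
    # What an OCTET-oriented implementation would do.
    return a == b


def same_header_text(a: bytes, b: bytes) -> bool:
    # What a text-aware implementation would have to do instead:
    # decode, normalize both sides, then compare.
    na = unicodedata.normalize("NFC", a.decode("utf-8"))
    nb = unicodedata.normalize("NFC", b.decode("utf-8"))
    return na == nb
```

Here same_header_bytes(nfc, nfd) is False while same_header_text(nfc, nfd)
is True, which is exactly the kind of disagreement between peers that
treating headers as text invites.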

> Some users (us included) do need more. This widely used encoding allows
> any text string in the header.

Well, my point is we had (almost) enough already and 1.1 gave us less.
How is any binary OCTET sequence worse than any UTF-8 text string?

> The BNF indeed allows more than what the protocol accepts. The best
> solution is probably to ignore the BNF and only implement what is in the
> protocol text. After all, it's probably two orders of magnitude shorter
> than AMQP 1-0 ;-)

Well, I cannot deny that :-)

If UTF-8 wasn't there it would have been easy to improve the BNF so that
it described exactly the valid forms of a frame. The only errors not then
identifiable would have been semantic ones (like use of an unknown id).

> Maybe you also want to reject empty header names (e.g. SEND\n:foo\n\n\0).
> This is arguable.

I'll see what we currently do. It might be valid. And what about names
with leading spaces, or a name that is just a space?
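
For reference, a small Python sketch of how a parser that only requires a
colon before the newline would treat Lionel's example frame. The helper
names and the reject_empty_name flag are invented for illustration; this
is not a statement of what RabbitMQ does today. The sketch does no
trimming, so a leading space stays part of the name:

```python
def split_header(line: bytes, reject_empty_name: bool = False):
    # The only structural requirement: a colon before the end of the line.
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("no colon before end of line")
    if reject_empty_name and name == b"":
        raise ValueError("empty header name")
    return name, value


def parse_frame_headers(frame: bytes, reject_empty_name: bool = False):
    # Headers end at the first blank line; the body (ignored here) follows.
    head, _, _body = frame.partition(b"\n\n")
    command, *lines = head.split(b"\n")
    headers = [split_header(line, reject_empty_name) for line in lines]
    return command, headers


frame = b"SEND\n:foo\n\n\x00"
lenient = parse_frame_headers(frame)  # accepts the empty name
```

With reject_empty_name=True the same frame raises ValueError, so the
choice is a one-line policy decision rather than a parsing difficulty.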

Incidentally, the spec(s) allow the server to not send an ERROR frame if
it doesn't want to, but I can see that if nothing is logged at all it could be
frustrating -- we will try to improve this.

Steve Powell
steve at rabbitmq.com
[wrk: +44-2380-111-528] [mob: +44-7815-838-558]

On 20 Apr 2012, at 14:58, Lionel Cons wrote:

> Steve Powell writes:
>> The STOMP 1.0 specification is unsatisfactory in many respects, [...]
> 
> Indeed. This is what triggered the creation of 1.1.
> 
>> The STOMP 1.1 specification has drawbacks of its own -- [...]
> 
> Although I agree that 1.1 is not perfect (so I'm sure we will see 1.2
> or 2.0 one day), I do not share your view on the points you mention.
> 
>> the rather arbitrary notion of escaping colons in header values as
>> well as names (why?) [...]
> 
> Without backslash escaping, header names and values would have been
> limited. This quite simple escaping removed this restriction.
> 
>> and the insistence of UTF-8 for all header names and values [...]
> 
> Here again, without this in the spec, the headers would have been
> limited to basically US-ASCII. Some users (us included) do need
> more. This widely used encoding allows any text string in the header.
> Icing on the cake: it's backward compatible with 1.0 and transparent
> to implementations using only US-ASCII.
> 
>> So I think it is _incorrect_ in that it specifies a considerable
>> superset of what you can receive (or are allowed to send) over the wire.
> 
> The BNF indeed allows more than what the protocol accepts. The best
> solution is probably to ignore the BNF and only implement what is in
> the protocol text. After all, it's probably two orders of magnitude
> shorter than AMQP 1-0 ;-)
> 
>> I'm going to raise a bug [24896] to track this, and propose the
>> following solution:
> 
> Thanks.
> 
>> We use the negotiated connection type to determine the parsing rules
>> (and generating rules) for frame headers.
> 
> Perfect.
> 
>> For 1.0 connections we will allow OCTET streams for header names and
>> values, excepting colons in names, and excepting newlines (\u000A
>> newline character coded in UTF-8 is x'0A') anywhere, and do NO escaping
>> either on input or output. We will reject no headers on input unless
>> they do not contain a colon before the first newline (because they
>> cannot be malformed under these rules) though the remaining frame
>> parsing can fail.
> 
> Maybe you also want to reject empty header names (e.g. SEND\n:foo\n\n\0).
> This is arguable.
> 
>> For 1.1 connections we will allow OCTET streams for header names and
>> values, excepting colons, newlines and backslashes, interpreting escape
>> sequences for these, and generating escapes for these characters on
>> output. We will NOT reject non-UTF-8 OCTETs but instead just pass them
>> through as is in order to allow the simplest client access. We will NOT
>> reject other escapes (\ followed by non c, n or \), but leave these
>> as is.
>> 
>> Although this is not strictly what the spec says (we would not enforce
>> UTF-8 or the errors on bad escape sequences) I believe it is a
>> reasonable compromise.
> 
> Fair enough.
> 
> IMHO, the main point is that RabbitMQ should accept any valid STOMP
> 1.1 input. Your proposed solution seems to address this point.
> 
> If you also accept invalid input, this should not be a problem in
> practice.
> 
> Thanks for your work improving RabbitMQ's STOMP support.
> 
> Cheers,
> 
> Lionel