[rabbitmq-discuss] Cannot send message with STOMP

Fri Apr 27 11:45:44 BST 2012

Steve Powell writes:
 > Here is the problem: you are forcing us to escape colons. Why?

Without colon escaping, the header line "a:b:c" is ambiguous and can be
parsed either as (name "a" & value "b:c") or (name "a:b" & value "c").

In STOMP 1.1, the line above is invalid; depending on the intended header,
it should be written as "a:b\cc" (name "a", value "b:c") or "a\cb:c"
(name "a:b", value "c").
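
To make the escaping concrete, here is a minimal Python sketch. It assumes
the three escape sequences defined by STOMP 1.1 (\n for newline, \c for
colon, \\ for backslash). The function names format_header and parse_header
are made up for this example and do not come from any STOMP library.

```python
# Sketch of STOMP 1.1 header escaping. All names here are hypothetical.
_ESCAPES = {"\\": "\\\\", "\n": "\\n", ":": "\\c"}
_UNESCAPES = {"\\\\": "\\", "\\n": "\n", "\\c": ":"}


def escape(text):
    """Escape backslash, newline and colon in a header name or value."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape(text):
    """Reverse escape(); reject escape sequences STOMP 1.1 does not define."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            pair = text[i:i + 2]
            if pair not in _UNESCAPES:
                raise ValueError("invalid escape sequence: %r" % pair)
            out.append(_UNESCAPES[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def format_header(name, value):
    """Build one header line; the only raw colon is the separator."""
    return escape(name) + ":" + escape(value)


def parse_header(line):
    """Split a header line on its single unescaped colon."""
    name, sep, value = line.partition(":")
    if not sep or ":" in value:
        raise ValueError("expected exactly one unescaped colon: %r" % line)
    return unescape(name), unescape(value)
```

With this, format_header("a", "b:c") yields a:b\cc and
format_header("a:b", "c") yields a\cb:c, while parse_header("a:b:c") is
rejected instead of being guessed at.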

 > I really don't understand this. 1.1 limits headers to UTF-8 sequences.
 > There is no conceivable benefit in doing this -- all binary sequences
 > could have been used (with escapes for a few) but you chose to restrict
 > it.

This came from the assumption that headers are usually a table of text
strings (as opposed to binary strings). Unicode can express most text
strings in most languages. UTF-8 provides a defined, standard encoding
and is backward compatible with US-ASCII, which many people use.

The reason behind this being in the spec is that the broker may need to
understand (= decode) what is in the header, for instance for envelope-based
routing (e.g. a topic exchange or JMS-style message selectors). If the
encoding is not defined, how do I know how to encode/decode the lowercase e
with acute accent? Should it be 0xE9 (ISO-8859-1) or 0xC3A9 (UTF-8) or
0x00E9 (UTF-16)?
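
The three byte sequences can be checked with a few lines of Python. This is
only an illustration; it assumes big-endian UTF-16 without a byte order
mark, which is what 0x00E9 corresponds to, and the variable names are
arbitrary.

```python
text = "\u00e9"  # lowercase e with acute accent

# One character, three different byte sequences.
latin1 = text.encode("iso-8859-1")  # single byte 0xE9
utf8 = text.encode("utf-8")         # two bytes 0xC3 0xA9
utf16 = text.encode("utf-16-be")    # two bytes 0x00 0xE9 (big-endian, no BOM)

# The reverse problem: the same octets read under two different assumptions.
raw = b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")         # the intended character
as_latin1 = raw.decode("iso-8859-1")  # two unrelated characters
```

Without an agreed encoding, a broker receiving the octets C3 A9 cannot tell
which of the two readings the sender meant.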

UTF-8 was not the only solution but it was felt to be standard, widely
available, compact and, last but not least, backward compatible with
US-ASCII. Another option would have been MIME-style encoding such as
"=?ISO-8859-1?Q?a?=", which is more flexible (it allows a different encoding
per header entry) but quite noisy and probably far harder to implement than
UTF-8. Maybe this will be reconsidered in a later spec.
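
For comparison, this is roughly what handling a MIME-style encoded word
looks like, sketched with Python's standard email.header module. The sample
value below (an e with acute accent in ISO-8859-1) is made up for the
example, and nothing here is part of STOMP.

```python
from email.header import decode_header

# Hypothetical header value carrying its own charset (ISO-8859-1) and
# transfer encoding (Q, i.e. quoted-printable style).
encoded = "=?ISO-8859-1?Q?=E9?="

# decode_header() returns (chunk, charset) pairs; each chunk still has to
# be decoded with the charset it was labelled with.
parts = decode_header(encoded)
decoded = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in parts
)
```

Every header value has to go through this two-step decoding, which is the
noise and implementation cost mentioned above.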

 > Well, my point is we had (almost) enough already and 1.1 gave us less.
 > How is any binary OCTET sequence worse than any UTF-8 text string?

Because it lacks information on how to decode/interpret it.

Cheers,

Lionel