[rabbitmq-discuss] Cannot send message with STOMP
Lionel Cons
lionel.cons at cern.ch
Fri Apr 27 11:45:44 BST 2012
Steve Powell writes:
> Here is the problem: you are forcing us to escape colons. Why?
Without colon escaping, the header line "a:b:c" is ambiguous and can be
parsed either as (name "a" & value "b:c") or (name "a:b" & value "c").
In STOMP 1.1, the line above is invalid and, depending on the exact header,
it should appear as "a:b\cc" or "a\cb:c".
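
As a rough illustration of the escaping rule above, here is a minimal
Python sketch. It assumes the STOMP 1.1 escape sequences (\c for colon,
\n for newline, \\ for backslash) and follows the reading given in this
message that a header line with more than one raw colon is invalid. The
helper names (escape, unescape, encode_header, decode_header) are
hypothetical and not taken from any existing client library.

```python
# Sketch of STOMP 1.1 header escaping; all helper names are hypothetical.
_ESCAPES = {"\\": "\\\\", "\n": "\\n", ":": "\\c"}
_UNESCAPES = {"\\\\": "\\", "\\n": "\n", "\\c": ":"}


def escape(text):
    """Escape backslash, newline and colon in a header name or value."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape(text):
    """Reverse escape(); reject unknown or truncated escape sequences."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            pair = text[i:i + 2]
            if pair not in _UNESCAPES:
                raise ValueError("invalid escape sequence: %r" % pair)
            out.append(_UNESCAPES[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def encode_header(name, value):
    """Build one header line; the only raw colon left is the separator."""
    return escape(name) + ":" + escape(value)


def decode_header(line):
    """Split a header line on its single raw colon and unescape both parts."""
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("missing colon separator")
    if ":" in value:
        # Assumption from this message: a second raw colon is ambiguous.
        raise ValueError("ambiguous header line: %r" % line)
    return unescape(name), unescape(value)
```

With these helpers, encode_header("a", "b:c") and encode_header("a:b", "c")
produce two different lines, so the receiver can recover the original
name and value without guessing.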
> I really don't understand this. 1.1 limits headers to UTF-8 sequences.
> There is no conceivable benefit in doing this -- all binary sequences
> could have been used (with escapes for a few) but you chose to restrict
> it.
This came from the assumption that headers are usually a table of text
strings (as opposed to binary strings). With Unicode, most text strings in
most languages can be expressed. With UTF-8, a standard encoding is defined
and it is backward compatible with US-ASCII, which many people use.
The reason behind this being in the spec is that the broker may need to
understand (= decode) what is in the header, for instance for envelope-based
routing (e.g. a topic exchange or JMS-style message selectors). If the
encoding is not defined, how do I know how to encode/decode the lowercase e
with acute accent? Should it be 0xE9 (ISO-8859-1) or 0xC3A9 (UTF-8) or
0x00E9 (UTF-16)?
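
The ambiguity can be reproduced with Python's built-in codecs. This is
only a sketch to make the three byte sequences above concrete; the
variable names are illustrative, and "utf-16-be" is used so that no byte
order mark is added.

```python
# The same character, U+00E9, under the three encodings mentioned above.
e_acute = "\u00e9"

latin1_bytes = e_acute.encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = e_acute.encode("utf-8")         # two bytes: 0xC3 0xA9
utf16_bytes = e_acute.encode("utf-16-be")    # two bytes: 0x00 0xE9

# Decoding with the wrong assumption either yields different text...
misread = utf8_bytes.decode("iso-8859-1")    # two characters, not one

# ...or fails outright, because a lone 0xE9 is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

A broker that receives only the raw octets has no way to tell which of
these cases applies unless the encoding is fixed by the specification.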
UTF-8 was not the only solution but it was felt to be standard, widely
available, compact and, last but not least, backward compatible with
US-ASCII. Another option would have been MIME-style encoding such as
"=?ISO-8859-1?Q?a?=" which is more flexible (it allows per-header-entry
encoding) but quite noisy and probably much harder to implement than UTF-8.
Maybe this will be reconsidered in a later spec.
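
For comparison, here is a small sketch of what handling a MIME-style
encoded word involves, using decode_header from Python's standard
email.header module. The wrapper name decode_mime_word is hypothetical,
and the sketch assumes the input is a single encoded word with an
explicit charset. Nothing like this is part of STOMP 1.1.

```python
from email.header import decode_header


def decode_mime_word(word):
    """Decode one RFC 2047 encoded word such as '=?ISO-8859-1?Q?=E9?='."""
    # decode_header returns a list of (bytes, charset) pairs; we assume
    # exactly one encoded word, so only the first pair is used.
    raw, charset = decode_header(word)[0]
    return raw.decode(charset)


# Each header entry carries its own charset, which is the flexibility
# (and the noise) referred to above.
latin1_text = decode_mime_word("=?ISO-8859-1?Q?=E9?=")
utf8_text = decode_mime_word("=?UTF-8?Q?=C3=A9?=")
```

Both calls yield the same one-character string, but every receiver has
to parse the charset and transfer encoding of every header entry.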
> Well, my point is we had (almost) enough already and 1.1 gave us less.
> How is any binary OCTET sequence worse than any UTF-8 text string?
Because it lacks information on how to decode/interpret it.
Cheers,
Lionel