[rabbitmq-discuss] Bug? STOMP adaptor created queues with invalid UTF-8 characters

Mon Feb 13 23:40:25 GMT 2012

On 13/02/12 20:23, Marek Majkowski wrote:
> On Mon, Feb 13, 2012 at 02:17, Toby Corkindale
> <toby.corkindale at strategicdata.com.au>  wrote:
>> I *think* I've found symptoms of a bug in the RabbitMQ STOMP adaptor.
>>
>> We're using /topic/ subscriptions in our applications, and the STOMP adaptor
>> creates some temporary queues in RabbitMQ.
>> If you subscribe to /topic/foo.bar.123 it'll create a queue of the pattern
>> stomp.dsub.foo.bar.123.XXXXXXXXXXXXXXXX
>> where the Xs are random characters.
>>
>> It recently managed to create some with invalid UTF-8 in the random section
>> of the queue name. This broke the web management interface and the JSON REST
>> API.
>>
>> =ERROR REPORT==== 13-Feb-2012::13:05:44 ===
>> webmachine error: path="/api/queues//"
>> {error,{exit,{ucs,{bad_utf8_character_code}},
>>              [{xmerl_ucs,from_utf8,1},
>>               {mochijson2,json_encode_string,2},
>>               {mochijson2,'-json_encode_proplist/2-fun-0-',3},
>>               {lists,foldl,3},
>>               {mochijson2,json_encode_proplist,2},
>>               {mochijson2,'-json_encode_array/2-fun-0-',3},
>>               {lists,foldl,3},
>>               {mochijson2,json_encode_array,2}]}}
>>
>
> When I think about it, it's actually quite likely to happen.
>
> There are certain codepoints that mochiweb/mochijson refuses
> to encode. (And mochijson is used by the mgmt plugin).
>
> But under the hood RabbitMQ doesn't do any specific parsing for Unicode;
> for Rabbit every string is a valid one.
>
> I know about two character groups which are invalid:
>   - D800–DFFF see:
>      http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters#Surrogates
>   - and FFFE and FFFF, which mochijson just refuses to encode
>      http://en.wikibooks.org/wiki/Unicode/Character_reference/F000-FFFF
>
> I'm not aware of other invalid codepoints.
> Do you know if in your case you used one of these?

We were using strictly ASCII names.
The invalid characters were turning up in the auto-generated part of the 
queue names, from the STOMP adaptor inside RabbitMQ.
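
For anyone wanting to check for the same symptom, here is a rough Python
sketch, not the adaptor's actual code. It tests whether a raw queue name is
strict UTF-8 and whether a decoded name contains the code points mentioned
above (D800-DFFF, FFFE, FFFF). The helper names are made up for illustration,
and ascii_suffix() is only one assumed way of producing a random suffix that
is always ASCII; it is not how the STOMP adaptor generates its names.

```python
import base64
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw queue name decodes as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def has_rejected_code_points(text: str) -> bool:
    """Return True if text contains a surrogate (D800-DFFF) or FFFE/FFFF."""
    return any(
        0xD800 <= ord(ch) <= 0xDFFF or ord(ch) in (0xFFFE, 0xFFFF)
        for ch in text
    )


def ascii_suffix(n_bytes: int = 12) -> str:
    """Hypothetical safe suffix: random bytes as URL-safe base64 (ASCII only)."""
    return base64.urlsafe_b64encode(os.urandom(n_bytes)).decode("ascii")


if __name__ == "__main__":
    # Arbitrary bytes are not necessarily valid UTF-8.
    print(is_valid_utf8(b"stomp.dsub.foo.bar.123.\xff\xfe"))
    print("stomp.dsub.foo.bar.123." + ascii_suffix())
```

Encoding the random bytes before using them in a name keeps the result
printable whatever the random source returns, which is the property the
management plugin's JSON encoder appears to need.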

>> Is this intended behaviour? We've never seen this happen before.
>
> Of course not, but deciding what the intended behaviour actually is
> can be controversial.
>
> In the meantime please avoid using invalid Unicode codepoints :)

See above! :)

Thanks for looking into this,
Toby
