Binary XML.
Clemens Vasters, Ingo Rammer, and Brad Wilson are all debating binary XML.
I think they all agree that for long-term storage you need to leave it as text. For transient messages, Binary XML is a valid option in my mind, especially when you want to send XML to a wireless device.
One thing that Brad mentioned was solving the endianness problem. I'm not sure why he listed this as a con. This has already been solved in the XML 1.0 world. Take a look at UTF-16LE, UTF-16BE, UTF-32LE, etc. The data can start with a byte order mark (BOM), 2 bytes for UTF-16 or 4 bytes for UTF-32, which tells the processor the format and the endianness. In fact most XML parsers are smart enough to not need that BOM, because the first character represented in XML has to be '<'. So I see this as a non-issue.
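To make the BOM point concrete, here is a minimal sketch of how a processor might sniff the encoding and byte order from the first few bytes. The function name sniff_encoding and the fallback to UTF-8 are my own assumptions for illustration, not taken from any particular parser, and a real parser handles more cases than this.

```python
import codecs

# BOMs are checked longest first, because the UTF-32LE BOM
# begins with the same two bytes as the UTF-16LE BOM.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Byte patterns of a leading '<' (0x3C) when there is no BOM.
LEADING_LT = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<", "utf-16-be"),
    (b"<\x00", "utf-16-le"),
]


def sniff_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream (hypothetical helper)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # No BOM: fall back on the layout of the first '<'.
    for prefix, name in LEADING_LT:
        if data.startswith(prefix):
            return name
    # Assumption: anything else is treated as UTF-8.
    return "utf-8"
```

Calling sniff_encoding on "<a/>" encoded as UTF-16BE without a BOM still reports big-endian UTF-16, because the zero byte comes before the '<'.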
I do have a couple of issues, though, after thinking about Binary XML in general:
- Is data still going to be represented as strings? For example, say I have the following fragment: <units>10</units>. To the human eye that is a number. So in my binary serialization do I serialize that as 1 byte for the value, or 2 bytes (assuming UTF-8) for the characters '1' and '0'? Or even 4 bytes because it is a 32-bit integer?
- This relates to the first one: if I do serialize the data in its native representation (e.g. a 4-byte integer), aren't I now requiring an XML Schema for that XML? Or do I have to have some type of embedded pseudo-schema that says 'units is a 4-byte integer'?
Let's say we do continue to save everything in the binary as characters (i.e. 2 bytes for '1' and '0'). What is binary saving us? The <, >, and whitespace?
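As a rough illustration of the sizes in question, here is a small sketch comparing the three candidate encodings of the value 10. The little-endian layout for the 32-bit integer and the element name units (borrowed from the 'units' example above) are assumptions for the sake of the example, not part of any Binary XML proposal.

```python
import struct

value = 10

as_text = str(value).encode("utf-8")   # the characters '1' and '0'
as_byte = struct.pack("B", value)      # a single unsigned byte
as_int32 = struct.pack("<i", value)    # 32-bit signed integer, little-endian (assumed)

sizes = {
    "text": len(as_text),
    "byte": len(as_byte),
    "int32": len(as_int32),
}

# The whole element as plain UTF-8 text, markup included.
# The element name is assumed for illustration.
element = "<units>10</units>".encode("utf-8")
markup_overhead = len(element) - len(as_text)
```

Here sizes comes out as 2, 1, and 4 bytes respectively, while the markup around the value accounts for 15 of the element's 17 bytes. So for small values the tag names, not the data, dominate the size.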
I haven't thought it all the way through yet; these are just some questions that I've raised in my own head. Maybe someone else already knows the answers. [News from the Forest]
5:51:57 PM