If you are good.com, then sending XML to yourself is easy: you can send arbitrary POST payloads using XHR. This, of course, is not an option from evil.com.
I'll document how I got around it. I didn't find anything similar in a bunch of Google queries, but I somehow doubt it's new. I'm sure I've missed an easier way, too; let me know. (Note that I set myself the goal of not involving plugins.)
When submitting a <form> POST, there are three standard form encodings to choose from:
- application/x-www-form-urlencoded - "All characters are encoded before sent (this is default)"
- multipart/form-data - "No characters are encoded. This value is required when you are using forms that have a file upload control"
- text/plain - "Spaces are converted to "+" symbols, but no special characters are encoded"
The first is clearly unsuitable because it does URL encoding: critical XML characters such as <, > and " will get mangled. The second sounds ideal because there is no character encoding, but multipart POST bodies contain separator lines such as ------WebKitFormBoundary2eC9p3Z2xdIQfdTS, so it is useless to us.
The final option will have to do. The encoding of spaces is not ideal, but we could look into using a whitespace-free subset of XML. There's just one catch: the format of the POST body will be a series of name/value pairs:
name1=value1&name2=value2
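To make the comparison concrete, here is a minimal Python sketch of what the first two encodings would do to a small XML payload. The field name x is arbitrary and hypothetical, urllib.parse.urlencode stands in for a browser's URL encoding, and the multipart body is assembled by hand as an approximation rather than captured from a real browser.

```python
from urllib.parse import urlencode

xml = '<element attribute="value">node text</element>'

# application/x-www-form-urlencoded: the XML metacharacters are percent-encoded,
# so the server never sees a literal '<', '>' or '"'.
urlencoded_body = urlencode({"x": xml})

# multipart/form-data: the payload itself is untouched, but it is wrapped in
# boundary and header lines that are not XML. (Hand-built approximation; the
# field name "x" is an assumption for illustration.)
boundary = "----WebKitFormBoundary2eC9p3Z2xdIQfdTS"
multipart_body = (
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="x"\r\n'
    "\r\n"
    f"{xml}\r\n"
    f"--{boundary}--\r\n"
)

print(urlencoded_body)
print(multipart_body)
```

Neither body would parse as the XML document we started with, which is why only text/plain is left to try.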
The trick to save the day here is to use a single name / value pair and abuse the fact that XML is typically full of = characters. So imagine the following XML:
<element attribute="value">node text</element>
The name is everything before the first = (<element attribute) and the value is everything after it ("value">node text</element>). Job done. We could also bury the = in a node value if we didn't want to use attributes.
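The split can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names split_xml_for_form and build_form are made up for this example, and the target URL is a hypothetical placeholder. The sketch splits the payload at its first = and emits a form whose single field reproduces the XML when serialized as name=value.

```python
from html import escape
from typing import Tuple


def split_xml_for_form(xml: str) -> Tuple[str, str]:
    """Split at the first '=' so that name + '=' + value reproduces xml."""
    name, sep, value = xml.partition("=")
    if not sep:
        raise ValueError("payload needs at least one '=' character")
    return name, value


def build_form(action: str, xml: str) -> str:
    """Return HTML for a text/plain form carrying the XML in a single field."""
    name, value = split_xml_for_form(xml)
    # escape() protects the HTML attribute; the browser decodes it again
    # before building the POST body.
    return (
        f'<form action="{escape(action)}" method="POST" enctype="text/plain">\n'
        f'  <input type="hidden" name="{escape(name)}" value="{escape(value)}">\n'
        "</form>"
    )


xml = '<element attribute="value">node text</element>'
name, value = split_xml_for_form(xml)
print(name)   # <element attribute
print(value)  # "value">node text</element>

# Hypothetical target URL, for illustration only.
print(build_form("http://good.com/xml-endpoint", xml))
```

Whether the body that actually goes over the wire matches the XML still depends on how the browser serializes text/plain forms.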
But wait. The spec for the text/plain encoding type specifies that any spaces will be converted to + symbols. This will wreck the space between element name and attribute name and perhaps spoil our fun. It's now down to how the browsers behave. Curiously, it breaks down to WebKit browsers vs. non-WebKit browsers:
- Opera, IE, Firefox: do not URL encode; do not replace space with +
- Chrome, Safari: do URL encode; do replace space with +
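The two behaviours listed above can be modelled roughly in Python to see which payloads survive. This is a sketch under assumptions, not a reproduction of any browser's code: serialize_text_plain is a hypothetical helper, and urllib.parse.quote is only a stand-in for the URL encoding observed in WebKit, since the exact set of characters it escapes isn't documented here.

```python
from urllib.parse import quote


def serialize_text_plain(name: str, value: str,
                         url_encode: bool, space_to_plus: bool) -> str:
    """Model a single name=value pair under the two observed behaviours."""
    def enc(s: str) -> str:
        if url_encode:
            # Stand-in for the browser's URL encoding (assumption); spaces
            # are left alone here and handled by the next step.
            s = quote(s, safe=" ")
        if space_to_plus:
            s = s.replace(" ", "+")
        return s
    return enc(name) + "=" + enc(value)


xml = '<element attribute="value">node text</element>'
name, _, value = xml.partition("=")

# Opera, IE, Firefox as described: no URL encoding, spaces left alone.
non_webkit = serialize_text_plain(name, value, url_encode=False, space_to_plus=False)

# Chrome, Safari as described: URL encoding and space -> '+'.
webkit_like = serialize_text_plain(name, value, url_encode=True, space_to_plus=True)

# A whitespace-free payload with '=' buried in node text survives the
# space -> '+' rule alone, but not URL encoding.
compact = "<element>a=b</element>"
c_name, _, c_value = compact.partition("=")
compact_plus_only = serialize_text_plain(c_name, c_value, url_encode=False, space_to_plus=True)

print(non_webkit == xml)
print(webkit_like == xml)
print(compact_plus_only == compact)
```

Under this model the whitespace-free variant only helps against a browser that replaces spaces without also URL encoding, which is not the WebKit behaviour described above.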
So this trick will work on some browsers but not others. A note on the specifications for this: the most recent document is obviously the HTML5 draft. The relevant section mentions nothing about replacing spaces with + anymore, so either WebKit doesn't support text/plain or it is non-compliant: http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#plain-text-form-data
Thanks to Michal Zalewski for being around to debate ideas!