Changes to the behavior of innerHTML in XML documents
By Patrick H. Lauke (patrickhlauke). Thursday, September 13, 2012 4:20:19 PM
In the run-up to the next stable desktop release, you'll notice that a lot of changes are being made to our browser's core engine. Although here on the Developer Relations blog we usually just cherry-pick and explain some of the shinier new additions that fall under the big "HTML5" umbrella, there are also quite often tiny improvements under the hood that remove bugs and browser incompatibilities that don't get much notice...until sites that somehow relied on our previous behaviour start to misbehave.
A recent example of this is the rather innocently titled CORE-4336: Setting innerHTML in XML, which was recently shipped in one of our Opera Next snapshots.
Being a hip and happening developer, you may be thinking "XML? HTML5 is where it's at!"...so it may come as a shock to you that in the vast reaches of the web, there are still a sizeable number of sites that use XML/XHTML.
While the release of Opera 12.10 is still a bit away, one of Opera's products that already does include many of these core changes, including CORE-4336, is the newly released Opera Device SDK 3.4, which is used by many TV and set-top box device manufacturers to provide web browsing functionality. And it's on this platform that some of our customers have started to report issues with the latest SDK relating to this particular core change.
In previous versions of Opera's core, innerHTML was quite forgiving when injecting markup into XML/XHTML documents. As with regular HTML pages, if a script tried to add malformed content, Opera silently error-corrected the injected fragment according to its HTML parsing algorithm.
To see this in action, here's a simple test case using innerHTML to inject the classic <b><i>...</b></i> set of misnested tags into an XHTML document. If you take a peek at the DOM after the page has loaded, you'll see how the misnested tags have been silently fixed in the current stable version of Opera, while the same test case will fail in other browsers such as Chrome and Firefox.
Following the fix to CORE-4336, Opera's core is now aligned with the stricter behavior of other browsers, which has been formally specified in WHATWG's DOM Parsing and Serialization:
In the case of an XML document, [innerHTML] will throw an INVALID_STATE_ERR if the Element cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.
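To make the stricter behaviour concrete, here is a minimal sketch of guarding an innerHTML assignment so that a malformed fragment is reported rather than left as an uncaught exception. The helper name trySetInnerHTML and the element id "target" are assumptions made up for this illustration; they are not part of Opera's code or of the specification. The check on the error assumes the legacy numeric code for SYNTAX_ERR (12) or the "SyntaxError" exception name used by newer engines.

```javascript
// Hypothetical helper (not from the spec or from Opera): attempts the
// assignment and returns a result object instead of letting a syntax
// error propagate.
function trySetInnerHTML(element, markup) {
  try {
    element.innerHTML = markup;
    return { ok: true, error: null };
  } catch (err) {
    // Legacy DOM exceptions expose code 12 (SYNTAX_ERR);
    // newer engines expose name "SyntaxError".
    const isSyntaxError =
      err != null && (err.name === "SyntaxError" || err.code === 12);
    if (!isSyntaxError) {
      throw err; // anything else is unexpected, so re-throw it
    }
    return { ok: false, error: err };
  }
}

// Browser-only usage; the id "target" is an assumption for this sketch.
if (typeof document !== "undefined") {
  const target = document.getElementById("target");
  if (target) {
    const result = trySetInnerHTML(target, "<b><i>misnested</b></i>");
    if (!result.ok) {
      // In an XHTML document served as XML, this branch is expected to run.
      target.textContent = "Could not inject markup: fragment is not well-formed.";
    }
  }
}
```

In a regular HTML document the same call would be expected to succeed, because the HTML parser error-corrects the fragment instead of throwing.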
In the long run, this fix will ensure greater cross-browser compatibility...but obviously, if your XHTML sites start to misbehave and throw errors as a result of this change, the best advice we can give is to sanitise injected markup fragments so that they're well-formed XHTML (no misnesting, correct use of quotes around attributes, etc.). If this is not possible, a short-term (though admittedly quite inelegant) fix would be to change your site from XHTML to HTML.
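As a rough illustration of the two problems named above (misnesting and unquoted attributes), the following sketch pre-checks a fragment before it is injected. The function name looksWellFormed is hypothetical, and this is only a heuristic under simplifying assumptions: it ignores comments, CDATA sections, entities, stray "<" characters and ">" inside attribute values, so the browser's XML parser remains the real authority on well-formedness.

```javascript
// Rough heuristic only, NOT a full XML parser: checks that tags are
// properly nested and that every attribute value is quoted.
function looksWellFormed(fragment) {
  // Captures: closing slash, tag name, raw attribute text, self-closing slash.
  const tagPattern = /<(\/?)([A-Za-z_][\w:.-]*)([^<>]*?)(\/?)>/g;
  // Zero or more name="value" or name='value' pairs.
  const attrsPattern = /^(\s+[A-Za-z_][\w:.-]*\s*=\s*("[^"]*"|'[^']*'))*\s*$/;
  const openTags = [];
  let match;
  while ((match = tagPattern.exec(fragment)) !== null) {
    const [, closing, name, attrs, selfClosing] = match;
    if (closing) {
      // A closing tag may not carry attributes or a trailing slash,
      // and must match the most recently opened tag (XML is case-sensitive).
      if (attrs.trim() !== "" || selfClosing) return false;
      if (openTags.pop() !== name) return false;
    } else {
      if (!attrsPattern.test(attrs)) return false;
      if (!selfClosing) openTags.push(name);
    }
  }
  // Anything still open at the end was never closed.
  return openTags.length === 0;
}

console.log(looksWellFormed("<b><i>text</i></b>")); // true
console.log(looksWellFormed("<b><i>text</b></i>")); // false (misnested)
console.log(looksWellFormed("<a href=x>link</a>")); // false (unquoted attribute)
```

A check like this can only flag obvious problems early; fragments that pass it may still be rejected by the parser, so catching the exception from innerHTML is still advisable.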

lucideer # Thursday, September 13, 2012 9:00:54 PM
Originally posted by patrickhlauke:
This seems to imply that the "sizeable number of sites" are somehow behind the times and "out of touch" with HTML5, whereas anyone familiar with the spec. will be well aware XML is a *part* of HTML5.
Patrick H. Lauke (patrickhlauke) # Thursday, September 13, 2012 9:14:26 PM
lucideer # Saturday, September 15, 2012 1:25:03 AM
Originally posted by patrickhlauke:
I'm not sure what you mean by this. The XML Spec. may not be, in its entirety, included within or subsumed by HTML5, obviously, but the HTML5 specification is itself entitled "A vocabulary and associated APIs for HTML and XHTML". XML is most definitely a considered part of the HTML5 spec.
Originally posted by patrickhlauke:
This is technically incorrect; XHTML5 is a buzzword rather than a separate specification in itself. The term isn't referenced anywhere in any spec. that I've read so far (although one could certainly argue that HTML5 itself is a buzzword). The XML serialization of HTML5 (which is within the same HTML5 spec.) explicitly states that a DOCTYPE (of any kind) is neither required nor recognised, as XML processors aren't required to query or parse it.
Patrick H. Lauke (patrickhlauke) # Saturday, September 15, 2012 9:14:42 AM