No more "XML parsing failed" errors
By Andreas Bovens (andreasbovens). Wednesday, September 28, 2011 2:27:39 PM
Through our Open The Web effort, Opera's Developer Relations team contacts websites that don't work as expected when viewed in Opera, and we suggest changes to their source code that fix the problem.
One of the more common problems we've encountered is sites that somehow serve broken XML to Opera. You can see this on http://www.sharebuilder.com/, http://allhiphop.com/, http://www.excalibur.com/, http://home.mcafee.com/Default.aspx, just to name a few.
The cause is broken server setups: when a browser identifies as Opera, certain sites seem to send their content with a MIME type of application/xhtml+xml, whereas they send the same content to other browsers with a MIME type of text/html. Why they do this is unclear, but it certainly has to do with broken server-side browser sniffing. We've identified that the issue occurs on certain versions of ASP.NET, but there are also other examples where only Opera gets application/xhtml+xml, while other browsers get text/html.
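One way to check whether a site behaves like this is to request the same URL with two different User-Agent headers and compare the declared media types. The following is a minimal sketch using Python's standard urllib; the User-Agent strings are illustrative placeholders and the helper names are made up for this example, not part of any Opera tooling.

```python
from urllib.request import Request, urlopen

# Illustrative User-Agent strings (assumptions for this sketch only).
OPERA_UA = "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.9.168 Version/11.51"
OTHER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20100101 Firefox/7.0"


def build_request(url, user_agent):
    """Create a GET request that identifies as the given browser."""
    return Request(url, headers={"User-Agent": user_agent})


def media_type(content_type_header):
    """Strip parameters such as charset and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def served_media_type(url, user_agent):
    """Fetch the URL and return the media type the server declared."""
    with urlopen(build_request(url, user_agent), timeout=10) as response:
        return media_type(response.headers.get("Content-Type", ""))


def is_sniffing_victim(url):
    """True if only the Opera-identified request gets application/xhtml+xml."""
    return (served_media_type(url, OPERA_UA) == "application/xhtml+xml"
            and served_media_type(url, OTHER_UA) == "text/html")
```

Calling is_sniffing_victim() needs network access, and a site may have been fixed since, so treat the result as a hint rather than proof.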
Now, application/xhtml+xml content would not give any trouble if the XHTML code on these sites were well-formed, but unfortunately, mistakes are easily made: at the smallest error in an XML document, Opera (and other browsers) will throw an "XML parsing failed" message, and a link to "Reparse document as HTML". For non-technical users, this is very confusing, and for advanced users, it's a nuisance.
Over the last couple of years, we've contacted all the sites we found breaking like this, and asked Microsoft to fix the ASP.NET sniffing problem, which they've done. However, as the rollout of this update takes time, a number of sites are still broken, and it seems this situation will be with us for quite some time to come.
Hence, we've decided to stop throwing draconian "XML parsing failed" error messages, and instead attempt to reparse the document automatically as HTML. The error is no longer shown in the browser window but is written to the console, so as a developer, you can still find XML parsing errors in Opera Dragonfly and the Error Console if you want to.
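To see how little it takes to trip an XML parser, here is a small sketch using Python's standard xml.etree.ElementTree (not Opera's parser). The function name first_xml_error is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def first_xml_error(markup):
    """Return the (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # The parser stops at the first error; nothing after it is processed.
        return err.position
    return None


# Well-formed: the empty element is explicitly closed.
print(first_xml_error("<p>Hello<br/>world</p>"))

# Not well-formed: a bare <br>, which an HTML parser would accept.
print(first_xml_error("<p>Hello<br>world</p>"))
```

The first call reports no error; the second reports a position on line 1, even though the same markup is unremarkable when served as text/html.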
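The fallback described above can be sketched roughly as follows. This is an illustration built on Python's standard library, not Opera's actual implementation; the names load_document and TextCollector are hypothetical, and a plain list stands in for the error console.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Lenient HTML pass that just gathers the visible text chunks."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def load_document(markup, console):
    """Try strict XML first; on failure, log the error and reparse as HTML."""
    try:
        root = ET.fromstring(markup)
        return "xml", [t.strip() for t in root.itertext() if t.strip()]
    except ET.ParseError as err:
        # The error goes to the console instead of an error page.
        console.append(f"XML parsing failed: {err}")
        collector = TextCollector()
        collector.feed(markup)
        collector.close()
        return "html", collector.chunks


console = []
mode, text = load_document("<html><body><p>Hello<br>world</p></body></html>", console)
print(mode, text, console)
```

The page content is still returned in HTML mode, and the parsing error remains available to anyone who looks at the console.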
If you want to play around with this, grab the latest Opera Next build from the Opera Desktop team blog (or wait for the automatic update), and let us know what you think.

Charles SchlossChas4 # Wednesday, September 28, 2011 4:10:34 PM
Swapnil RustagiSwapnil99pro # Wednesday, September 28, 2011 4:15:02 PM
Failed test - Test 80 failed: XML well-formedness error didn't stop script from executing
Michael A. Puls IIburnout426 # Wednesday, September 28, 2011 4:21:50 PM
Originally posted by Swapnil99pro:
Opera will probably have to have a default site preference for acidtests.org to turn this feature off. Or, that test will have to be removed from acid3 (some tests were recently removed already).
Igorigorditerni # Wednesday, September 28, 2011 4:59:33 PM
Ciao, Igor
Mağruf ÇolakoğluZAHEK # Wednesday, September 28, 2011 5:00:32 PM
DillonAstrophizz # Wednesday, September 28, 2011 5:34:23 PM
Originally posted by burnout426:
I'm curious how the other browsers that are less strict about XML parsing pass that part of the test.
FransFrenzie # Wednesday, September 28, 2011 6:38:31 PM
Originally posted by Dillon:
What other browsers are less strict about XML parsing? Firefox throws the most unfriendly error imaginable, Webkit renders up to the encounter of the error and IE gives a proper error (but of course not with XHTML). Opera's now the lenient outlier.
Martin RauscherHades32 # Wednesday, September 28, 2011 6:40:36 PM
DillonAstrophizz # Wednesday, September 28, 2011 7:26:14 PM
Originally posted by Frenzie:
Well, Firefox, for example, doesn't take you to an error page when, say, visiting http://www.peiwei.com, but then I haven't checked whether they're only sending the XHTML header to Opera. Maybe I misunderstood, but I thought this XML change was meant to automatically bypass that error page, an error page that I don't see in other browsers, except in IE sometimes.
Mustafa OğuzinBusteR # Wednesday, September 28, 2011 7:51:27 PM
Pieter DavelooseDaveloose # Wednesday, September 28, 2011 8:17:22 PM
We used to hard-code the MIME type in our asp.net applications!
Thanks!!
Andrei Danieloperaterrestrial # Wednesday, September 28, 2011 8:25:25 PM
FransFrenzie # Wednesday, September 28, 2011 9:25:37 PM
Originally posted by Dillon:
I'm talking about pages where Firefox and Opera actually receive the same thing. My point is XML parsing is strictly — and draconically — defined, so all browsers do the same thing. Except Opera now, that is.
Rijk # Wednesday, September 28, 2011 11:02:01 PM
Opera users got far more of these error pages than users of the other browsers, for reasons explained in this blog post.
Cutting Spoonhellspork # Thursday, September 29, 2011 1:14:02 AM
FransFrenzie # Thursday, September 29, 2011 7:35:30 AM
Originally posted by Rijk:
I've always considered Opera's strategy with the reparse as HTML option superior to especially what Firefox does; I'm merely pointing out that other browsers aren't more lenient.
Swapnil RustagiSwapnil99pro # Thursday, September 29, 2011 9:49:35 AM
Originally posted by burnout426:
Oh yeah, now I understand-
Test 80 failed: XML well-formedness error didn't stop script from executing
The above test means that XML code with mistakes should trigger "XML parsing failed" and should not execute the script. But because Opera changed this behavior, XML code with mistakes is reparsed as HTML and the test script on Acid3 can run. So passing this test isn't necessary, right? But I have some questions-
1. Does this affect a website with fully correct XML code?
2. Is this change the right decision with regard to other browsers?
Michael A. Puls IIburnout426 # Thursday, September 29, 2011 10:52:02 AM
Originally posted by Swapnil99pro:
No. Not at all. What this does is reparse markup as HTML that should have been sent as text/html in the first place. Because of browser sniffing, Opera often gets pages like this while other browsers do not.
Opera already allowed you to do the reparse. Now it's just automatic, which makes a lot of sense since everyone that wanted the page to work clicked the reparse link anyway.
The only problem with this is that when you encounter real xml markup that was meant to be sent as application/xml or application/xhtml+... but has an error, Opera will no longer show you an error message telling you something is wrong, which is useful when authoring an xml file and useful for error detection so you can report the problem to the site. Also, a real xhtml page with an error that uses scripting that is reparsed as HTML will load, but might not behave properly, which can give false expectations.
Browser vendors have argued about the pros and cons of this for years. We'll just have to see if this takes off.
Igorigorditerni # Thursday, September 29, 2011 4:57:50 PM
Originally posted by Rijk:
It is the XHTML standard. An error page with a countdown of a few seconds, followed by an automatic reload that reparses as HTML, could be a better and correct solution.
Ciao, Igor
Rijk # Thursday, September 29, 2011 5:54:27 PM
QuHno # Thursday, September 29, 2011 6:04:51 PM
"This page uses broken XML but we will try to reparse it as HTML with error correction in [5s countdown] second(s) to retrieve all content possible.
If you want to see the full error message instead: [Click here]"
Igorigorditerni # Thursday, September 29, 2011 6:27:53 PM
Originally posted by QuHno:
This can be good and clear for all, as I suggested too.
Originally posted by Rijk:
But the error has to be seen; no one can see it unless they have opened the error console. It is also not instructive to reparse automatically, because people need to know about the bad work of some webmasters instead of saying that Opera is not working well, and a correctly made XHTML website can work badly if reparsed. So the reparse has to be clearly declared (it could also be a pop-up message or a bar at the top of the page, like the save password bar).
Ciao, Igor
FransFrenzie # Thursday, September 29, 2011 6:29:11 PM
Originally posted by QuHno:
If you go that route I think it'd be better to show it as a non-interrupting notification (akin to the save password dialog).
QuHno # Thursday, September 29, 2011 7:54:40 PM
(Sorry for the x-post double-rant, but I think it is important)
Marco Maierschwiebie # Thursday, September 29, 2011 7:58:53 PM
Originally posted by QuHno:
That's the way Opera did it in the past. OK, without a countdown.
That couldn't be the solution. Most Internet users aren't geeks like us. When a site doesn't load, it's a browser problem to them: other browsers obviously don't have this problem, so a normal user switches.
I know it's a dilemma, but I think it's the best solution.
Could there be anything better for 'normal' users?
QuHno # Thursday, September 29, 2011 8:26:11 PM
It looked a little bit like a blue screen of death, but in gray and with a cut-off big red O in the top right corner, the same as other internal pages, so it must be an internal parser mistake, because there was no hint that it was the page that caused it and not the parser that broke.
[retry][abort][ok]</windowsjoke>
The changed message bears a different and clear non-tech message:
The page, not this browser is broken. We will do our best to repair it in some seconds. This is just to inform you, no need to worry, all will be good in some seconds (hopefully).
... with an option to get the tech version for all who are interested in it. IMHO great psychological difference.
Please read the above with the eyes of a normal user - so to say a user who will never look at this blog post or these comments
SteveKong # Thursday, September 29, 2011 8:34:54 PM
Consider this: Would we have actually identified those non well-formed sites? Honestly, I don't think so. Probably the ASP bug wouldn't have been fixed either.
Marco Maierschwiebie # Thursday, September 29, 2011 8:56:22 PM
Originally posted by QuHno:
I'm also a software developer ... no one reads such a message. There's an error that other browsers don't have ... that's it.
I understand your intention ... but people don't behave that way.
Change the people ... or change the browser.
I appreciate that Opera takes the easier way ...
QuHno # Thursday, September 29, 2011 9:36:53 PM
Originally posted by schwiebie:
... and it recovers automatically, which other browsers obviously can't - at least Fx and IE didn't the last time I saw such an "XML parsing failed" message.
Advertisement in big, neon shining letters:
Opera is the first (and only) browser that can automatically recover from XML parsing errors in just 5s!
DillonAstrophizz # Thursday, September 29, 2011 10:05:26 PM
Originally posted by QuHno:
Then someone else will do it in 4 seconds, or instantly
QuHno # Thursday, September 29, 2011 10:11:43 PM
... after all are doing it we just need an option for those people that are looking for bugs so that they don't need to have the error console constantly popping up, just a tiny option in opera:config like:
Show XML parsing errors in tabs [X]
... and everybody is happy.
Hm, could we just get this option before all others do it?
Igorigorditerni # Thursday, September 29, 2011 10:50:06 PM
Originally posted by schwiebie:
But the error is in the MIME type sent to the browser, not in the browser, so why not create a workaround for those sites using something like browser.js? The other browsers will continue to show the error page for erroneous pages when the MIME type is "application/xhtml+xml", and Opera will not.
Cutting Spoonhellspork # Friday, September 30, 2011 12:31:26 AM
A good placement for the seldom-used "report a site problem" help option.
zoquete # Friday, September 30, 2011 12:46:37 AM
QuHno # Friday, September 30, 2011 4:05:02 AM
Originally posted by hellspork:
+1 for all pages that were repaired, vanishing after some seconds just like it is now with the "Wand Store Toolbar"
Additionally, an "I don't care" button that hides it forever for that page, or globally for all who don't care, and an "XML parsing failed, click [here] to view the error page" link if it was an XML parsing error.
This way everyone is happy:
Those who don't care never have to see it again, all others can see if a page was corrected and can report still occurring errors with one click.
Some more ideas:
For the error infobar I could even imagine something like: "You can mark the problematic area in the page and write something about the nature of the problem." to make the work for the poor guys who have to write the browser.js easier.
If the XML is b0rked, wouldn't it be better to make a second request with a masked UA (saying in the comments section that this is done because of the WAI) to test whether there is such a version for, e.g., Firefox (or any other arbitrary header), and if so, retrieve that page?
That could be better sometimes than re-parsing a broken page, which might fail anyway if not only the MIME-type, but the content too was changed due to wrong browser sniffing.
FransFrenzie # Friday, September 30, 2011 8:12:02 AM
Originally posted by QuHno:
Agreed.
Rijk # Friday, September 30, 2011 9:06:11 AM
Originally posted by igorditerni:
Because there's too much of this, so the browser.js/override.ini solution doesn't scale.
Originally posted by zoquete:
Because users blame the browser when pages break, especially when it seems to work fine on other browsers.
zoquete # Saturday, October 1, 2011 3:12:11 AM
Originally posted by Rijk:
Sorry, but tell me: in which other browsers aren't XML parsing errors displayed? Thanks a lot.
Swapnil RustagiSwapnil99pro # Saturday, October 1, 2011 8:28:43 AM
Originally posted by zoquete:
No, there aren't any browsers where XML parsing errors are not displayed, except Opera. But Mr. Rijk is not talking about displaying XML parsing errors; he is talking about this -
Websites give XML/XHTML content to Opera and HTML content to other browsers. Very minor HTML coding mistakes are handled by these other browsers, and they don't need to get into the XML matter at all. Opera, which does not get HTML code, has to parse the content as XML. Unlike HTML, even minor mistakes in XML code trigger "XML parsing failed". Average users then think that the site works in other browsers and not in Opera, because they don't know that Opera is rendering XML code while other browsers are rendering HTML. So automatic reparsing of broken XML code as HTML was introduced just for the normal users of Opera.
Michael A. Puls IIburnout426 # Saturday, October 1, 2011 11:40:00 AM
Originally posted by Swapnil99pro:
Which makes perfect sense as it's often just the case that the server is telling Opera the HTML markup is XML when it's not. Parsing it as HTML (as originally intended by the author that made a browser-sniffing mistake either directly or indirectly) is the right thing to do.
If there's an error, it might be possible to detect that the markup looks like HTML, do the reparsing only in that case, and still show an error in other cases, but it could be difficult to make that detection correct and robust. The way Opera does it is just easier to code.
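For illustration only, here is a rough sketch of what such a "looks like HTML" check might consider. It is not how Opera works; the function name and the choice of signals (declared MIME type, the XHTML namespace, the document element name, and the presence of typical HTML elements) are assumptions made for this example, and it relies on a plain string search so it still works when the XML parser gave up early.

```python
import re

XHTML_NS = "http://www.w3.org/1999/xhtml"
HTML_HINTS = ("head", "body", "title", "meta")


def looks_like_html(source, mime_type):
    """Guess whether markup that failed XML parsing was meant to be HTML."""
    if mime_type == "application/xhtml+xml":
        return True  # the server already claims some kind of HTML content
    if XHTML_NS in source:
        return True
    # First start tag in the source; <!DOCTYPE ...> and <?xml ...?> are skipped
    # because the name must begin with a letter.
    first_tag = re.search(r"<([A-Za-z][\w:.-]*)", source)
    if not first_tag or first_tag.group(1).lower() != "html":
        return False
    # A document element named "html" alone is weak evidence, so also
    # require at least one other typical HTML element.
    lowered = source.lower()
    return any(re.search(rf"<{name}[\s>/]", lowered) for name in HTML_HINTS)


print(looks_like_html("<html> <zipzambam></zipzambat> </html>", "application/xml"))
```

With these assumptions, generic XML that merely uses html as its document element name still gets an error page, at the cost of a heuristic that comments or unusual markup can fool.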
Robert MeijersRobert90 # Saturday, October 1, 2011 3:54:56 PM
Originally posted by burnout426:
You'll know the difference between application/xhtml+xml and application/xml, right? It's all in the name
If the above comment from Swapnil Rustagi about Acid failing in an XML test is right (not running next so can't test, and not in the mood to read the acid test), it's Opera's fault, no matter how you look at it. If the content type is application/xml Opera shouldn't be reading it as HTML if the markup is faulty (because it's just plain XML, and not HTML packaged as XML). If the content type is application/xhtml+xml, I still think Opera should give some kind of error (and I agree the current/previous message is not user friendly), but now they decided to just ignore the fault (who looks at the error console anyway?) and reparse it as HTML.
proghead # Saturday, October 1, 2011 4:00:43 PM
Michael A. Puls IIburnout426 # Saturday, October 1, 2011 5:58:30 PM
Originally posted by Robert90:
I'm specifically talking about a page like this:
    <!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="utf-8">
    <title></title>
    </head>
    <body>
    <p><input></p>
    </body>
    </html>
that ends up being served to Opera as text/xml, application/xml or application/xhtml+xml. In the case of application/xhtml+xml, you can pretty much assume that it's some kind of HTML content (although the assumption could still be wrong) and reparse as HTML if there's an error. But with application/xml and text/xml, you could have a real XML file that just happens to have an html element (the document element in this case). Without xmlns="http://www.w3.org/1999/xhtml" somewhere (it shouldn't be there for content meant to be served as text/html unless you're using a polyglot document), you're making a much bigger assumption based on just the document element having a name of "html". So, if you want to be more confident that it's really HTML content, you might want to also look at other things in the document (like the presence of head, body, title, meta) before you reparse. But if there's a parse error really early, there's not going to be much to look at (unless Opera does a string search on the source). So, yeah, it's possible to limit the reparsing to pages that look like HTML. But if I have a real XML file like:
    <html>
    <zipzambam></zipzambat>
    </html>
I'd still want Opera to show that there's a parsing error. Given that, it's just easier to have reparsing for all XML files with an option to turn it off when developing/testing. But maybe everyone is fine with assuming that XML markup (even like that) is HTML where Opera should reparse on error. That would allow Opera to limit the reparsing to things that look like HTML.
Igorigorditerni # Saturday, October 1, 2011 9:26:52 PM
Originally posted by burnout426:
I have a test XHTML page on one of my websites, written in really valid XHTML, that contains a simple, stupid error on purpose, to show the error page and teach people the right way for a browser to display it. I have always suggested using Opera for the best result... I will have to delete that suggestion if the page is displayed in Opera 12 (as it is now with this version).
Ciao, Igor
QuHno # Saturday, October 1, 2011 9:35:42 PM
XML in a square, just like the icon for RSS feeds. If it is visible, all "knowing" people would know that something smells rotten, and a click on it could open the error console, hopefully showing the same detailed error message as the former error page.
That would be unobtrusive IMHO.
I believe if I think long enough I can come up with several other ideas for showing in an unobtrusive way that the page was reparsed, to ease the lives of people who want to know what's wrong.
Rijk # Sunday, October 2, 2011 12:35:05 AM
Note: I'm not involved in testing or developing this feature!
QuHno # Sunday, October 2, 2011 1:27:24 AM
Imagine an extension or some UserJS that trusts in a certain structure of a webpage. Now imagine a server, that sends Opera a broken X(HT)ML page. Opera silently re-parses it and you know what the JS will see after an error correction. Sometimes that has not much to do with the structure intended by the author of the page.
The extension fails. The user has no hint why and writes a bug report to the extension or UserJS author. If there were a hint, the user would at least have a chance to recognize that something else broke.
Of course X(HT)ML pages should be valid, but several of these pages are compiled by a CMS, and you know that there is at least one coding error in every 1000 lines of code, and error correction in Opera is (still) a kind of sophisticated guesswork (yes, I know that will change in the future when the HTML5 parser is up and running, thanks to the well-defined error correction that is part of the spec).
BTW: Re-parsing on speculation that broken X(HT)ML is HTML content with the wrong MIME Type is like using the millions of users as guinea pigs too
edit:
PS: Another case happened to me today - some syndicated content not under my direct control suddenly had an error. The page itself was valid XHTML strict before (even Schema validated). Opera 12 in its clean-install default settings silently switched to HTML and failed miserably in displaying the page. It would have been better to get the error or at least a hint why it failed. There was nothing, in the default install I even had to open the error console manually.
My very personal opinion:
AFAIK the spec says that an error must be shown. An error message I can't see in the default settings without my active interference is IMHO not shown.
Is some half hidden 16*16px indicator really too much? Would it really annoy millions of Opera users?
If not: Can we have it? I think it would make several thousands of developers happy
SteveKong # Sunday, October 2, 2011 11:12:53 AM
Additionally, I have to go along with Michael A. Puls II: it has to be ensured that non-well-formed 'ordinary' XML is not accidentally parsed as HTML.
Robert MeijersRobert90 # Sunday, October 2, 2011 11:59:03 AM
SteveKong # Sunday, October 2, 2011 3:14:29 PM
Robert Meijers' suggestion sounds pretty smart. Dear developers, what do you think about that strategy?