New adventures in compatibility testing
Wednesday, December 2, 2009 1:12:03 PM
The bug I'm investigating is a small and ugly one hiding in the document.getElementsByName() implementation - getElementsByName('someID') will find an element with id="someID".
This is of course bad behaviour. That method has nothing to do with IDs and should find elements by name only.
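To make the difference concrete, here is a minimal sketch of the two behaviours. It is a model only: the elements are plain objects rather than real DOM nodes, and the names byNameCorrect and byNameBugCompatible are made up for this illustration.

```javascript
// Hypothetical stand-ins for DOM elements: plain objects with id and name.
var elements = [
  { tag: 'input', id: 'someID', name: null },
  { tag: 'input', id: null, name: 'someID' },
  { tag: 'div', id: 'other', name: 'other' }
];

// Expected behaviour: match on the name attribute only.
function byNameCorrect(list, name) {
  return list.filter(function (el) { return el.name === name; });
}

// Bug-compatible behaviour: an element whose id matches is returned too.
function byNameBugCompatible(list, name) {
  return list.filter(function (el) {
    return el.name === name || el.id === name;
  });
}
```

Under this model a lookup for 'someID' returns one element from the first function and two from the second. The extra element is the one that only has a matching id.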
The good news is that it's trivial to fix. The bad news is that it's there for a reason, and the reason is called Internet Explorer. We've been bug-compatible on purpose and while we'd like to remove the bug we have no idea how many sites will break if we do!
So, I'd like an answer to questions like these:
- How many sites use getElementsByName() to find elements with an ID?
- Do these sites break if we fix the bug?
- Do they have alternate code paths for browsers doing it right? If yes, how do they figure out what code to use?
Tools at our disposal: the MAMA web code search engine (an internal Opera project), User JavaScript, and two ad-hoc Opera Unite services.
MAMA tracks sites that might be using document.getElementsByName(). It knows about roughly 45 000 sites where it has seen the string "getElementsByName" in script source code, and it generously provides 5000 random ones in a text file on my request. Naturally, MAMA only does static analysis of the scripts, so it can't tell whether the method is actually called or what it was used for.
That information is a piece of cake to get with User JavaScript. A trivial custom script, trackGEBNabuse.js, overwrites the getElementsByName() method with one that will do a bit of debugging and logging on our behalf. And I'm playing with Opera Unite for the first time, with one logging service and one URL player that keeps track of which of the 5000 URLs were already visited and sends Opera to the next one.
(Opera Unite actually rocks! It's fun to write backend-type logic in JavaScript rather than PHP, and it's less hassle while developing to keep all the information, URL lists, log files and scripts locally on the hard drive. I've been undecided about Unite, not sure if it was more important than all the other things we should be spending time on - now I see it's maturing and making itself useful. Nice.)
To walk you through the main logic of things - here's the user JS that overwrites the native method to do logging - commented:
(function(gebn){ /* "gebn" is a reference to the actual, native function */
    document.getElementsByName=function(name){ /* overwrite the real one */
        var elementList=gebn.apply(this, arguments); /* call the native function, record the list it returns */
        /* we want to know if anything in the elementList is there due to a matching id rather than a matching name */
        var abuse=[];
        for(var i=0,elm;elm=elementList[i];i++){ /* go through all returned elements */
            if( elm.getAttribute('name')!==name )abuse.push(elm.outerHTML); /* we found one that's probably in the list because of an ID attribute! */
        }
        if(abuse.length>0){
            /* log errors to some server... */
            (new Image()).src='http://hr-opera.hallvors.operaunite.com/logger/logGEBN/?data='+encodeURIComponent(abuse.join(', '))+'&href='+encodeURIComponent(location.href);
        }
        return elementList; /* don't forget to return the list of elements to the waiting script */
    }
})(document.getElementsByName); /* this is where we pass the real method as an argument to the function */
As you see, it uses the oldest trick in the book - new Image() - to ping the Unite service with some data. The data is then stored in the folder I told Opera Unite to use when installing the widget.
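The ping can be split into two small pieces, sketched below. The helper names buildLogUrl and ping and the logger.example endpoint are assumptions for this illustration, not part of the original script. The point is that both values must go through encodeURIComponent, because the logged HTML can itself contain "&" and "=".

```javascript
// Hypothetical helper: build the logging URL from the collected data.
function buildLogUrl(endpoint, abuse, href) {
  return endpoint +
    '?data=' + encodeURIComponent(abuse.join(', ')) +
    '&href=' + encodeURIComponent(href);
}

// Hypothetical helper: fire-and-forget GET request via an image load.
// Returns false where no Image constructor exists (outside a browser).
function ping(url) {
  if (typeof Image === 'undefined') return false;
  (new Image()).src = url;
  return true;
}

// Example values, made up for this sketch.
var exampleUrl = buildLogUrl(
  'http://logger.example/logGEBN/',
  ['<a id="x" href="?a=1&b=2">'],
  'http://site.example/page?x=1'
);
ping(exampleUrl);
```

Because the data is escaped, the service on the receiving end sees exactly two query parameters, whatever markup the page contained.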
The only other interesting part is the code that requests the next URL from the URL player - as trivial as doing this from a load event listener:
if(location.hostname!='hr-opera.hallvors.operaunite.com'){
    setTimeout( function(){ location.href='http://hr-opera.hallvors.operaunite.com/urldriver/nexturl?'+Math.random(); }, 500 );
}
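For completeness, here is one way the snippet might sit inside the load event listener. This is a sketch under assumptions: the helper name nextUrlFor is invented here, and the exact listener registration in the original user script is not shown in the post.

```javascript
var DRIVER_HOST = 'hr-opera.hallvors.operaunite.com';

// Hypothetical helper: decide where to go next, or null if we are
// already on the URL player's own pages.
function nextUrlFor(hostname, cacheBuster) {
  if (hostname === DRIVER_HOST) return null;
  // The random suffix keeps the request from being served from cache.
  return 'http://' + DRIVER_HOST + '/urldriver/nexturl?' + cacheBuster;
}

// Only register the listener when running inside a browser.
if (typeof window !== 'undefined' && window.addEventListener) {
  window.addEventListener('load', function () {
    var target = nextUrlFor(location.hostname, Math.random());
    if (target) {
      // Give the page's own scripts half a second to run first.
      setTimeout(function () { location.href = target; }, 500);
    }
  }, false);
}
```

The 500 ms delay is the one from the original snippet. It leaves a short window for scripts on the page to call getElementsByName() before Opera moves on.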
The urldriver service also accepts the "urllist=somefile.txt" query string argument, so a different user script could play URLs from a different file (though not at the same time, since the index of the URL it has reached is not stored per file). That's obviously a bug in my Unite service. Keep in mind that these are ad-hoc throwaway services done in 30 minutes of cutting, pasting and typing last night, so don't expect QA and polish :-p
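A per-file index would be a small change. The sketch below is plain JavaScript with in-memory lists. It does not use the Opera Unite API, and the UrlPlayer name and the file names are hypothetical.

```javascript
// Hypothetical URL player that remembers its position per list file.
function UrlPlayer(lists) {
  this.lists = lists;     // e.g. { 'somefile.txt': ['http://...', ...] }
  this.positions = {};    // next index to serve, keyed by file name
}

// Return the next unvisited URL for the given file, or null when the
// list is exhausted or the file is unknown.
UrlPlayer.prototype.nextUrl = function (file) {
  var urls = Object.prototype.hasOwnProperty.call(this.lists, file) ? this.lists[file] : [];
  var index = Object.prototype.hasOwnProperty.call(this.positions, file) ? this.positions[file] : 0;
  if (index >= urls.length) return null;
  this.positions[file] = index + 1;
  return urls[index];
};

// Two user scripts can now play different files at the same time.
var demoPlayer = new UrlPlayer({
  'a.txt': ['http://a1.example/', 'http://a2.example/'],
  'b.txt': ['http://b1.example/']
});
var firstFromA = demoPlayer.nextUrl('a.txt');
var firstFromB = demoPlayer.nextUrl('b.txt');
```

A real service would also need to persist the positions between requests. That part depends on the hosting environment and is left out here.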
And the results? I left an Opera 10.10 instance to surf on its own in 5 different tabs overnight, which generated this log file listing 6 unique sites and the HTML of the elements returned in response to a getElementsByName() call due to this bug. Analysing 6 out of 5000 URLs manually is certainly doable. I'm still worried about getElementsByName() usage that only happens during user interaction, but now at least we know that 0.12% of the sampled sites (6 of 5000) might be at risk from any change and we have some real code to look at. And automated analysis of websites is a new and interesting use case for User JavaScript.








Hallvord R. M. Steen (hallvors) # Wednesday, December 2, 2009 3:47:46 PM
zoquete # Wednesday, December 2, 2009 4:55:40 PM
lucideer # Wednesday, December 2, 2009 5:46:22 PM
Originally posted by zoquete:
Couldn't you fix the bug in core and add something like this to browser.js:
or is that what the core method already does? (I've never used it)
MyOpera team, please fix this! (fearphage) # Wednesday, December 2, 2009 6:31:40 PM
lucideer # Wednesday, December 2, 2009 6:53:01 PM
Originally posted by fearphage:
Yup, but that doesn't mean there aren't webpages out there with the likes of:
if(isIE || isOpera){ return document.getElementsByName(id); } else { return document.getElementById(id); }
That's a simplified example of course, that probably doesn't exist, but something similarly asinine could be lurking.
serious # Wednesday, December 2, 2009 6:55:58 PM
Charles Schloss (Chas4) # Thursday, December 3, 2009 7:05:16 AM
Hallvord R. M. Steen (hallvors) # Thursday, December 3, 2009 10:37:09 AM
Anyway - having done this quick analysis of 5000 websites, and a manual review of the 6 sites that triggered the broken behaviour, I've found only one site where the different behaviour would make a real difference. (It's the very peculiar innerHTML rewriting script on http://mbd.scout.com/mb.aspx?s=193 - as I don't understand why the site does it, I don't know if the changed behaviour would cause us problems.) After this testing I'm a lot more confident about recommending that we simply fix the bug and stop worrying about IE-compatibility here. User JS for compat research rocks. :-D
(PS: if anyone knows what browser quirk mbd.scout.com tries to work around with that code, please comment.)
leighman (Leighman) # Thursday, December 3, 2009 11:59:20 AM
Aux # Thursday, December 3, 2009 2:18:40 PM
JanGen # Wednesday, December 9, 2009 12:53:36 PM
Then it isn't a bug but a feature.
Silly feature though: how many sites were fixed and how many were broken. Now and in the future.
Don't like the strategy of being bug-compatible, it's short sighted. Stick to proper standards. Opera's userbase is too small to have any marketpower to allow quirks. It will scare off users, designers and developers. Marketeers don't mind if a site is working in Opera or not. Unfortunately.
@hallvors Your blog is good reading, and explains a lot.
@fearphage nice test! Shortest ever?
MyOpera team, please fix this! (fearphage) # Wednesday, December 9, 2009 4:16:57 PM
ouzowtf (ouzoWTF) # Wednesday, December 9, 2009 5:07:37 PM
Originally posted by fearphage:
The good side seems to have a lack of cookies :/
lucideer # Thursday, December 10, 2009 7:01:11 PM
Originally posted by JanGen:
Neither do I, but oddly you'll find the vast majority of users (well commenters) seem to disagree. Incidentally, Opera do seem to stick to proper standards in more cases than anyone else.
Originally posted by JanGen:
Alas, the opposite is true. Opera's userbase is too small to have any marketpower to disallow quirks.
Originally posted by fearphage:
The "good side" (by your measure) has its fair share of non-standard.
JanGen # Friday, December 11, 2009 1:57:57 PM
bug-compatible != standards
@lucideer. I think every designer would and should prefer
IMHO browsersniffing is a bad thing:
Recall the TinyMCE problem: developers made a workaround for a bug in Opera. When Opera fixed the bug, TinyMCE broke, because it was programmed as if Opera is buggy (bug-compatible) and will be buggy FOREVER.
The only reason I can think of for Opera being bug-compatible, is that Opera is too old and UserJS is too young. At the time of Opera 3.6 there was no Firefox, (maybe a wild idea about phoenix) and there were no established W3C standards, there was a browser battle between Netscape and MS.
Why do you think Opera can't be standard compliant while Chrome or Firefox can?
@hallvors, can MAMA also show how much browsersniffing is done in js-files targeted at Opera to get the script working in Opera (like lucideer's example)? Furthermore, is there a MAMA extension to fetch scripts and test output and performance in different JS-engines automatically?
Can Opera be run from the commandline and dump output before and after js-scripts are run?
lucideer # Friday, December 11, 2009 2:43:39 PM
Originally posted by JanGen:
Originally posted by JanGen:
The point I was making is that Firefox, Chrome and Webkit are a lot less standards compatible than Opera. You're interpreting this post backwards, as if it's normal for Opera and other browsers don't have such intricacies. In actual fact it's most unusual for Opera - but it's something ALL browsers do (most others do it a lot more).
Michael A. Puls II (burnout426) # Friday, December 11, 2009 2:59:36 PM
lucideer # Friday, December 11, 2009 3:08:48 PM
Originally posted by burnout426:
Yeah I've noticed that - stuff that was breaking in Gecko, that previously worked the same in webkit and Opera suddenly breaking in webkit too. Thought it was coincidence, didn't realise it was an official goal - do you have a link to where this is stated?
Michael A. Puls II (burnout426) # Friday, December 11, 2009 3:41:19 PM
Originally posted by lucideer:
Well, *official* is perhaps too strong. I don't think they've directly made that statement. But judging by http://trac.webkit.org/browser/trunk, bugs.webkit.org, discussions with developers and the favoring of solving bugs by just matching Gecko, it seems obvious.
Further, Gecko's table layout algorithm is more advanced than Trident's and Presto's. In situations where the CSS spec leaves things undefined, Gecko does what you would expect while Opera and IE do weird things. Guess what other engine does like Gecko? (I'll see if I can remember an example. It has to do with % widths and tables where Presto will show nothing and Gecko will invent a width to base the percentage on.)
In short, it's no coincidence that you often run into "FF and Safari do the same" when you're testing some Opera bug.
lucideer # Friday, December 11, 2009 3:59:49 PM
Originally posted by burnout426:
I wouldn't really call that being bug-compatible though - more just being compatible in absence of a strict guideline.
Michael A. Puls II (burnout426) # Friday, December 11, 2009 6:23:43 PM
Originally posted by lucideer:
Right. I was just carrying on about how they always seem to favor what Gecko does and that came to mind.
MyOpera team, please fix this! (fearphage) # Friday, December 11, 2009 7:12:31 PM
Originally posted by lucideer:
So does Opera. Was there some point there? The good side is aware of and tries to adhere to standards was my point. IE does not.
Originally posted by JanGen:
Because we don't have the market share to command that people test in/make sites work in Opera. Therefore Opera has to be flexible. So if webkit and gecko have an unspoken/unwritten consensus to handle some piece of code in some way, Opera has to do it that way as well or we (the end users) get broken pages. Opera doesn't have the marketshare to lead in this respect. They must follow the trends (bugs) if they want end users to be able to use sites.
Originally posted by JanGen:
Opera is standards compliant. But when there is a bug in Firefox that the people making webpages have tested against and expect, Opera either has to copy that bug or fix it with userjs. Some of them are easier than others.
Originally posted by lucideer:
This is simply not true. Webkit and Opera used to be roughly even in terms of sheer numbers of standards/tech supported. However Webkit (and Gecko) have thoroughly embraced html5 already so there is a long list of new hotness that Opera doesn't support yet.
lucideer # Friday, December 11, 2009 7:22:20 PM
Originally posted by fearphage:
html5≠Standard, as "in use"≠"finalised" (not that the vast majority of html5 is even in general common use anyway).
But I'm sure you knew I was going to say that, as you and I have had this discussion before. Let's not go off on an off-topic to-and-fro about it.
Hallvord R. M. Steen (hallvors) # Saturday, December 12, 2009 9:53:19 PM
At the moment MAMA is simply a search engine - it indexes lots of information like script file names, and knows a pretty large set of predefined strings like "getElementsByName" which it looks for. Then it builds a big database that can tell us for example what URLs loaded a script file that contained that string.
Of course we also look for evidence of browser sniffing - navigator.userAgent and siblings - but MAMA can't as-is do anything more advanced with that information. We have discussed enhancing it or writing a companion service to run the code and gather statistics, but I certainly don't know what questions I'd be asking, so at this point in time a user js approach, figuring out what I need to know on an ad-hoc basis, is OK.
Internal builds can be run with lots of command line options for testing, and you can always log data by setting the "error log enabled" pref.
Daniel15 (daniel15) # Friday, April 16, 2010 9:39:14 AM