
miscoded

the web is a hack

STICKY POST

Introduction

My journal is Opera-related and technical. It will cover the main obstacles we come across when we use Opera on the Web as it is - the standard violations, the browser incompatibilities, the sniffers and faulty scripts. That is the whole mess a poor browser has to make sense of and believe me, Opera is doing a brilliant job.

Tracking down finnair.com's missing i


I guess Finnair is the household's favourite airline. I don't think I've used any other carrier during the last three years or so - we don't travel that much, so we're not talking many flights overall. However, it was rather annoying to find its website broken last time I wanted to look up some prices:

[Screenshot: Finnair site auto-complete menu broken, error in console]

After selecting a local site, the auto-complete menu for choosing departure and destination cities never populates. What's worse: the site doesn't accept anything you type in by hand! If you haven't made a menu selection, you get an error message! I'm sure that does wonders for their overall accessibility and Section 508 compliance..

The error message complains that they refer to a variable "i" that doesn't exist. Somewhat wrapped, the code looks like this:

a.Autocompleter.Cache=function(c){
var f={};
var d=0;
function h(k,j){
if(!c.matchCase){
k=k.toLowerCase();
}
if(!k.startsWith(j)&&k.indexOf(" "+j)===-1&&k.indexOf("("+j)===-1){
return false;
}
return i==0||c.matchContains;
}

The problem is the reference to i in the return statement. There is no variable "i" defined nearby, indeed not in the entire script file. That "i==0" looks like some dead code that isn't meant to be there anymore. But it works in other browsers, no?

If I load the page in Firefox and type javascript:alert(window.i) into the address bar it says "1", so in Firefox the site does somewhere create a global variable named i. The question is where?

Firebug can't - as far as I know - break when a variable is initialized or changed. As always, Fiddler comes to the rescue - setting an "HTTP breakpoint" after response and re-loading the site in Firefox lets me add some simple debug code:
[Screenshot: Fiddler screen in breakpoint mode, debug code highlighted]
window.__defineSetter__ ('i', function(){ try{ undefined() ; }catch(e){ console.log(e) ; } })
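
If you don't have Fiddler at hand, the same kind of trap can be sketched with the standard Object.defineProperty instead of the non-standard __defineSetter__. This is only an illustration under my own assumptions: writes, storedValue and leakyLoop are made-up names, and the loop is a stand-in for whatever site code assigns to i.

```javascript
// Sketch: trap every write to a global named "i" and record a stack trace.
var writes = [];     // hypothetical log of captured stack traces
var storedValue;     // backing store so that reads of i keep working

Object.defineProperty(globalThis, 'i', {
  configurable: true,
  get: function () { return storedValue; },
  set: function (value) {
    storedValue = value;
    // An Error object carries the call stack of whoever did the assignment.
    writes.push(String(new Error('global i set to ' + value).stack));
  }
});

// Stand-in for the site's code: a loop that forgot "var".
// It is built with the Function constructor so it runs in sloppy mode
// no matter how this snippet itself is loaded.
var leakyLoop = new Function('for (i = 0; i < 2; i++) {}');
leakyLoop();

console.log(writes.length + ' writes to the global i were recorded');
```

Each entry in writes is a stack trace, which is the same information the console.log(e) in the one-liner above gives you.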

When I click Fiddler's "run to completion" button, an error message appears in Firebug's console pointing to this function:

function isFirefoxWMPPluginInstalled(){
var plugs=navigator.plugins;
for(i=0;
i<plugs.length;
i++){

which given its name is naturally called after some browser sniffing, here:
type:$.browser.mozilla&&isFirefoxWMPPluginInstalled()?"application/x-ms-wmp":"application/x-mplayer2"


So this works in other browsers by pure luck. JavaScript's scoping rules are such that assigning to a variable you never declared with the "var" keyword creates a global, and Fnnar's code contains numerous loops that use "i" as a counter without declaring it. If such a loop happens to run before you try booking, the site will work for you. As if we needed any more evidence that JavaScript scoping rules suck..
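
Here is a minimal sketch of that failure mode. None of it is Finnair's code: countPlugins and matches are made-up stand-ins for the plugin loop and the autocompleter's return statement.

```javascript
// Both functions are built with the Function constructor so that they run in
// sloppy (non-strict) mode regardless of how this snippet itself is loaded.
var countPlugins = new Function('plugins',
  'for (i = 0; i < plugins.length; i++) {}' +   // no "var": i becomes a global
  'return plugins.length;');

var matches = new Function('return i == 0;');    // reads the global i

var firstAttempt;
try {
  matches();
  firstAttempt = 'ok';
} catch (e) {
  firstAttempt = e.name;   // nothing has created i yet, so this throws
}

countPlugins(['some plugin']);    // the loop leaks i into the global scope
var secondAttempt = matches();    // no exception this time
var leaked = globalThis.i;        // the value the loop left behind

console.log(firstAttempt, secondAttempt, leaked);
```

Whether the second call throws depends entirely on whether some unrelated loop has run first, which is exactly the kind of luck the site relies on.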

If we're going to site patch this error in browser.js, the patch would be simply var i;. At 6 characters, I'm fairly sure it would be the shortest site patch ever. Meanwhile, we'll contact them and hope Fnnar will get their "i"s back in order.

And I sure hope they deploy better software for their autopilot than their autocomplete...

New adventures in compatibility testing


I'm having some fun trying to figure out how sites use document.getElementsByName(), and thought some of you might be interested in the testing approach.

The bug I'm investigating is a small and ugly one hiding in the document.getElementsByName() implementation - getElementsByName('someID') will find an element with id="someID".

This is of course bad behaviour. That method has nothing to do with IDs and should find elements by name only.
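
To make the difference concrete, here is a DOM-free sketch in which plain objects stand in for elements. The function names byNameStrict and byNameBugCompatible and the sample data are mine, purely for illustration.

```javascript
// Plain objects standing in for DOM elements.
var elements = [
  { tag: 'input', id: 'email',  name: 'email' },
  { tag: 'div',   id: 'someID', name: null },
  { tag: 'input', id: null,     name: 'someID' }
];

// Lookup by the name attribute only.
function byNameStrict(list, name) {
  return list.filter(function (el) { return el.name === name; });
}

// The bug-compatible behaviour: a matching id also counts.
function byNameBugCompatible(list, name) {
  return list.filter(function (el) {
    return el.name === name || el.id === name;
  });
}

var strict = byNameStrict(elements, 'someID');
var compatible = byNameBugCompatible(elements, 'someID');
console.log(strict.length, compatible.length);
```

A script that expects the div in the result would only work with the second function, and those are the sites I'm worried about.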

The good news is that it's trivial to fix. The bad news is that it's there for a reason, and the reason is called Internet Explorer. We've been bug-compatible on purpose and while we'd like to remove the bug we have no idea how many sites will break if we do!

So, I'd like an answer to questions like these:
  • How many sites use getElementsByName() to find elements with an ID?
  • Do these sites break if we fix the bug?
  • Do they have alternate code paths for browsers doing it right? If yes, how do they figure out what code to use?

Tools at our disposal: the MAMA web code search engine (an internal Opera project), User JavaScript, and two ad-hoc Opera Unite services.

MAMA tracks sites that might be using document.getElementsByName(). It knows about roughly 45 000 sites where it has seen the string "getElementsByName" in script source code, and it generously provides 5000 random ones in a text file on my request. Naturally, MAMA only does static analysis of the scripts, so it can't tell whether the method is actually called or what it was used for.

That information is a piece of cake to get with User JavaScript. A trivial custom script, trackGEBNabuse.js, overwrites the getElementsByName() method with one that will do a bit of debugging and logging on our behalf. And I'm playing with Opera Unite for the first time, with one logging service and one URL player that keeps track of which of the 5000 URLs were already visited and sends Opera to the next one.

(Opera Unite actually rocks! It's fun to write backend-type logic in JavaScript rather than PHP, and it's less hassle while developing to keep all the information, URL lists, log files and scripts locally on the hard drive. I've been undecided about Unite, not sure if it was more important than all the other things we should be spending time on - now I see it's maturing and making itself useful. Nice.)

To walk you through the main logic of things - here's the user JS that overwrites the native method to do logging - commented:

(function(gebn){/* "gebn" is a reference to the actual, native function */
    document.getElementsByName=function(name){ /* overwrite the real one */
        var elementList=gebn.apply(this, arguments); /* call the native function, record the list it returns */
        /* we want to know if anything in the elementList is there due to a matching id rather than a matching name */
        var abuse=[];
        for(var i=0,elm;elm=elementList[i];i++){ /* go through all returned elements */
            if( elm.getAttribute('name')!==name )abuse.push(elm.outerHTML); /* we found one that's probably in the list because of an ID attribute! */
        }
        if(abuse.length>0){
            /* log errors to some server... */
            (new Image()).src='http://hr-opera.hallvors.operaunite.com/logger/logGEBN/?data='+encodeURIComponent(abuse.join(', '))+'&href='+encodeURIComponent(location.href);
        }
        return elementList;/* don't forget to return the list of elements to the waiting script */
    }
})(document.getElementsByName); /* this is where we pass the real method as an argument to the function */

As you see, it uses the oldest trick in the book - new Image() - to ping the Unite service with some data. The data is then stored in the folder I told Opera Unite to use when installing the widget.
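
If you want to try the wrapping pattern outside a browser, this sketch does the same thing against a fake document object. fakeDocument and reports are assumptions of mine: the first stands in for a document with the id-matching bug, the second for the logging service.

```javascript
// Hypothetical stand-in for a document whose lookup has the id-matching bug.
var fakeDocument = {
  elements: [
    { id: 'menu', name: null },
    { id: null,   name: 'menu' }
  ],
  getElementsByName: function (name) {
    return this.elements.filter(function (el) {
      return el.name === name || el.id === name;
    });
  }
};

var reports = [];   // stands in for the remote logging service

(function (native) {              // "native" is the original method
  fakeDocument.getElementsByName = function (name) {
    var found = native.apply(this, arguments);
    // Anything whose name does not match must have been found via its id.
    var abuse = found.filter(function (el) { return el.name !== name; });
    if (abuse.length > 0) {
      reports.push({ name: name, count: abuse.length });
    }
    return found;                 // callers still get the unmodified result
  };
})(fakeDocument.getElementsByName);

var result = fakeDocument.getElementsByName('menu');
console.log(result.length, reports);
```

The important design point is that the wrapper only observes: the calling script gets exactly what the native method returned.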

The only other interesting part is the code that requests the next URL from the URL player - as trivial as doing this from a load event listener:
if(location.hostname!='hr-opera.hallvors.operaunite.com')
setTimeout( function(){ location.href='http://hr-opera.hallvors.operaunite.com/urldriver/nexturl?'+Math.random(); }, 500 );

The urldriver service also accepts a "urllist=somefile.txt" query string argument, so a different user script could play URLs from a different file (though not at the same time, since the index of the URL reached is not stored per file). That's obviously a bug in my Unite service. Keep in mind that these are ad-hoc throwaway services done in 30 minutes of cutting, pasting and typing last night, so don't expect QA and polish :-p
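
For the curious, fixing that bug would mostly be a matter of keeping one position per list. This is not the Unite service's code, just a sketch of the idea with in-memory lists; createUrlPlayer and the sample file names are hypothetical.

```javascript
// Sketch of a URL player that remembers its position separately per list.
function createUrlPlayer(lists) {
  var positions = {};                    // list name -> index of next URL
  return function nextUrl(listName) {
    var list = lists[listName] || [];
    var index = positions[listName] || 0;
    if (index >= list.length) {
      return null;                       // this list has been played to the end
    }
    positions[listName] = index + 1;
    return list[index];
  };
}

var nextUrl = createUrlPlayer({
  'a.txt': ['http://a.example/1', 'http://a.example/2'],
  'b.txt': ['http://b.example/1']
});

var first = nextUrl('a.txt');
var other = nextUrl('b.txt');    // does not disturb the position in a.txt
var second = nextUrl('a.txt');
var done = nextUrl('b.txt');
console.log(first, other, second, done);
```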

And the results? Left an Opera 10.10 instance to surf on its own in 5 different tabs overnight, which generated this log file listing 6 unique sites and the HTML of the elements returned in response to a getElementsByName() call due to this bug. Analysing 6 out of 5000 URLs manually is certainly doable :smile:. I'm still worried about getElementsByName() usage that only happens during user interaction, but now at least we know that 0.12% of the sites out there might be at risk from any change and we have some real code to look at. And automated analysis of websites is a new and interesting use case for User JavaScript.

Onestat.com's browser sniffer older than my son


Seems the boss has been browsing web stats again - I came across a bug report from him saying a DHTML menu on onestat.com only works once. (Load page, click any "demo" link at the bottom and try the menu on the left.)

I'm not paying that much attention to stats myself, being too busy doing the work that will hopefully smooth out compatibility problems and make it possible to grow our usage share. It's exactly the type of brokenness Jon found on onestat.com I fear the most - a bit hidden away, subtle, annoying when the user needs that page to work but perhaps not disastrous enough to notify us of. Too many of these issues, and we've lost a user! (Unless, of course, the user happens to be our CEO :smile: - luckily he's an active surfer who comes across more compat problems than any average user and eagerly reports them :wink:).

Now, deep inside the Onestat scripts is some code to handle browser differences. The script assumes that a browser below a certain level of DOM support can't be expected to create menus after the document loaded. That's of course not a bad assumption to make - until you see the sniffing they use to determine whether you have sufficient DOM support. Here is the relevant part:
if(kh.indexOf("Opera/7")>-1||kh.indexOf("Opera 7")>-1)return "Op7"; 
. 
. 
else if(kh.indexOf("Opera")>-1)return "Default";


Check the calendar, sir: we're in late 2009. Opera 7 - the only Opera version this script thinks is capable of opening the menu a second time - was released in January 2003. That's a 6 year old browser sniffer! I hope the script that generates their statistics is a bit more up to date!
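
To see what that means for a current Opera, here is a small sketch that runs just the two quoted branches against two user agent strings. The function name sniff, the "other" fallback and the sample strings are mine (the strings are typical of Opera 7.54 and Opera 10.10), and the branches elided in the excerpt above are left out.

```javascript
// Only the two branches quoted in the post; everything else is omitted.
function sniff(kh) {
  if (kh.indexOf("Opera/7") > -1 || kh.indexOf("Opera 7") > -1) return "Op7";
  /* ...the original has more branches here, not shown in the excerpt... */
  else if (kh.indexOf("Opera") > -1) return "Default";
  return "other";   // hypothetical fallback, only so the sketch always returns
}

var opera7 = sniff("Opera/7.54 (Windows NT 5.1; U) [en]");
var opera10 = sniff("Opera/9.80 (Windows NT 5.1; U; en) Presto/2.2.15 Version/10.10");
console.log(opera7, opera10);
```

Every Opera released after version 7 lands in the "Default" bucket, no matter how much DOM support it has gained since.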

And it's more than a little ironic that a web statistics company contributes to Opera's low usage share by running such old and broken code on their website. :frown:

browser.js update: eBay, Sun webmail, Salesforce

A new browser.js file is out, and like last time I'll post a changeblog with some background information. (I might spin these posts off to a separate blog at some point but for now it's here.)

First, some headlines:
  • Sun System Messenger Express webmail fix.
  • eBay.fr will no longer hang
  • Finding a doctor in South Carolina becomes easier, thanks to fearphage
  • Removed patch for maps.live.com - reborn at Bing maps
  • Conflict between Salesforce and WebForms2


Sun System Messenger Express webmail fix.



Opera 10 aligns its policy for setting document.domain with the other browsers and requires both pages to set document.domain before allowing communication. (In other words, if www.example.com wants to talk to example.com, both of them must set document.domain to 'example.com'. In the old implementation, only the content from www.example.com would have to do so.)
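
As a rough model of the two policies (a simplification of mine that ignores protocol and port, not browser source code), a page can be described by its host name and by the value it assigned to document.domain, if any:

```javascript
// A page is { host, domainSet }: domainSet is the value the page assigned to
// document.domain, or null if it never assigned one.

// New policy: both sides must have opted in with the same value.
function newPolicyAllows(a, b) {
  if (a.domainSet === null && b.domainSet === null) {
    return a.host === b.host;            // plain same-host access
  }
  return a.domainSet !== null && a.domainSet === b.domainSet;
}

// Old policy as described above: comparing effective domains was enough.
function oldPolicyAllows(a, b) {
  return (a.domainSet || a.host) === (b.domainSet || b.host);
}

var www = { host: 'www.example.com', domainSet: 'example.com' };
var bare = { host: 'example.com', domainSet: null };
var bareOptedIn = { host: 'example.com', domainSet: 'example.com' };

var oldResult = oldPolicyAllows(www, bare);
var newResult = newPolicyAllows(www, bare);
var newResultBothSet = newPolicyAllows(www, bareOptedIn);
console.log(oldResult, newResult, newResultBothSet);
```

A page that skips the assignment because of a browser sniffer ends up in the position of bare here, and that is where things start breaking.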

Normally, aligning with the other browsers shouldn't cause problems - but unfortunately, I've noticed that certain sites use browser sniffing before setting document.domain! I have no idea why.

For example, Facebook uses this oddity to avoid setting document.domain in certain older Firefox versions:

if (navigator && navigator.userAgent && document.domain.toLowerCase().match(/(^|\.)facebook\..*/) && !(parseInt((/Gecko\/([0-9]+)/.exec(navigator.userAgent) || []).pop(), 10) <= 20060508)) { document.domain = window.location.hostname.replace(/^.*(facebook\..*)$/i, '$1'); }


I wish I knew why it's such a bad idea to set document.domain in Firefox-versions released before May 8, 2006.. :sherlock:

Similarly, a webmail suite from Sun which is used by big universities and such contains this sniffing:

var agt=navigator.userAgent.toLowerCase();

var is_nav  = ((agt.indexOf('mozilla')!=-1) && (agt.indexOf('spoofer')==-1)
            && (agt.indexOf('compatible') == -1) && (agt.indexOf('opera')==-1)
            && (agt.indexOf('webtv')==-1) && (agt.indexOf('hotjava')==-1));
var is_gecko = (agt.indexOf('gecko') != -1);

if(is_nav || is_gecko) document.domain = document.domain
//document.domain = document.domain


and they run right into the new security policy when the sniffing means they avoid setting document.domain and all sorts of things break.

To solve this, each time we're about to run a script named setdomain.js or load a file named sample_lr.html - on any website - Opera will first append "Gecko" to its navigator.userAgent string. Oh, the weird and whacky things you need to do for compatibility.
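
You can check the effect of that patch on the sniffer quoted above with a few lines. The wrapper function wouldSetDomain and the sample user agent string are mine, and appending with a leading space is an assumption about how the patched string looks.

```javascript
// The sniffing from the Sun webmail script, wrapped in a function so it can
// be fed different user agent strings.
function wouldSetDomain(userAgent) {
  var agt = userAgent.toLowerCase();
  var is_nav = ((agt.indexOf('mozilla') != -1) && (agt.indexOf('spoofer') == -1)
             && (agt.indexOf('compatible') == -1) && (agt.indexOf('opera') == -1)
             && (agt.indexOf('webtv') == -1) && (agt.indexOf('hotjava') == -1));
  var is_gecko = (agt.indexOf('gecko') != -1);
  return is_nav || is_gecko;
}

// A typical Opera 10 user agent string (sample value, not taken from the post).
var operaUA = 'Opera/9.80 (Windows NT 5.1; U; en) Presto/2.2.15 Version/10.00';

var unpatched = wouldSetDomain(operaUA);
var patched = wouldSetDomain(operaUA + ' Gecko');
console.log(unpatched, patched);
```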

eBay.fr will no longer hang



eBay.fr uses the SELECT.remove() method, but with element nodes instead of numbers as arguments. Their spec violation, our problem. This will be fixed in core, to allow using remove() for OPTION nodes like other browsers do - meanwhile, browser.js will sort it out for eBay.
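
The general idea of such a fix can be sketched without a browser. This is not the actual browser.js patch; FakeSelect is a made-up stand-in for a SELECT element whose remove() only understands numeric indexes.

```javascript
// Hypothetical stand-in for a SELECT element with an index-only remove().
function FakeSelect(options) {
  this.options = options.slice();
}
FakeSelect.prototype.remove = function (index) {
  if (typeof index === 'number' && index >= 0 && index < this.options.length) {
    this.options.splice(index, 1);
  }
};

// The shim: translate an option node into its index, then call the original.
(function (originalRemove) {
  FakeSelect.prototype.remove = function (arg) {
    if (typeof arg !== 'number') {
      arg = this.options.indexOf(arg);   // -1 if the node is not an option here
    }
    return originalRemove.call(this, arg);
  };
})(FakeSelect.prototype.remove);

var paris = { text: 'Paris' };
var lyon = { text: 'Lyon' };
var select = new FakeSelect([paris, lyon]);

select.remove(lyon);   // node argument, as the site does it
select.remove(0);      // numeric argument still works
console.log(select.options.length);
```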

Finding a doctor in South Carolina becomes easier


Do I have any readers in South Carolina? Thanks to fearphage's neat emulation of IE's bugfeature which lets you find a named form element with document.getElementById(), you can now use Blue Cross Doctor and Hospital Finder even though it relies on IE's violations of the DOM standard. The patch has been waiting while we were trying to get through to someone at Blue Cross Blue Shield who might be able to fix it. Time is up - and big thanks to fearphage!

Removed patch for maps.live.com



Now maps.live is Bing Maps. No point in applying patches for the old hostname anymore.

Conflict between Salesforce and WebForms2 / HTML5



A report wizard inside Salesforce.com fails in Opera because of our support for the data attribute from WebForms2.

When they do

<select id="typeSelector" name="type" onchange="fillSelectFromArray(document.report_select.rep, ((this.selectedIndex == -1) ? null : data[document.getElementById('typeSelector').selectedIndex]));" title="Report Type Category"> 


"data" doesn't refer to the global variable data but to the data property on the SELECT object due to the scope of the event handler.

This attribute has since been removed from HTML5 so Opera will drop it at some point. Meanwhile, another stopgap site patch makes Salesforce work.
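
The scope problem can be imitated outside a browser with nested "with" blocks, which is roughly how an inline event handler sees the element's properties before the global ones. This is a simplified model under my own assumptions: the real scope chain also includes the form and the document, and the empty string used for the element's data property is just a placeholder value.

```javascript
// The handler body is built with the Function constructor because "with" is
// only allowed in sloppy mode. Inner scope: the element. Outer scope: globals.
var runHandler = new Function('element', 'globals',
  'with (globals) { with (element) { return data; } }');

var globals = { data: ['first report type', 'second report type'] };

var plainSelect = { selectedIndex: 0 };                 // no data property
var webForms2Select = { selectedIndex: 0, data: '' };   // has a data property

var seenWithout = runHandler(plainSelect, globals);     // finds the global array
var seenWith = runHandler(webForms2Select, globals);    // finds the element's own
console.log(seenWithout, JSON.stringify(seenWith));
```

Writing window.data in the handler would sidestep the lookup through the element, which is the usual way for a site to protect itself against new properties turning up on elements.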

The 9sky.com fix in the previous edition was also about a problem caused by HTML5. Standards development and experimental implementations are obviously a major compatibility risk. If we want to improve the web's underlying technologies there isn't any other way forward, though.

browser.js updates: Hotmail, Tuenti, AOL Webmail


Quick overview of browser.js updates during the last couple of weeks.


Most expensive javascript ever?


I've wanted to tell this story for a while, and I don't think I'm spilling any beans or disclosing any sensitive information at this point.

So, a while ago Opera Software needed more servers. Not just a few servers either - we were planning Opera Mini's growth, implementing Opera Link, and My Opera was also growing quickly. We predicted crazy server load increases for the foreseeable future (and man, were we right!)

Clearly we needed to make a massive investment on the server capacity front (basically buying these shiny things and then some.)

Management put a hefty check on the table - I'm sure our beloved sysadmins felt like kids before Christmas - and salivating sales people from major hardware vendors grabbed our requirements spec, dived into their CRMs and crunched their spreadsheets. They emerged with offers and sample servers shipped all the way to Oslo for our testing pleasure.

However, one of the world's biggest hardware vendors - whose name every single reader will be familiar with, and whose hardware a good share of you will be using right now - apparently didn't do their homework. When Opera's sysadmin booted up the server to test its web-based administration interface, they came across a single JavaScript statement that managed to piss off everyone up to and including the CTO.

This single statement, apparently written by some sub-contractor they had outsourced admin interface programming to, cost them millions of NOK in lost sales.

And the code they sent all the way to Oslo for testing? Here's an extract:

if (is.opera)
{
window.location.href="config/error.htm";
}

Facebook monitors your alert() usage


If you use a bookmarklet on Facebook and it calls window.alert(), it doesn't quite do what you expect. They've re-defined the entire alert() method - it will pop up a box, but it will also behind the scenes send what you tried to pop up to the server!?! Look at Facebook's alert code (shown in an appropriate setting, of course):



Since I routinely use alert() for debugging, should I be paranoid now?
I really wonder what this feature is intended for and whether they actually harvest this data and use it for anything.