Skip navigation.

miscoded

the web is a hack

Posts tagged with "yahoo"

Y!Mail: getting somewhere

,

Short status update: They've been busy at Yahoo moving accounts from Yahoo Mail Beta to Yahoo Mail 1.0 - all but one of my usual test accounts have suddenly lost the "Beta" text from the icons. Today's weekly build isn't bad either, after a couple of weeks where Y!Mail seemed to hit every crash bug in the internal builds, it's finally stabilised and runs pretty smoothly.

On Monday (or possibly even earlier) a new browser.js will go public for 9.5, and it should resolve most of the issues that currently prevent Y!Mail from loading in the new weekly build. Sorry to keep you waiting but I was working later than my server admin colleague tonight..

This time, to get Y!Mail working I had to remove several patches. Since our shiny, improved getters and setters support enables the Y!Mail compatibility layers to do their job, patching should hardly be required anymore...

Actually, the current version of browser.js messes it all up and gets in the way - several of the errors you see if you try to load Y!Mail right now is from browser.js code where the old patches are incompatible with the new getters and setters stuff!

This is great news. Removing patches is good. And I've never seen Y!Mail work better with Opera than Y!Mail 1.0 and 9.5 9594 with the new browser.js - even chat works nicely! You Yahoo mail users should have something to look forward to :smile:

the backtrace of an Y!Mail debug session

, , ,

Advance warning: this blog post dives into the most complex bits of Yahoo Mail beta. Along the road I'll explain a few tools, tips and tricks I rely on when analysing problems. I'm afraid you may not understand much of this post without prior knowledge of JavaScript. Feel free to read it anyway, but you've been warned!



The other day I claimed to know why Y!Mail was unusable in Kestrel, but even as I posted that I had a feeling I didn't have the full story yet. Yes, I had found some incompatibility between Opera and Firefox that DID trip things up and made confusing "Generic error" messages appear. I even found the Mozilla bug report saying Moz's behaviour was a bug (and by extension that Opera's was correct) - but still, this seemed more like a symptom than an actual cause. Because even if Firefox didn't throw exceptions, how could Y!Mail get away with sending us so much invalid XML in the first place? Some of that XML contained information that WAS available inside Y!Mail when running in Firefox. Was Y!Mail's XML invalid only in Opera? If so, why?

So next day I knew understanding this issue was going to be top priority, no matter how much I'd like to procrastinate with simpler bugs before diving into Y!Mail source code.



For a background, I'll recap the initial debugging from the previous day. The problem was the "Generic error" message on the screenshot above on the right, which appeared on Y!Mail load and caused the whole Y!Mail to be unusable.

First step with code like this is always making the all-on-one-line source more readable with one of the wrap scripts available. Luckily it is very simple to make Opera use the wrapped source instead of the original: open the script's URL in a new tab, view source, copy it and do the wrapping, paste it back to the editor, save and use Opera's "Reload from cache" setting.

The reload from cache menu entry may not look very interesting, but it is currently your best friend when debugging problems in Opera. It allows you to insert any sort of debug code from your source editor and study the output almost instantly. That is very useful. (It does have bugs though, IFRAME contents is usually re-validated from server, and scripts may also be - I guess some part of Opera respects expiry and cache-control headers even for "reload from cache". It usually works but I have to keep an eye on the HTTP requests just in case.)

However, before you can insert any debug code, you have to find the part of the JavaScript that contains the problem. With something as complex as Yahoo Mail, that's not easy. The error message doesn't make it easier: just "Generic error" without a backtrace - hey Opera, we're old friends but I know your shortcomings in the debugging department better than any griping, Firebug-spoiled web developer. That error is so unbelievably unhelpful, did you know?

For searching through code and many other debug tasks, I highly recommend the HTTP debugger Microsoft Fiddler. If you do any sort of web application or web browser development, this tool is a must-have. Below is a screenshot of Fiddler showing parts of the traffic that occurs during a Y!Mail session, the selected session is from an XMLHttpRequest and you'll see how Fiddler shows you the markup that was sent to the server (top right pane) and the markup that was returned (bottom right):



As you see, Fiddler keeps a record of all the code a website loads or sends - and you can search through all that code at once! To attack a problem like this I might try to search for a part of the error message:


The response seems promising:


The files that contained this message are now highlighted, so I can pick one of them and use the inline find field in the source pane to find the actual location:


Here's a match:

function launchWindowEventHandler(id, msg)
{

switch( id )
{
case "login_error":
alert( "Yahoo! Mail Beta experienced a login error: " + msg);
cancelLaunch();
break;


To see where this function is called from, a quick cheat might be to edit the source and call a non-existing function just inside it to check the stack trace in the error console. Then work backwards..

Actually, the patch itself came to the rescue and helped me find the error. To quickly add debug code to the patch I just used Opera's opera:config to turn off browser.js and make Opera read the browser.js file as a user JavaScript instead. When the signature check no longer applies I can modify the file:


User JavaScript can be really helpful when debugging. They partly make up the lack of a proper JavaScript debugger: you can replace specific functions with your own modified versions and do a lot of cool, custom stuff. In this case I happened to base my debug user script on browser.js.

Playing around with the "loadXML" section of the patch I noticed that DOMParser sometimes threw "Generic" errors: aha, it's Opera's way of saying "some very specific error happened when parsing that string you pretended was XML, but I can't be bothered telling you which one".

This much I knew when resuming the analysis the next day. Some searching gave me an overview of the places where Yahoo used the DOMParser to parse XML. What next?

In my experience, you can find shortcuts and resolve issues faster if you start writing tests when you start getting an overview of what a script is doing. When I think I know what's going on, I write a test that covers what I think happens - if I'm right, I've saved time by not exploring the full details. If I'm wrong, I'll have to dig around more. At this point I felt it was test case time..

I first decided to get a copy of all the "XML" Y!Mail tried parsing and compare Opera's and Firefox's output. Again I added debug code to the patch to harvest all the XML (note the postError call):

// 194334, Y!Mail faking IE's loadXML method
HTMLElement.prototype.loadXML = function (xml) { opera.postError(xml);

I copied all that XML to a single test file and checked each parse operation for three types of results: success, exception or error document.
Here is the first test case. (Slightly modified to avoid spam: I've replaced all E-mail addresses in the markup with a bogus one).

It shows that Firefox and Opera are exactly equivalent. Firefox creates "parseerror" documents where Opera throws. They agree entirely that some of the XML Opera tries to parse is invalid. So that experiement gave me no new information :frown:

Where exactly is the error thrown? Time to add some try...catch and lots of debug information to every place DOMParser is used. For example here:
O.__defineGetter_XMLDocument{
return;

}
if(this._XMLDocument==null){
try{
this._XMLDocument=(new DOMParser()).parseFromString(this.innerHTML,"text/xml");
}catch(e){ 

opera.postError( 'died in XMLDocument getter' ); 

}
}
return this._XMLDocument;

}
);


There is a pretty interesting observation to be made from the code above: what they try to parse here is the innerHTML of an element!. That's strange.. so the element already has an HTML DOM, but they want to serialize it with .innerHTML just to immediately parse it again??

Well.. In my line of work you learn to simply stop expecting things to make sense... Expecting sense from a typical piece of the JavaScript out there is a prejudice that can only mislead you.

What next? Well, let's focus on a single piece of apparently invalid XML. Let's pick the <subjects> one. Opera and Firefox agree that what Y!Mail asked Opera to parse is invalid XML, and it is due to the ampersand that occurs here:
Leather & Tweed
. In valid XML you can't have a sole ampersand. If you need one it must be written &amp;.

So, a theory combining the information we have now: maybe Opera sometimes outputs & and not &amp; when you read innerHTML? That might cause such problems for sure! So, let's whip up another quick test case...

let's see..

Again, theory is invalidated. Firefox and Opera sing perfectly from the same sheet here.

Where does the <subjects> markup come from? That I actually knew from random source browsing, but Fiddler would have found it easily: The localised part (url will likely change) of the Y!Mail libraries contained this code

var gXmlSubjectOMatique='<xml id="subjectOMatique"><subjects><p>Welcome to SPACE X</p><p>Astonishing feats of MENTALISM!</p>...

and tons of other funny subjects you might choose for your E-mails. And - wait - the problematic ampersand is indeed written &amp; here. Even in the source Opera gets. Somewhere that amp turns into an un-escaped &. But where and how?

Back to Fiddler to work out where Y!Mail uses the variable gXmlSubjectOMatique (finally a unique variable name it makes sense to search for - thanks Yahoo). It's here:
function w9(){var O=['<div style="display:none;">',j2(gXmlBranding+gXmlSubjectOMatique+gXmlLocaleSettings)

The w9 function just returns all that markup joined up. Where is it called from?
var O=w9();document.body.insertAdjacentHTML("afterBegin",O);appOnLoadHandler()

Right, so lots of "XML" markup is added to the HTML document with insertAdjacentHTML before Y!Mail calls the appOnLoadHandler which apparently at some point will run through those XML elements, read out their .innerHTML and send it to DOMParser. So, are ampersands inserted with inserAdjacentHTML for SOME reason treated differently in Opera?
No, they aren't. Another quick test case barking up the wrong tree.

Back to the w9 function: if you notice, it actually calls another function on those three strings. A function called j2. Wonder what it does?


Here's another quick trick: rather than searching through the code for a function declaration, if a function looks like it is visible from the global scope I usually use a bookmarklet to quickly check it out. So, back to the Y!Mail window and javascript:alert(j2); - now, what's all that about??

This is very odd.

First it just decides to do nothing if a variable is true.
  if (jo.O)
    {
      return L;
    }

I'll bet that there is some sort of browser sniffing behind that if. This really smells like a workaround for a problem in a specific browser..

The actual work going on in this function is here:

  var D = new RegExp("<xml(.*?)>(.*)");
  for (var O = 0;O < T.length;O++)
    {
      T[O] = T[O].replace(D, "<xml$1><!--$2-->");
    }

and it does.. it does.. eh?? looks like it wraps everything inside the XML element inside a comment tag?!? Let's see if that's right..

javascript:alert(j2( gXmlSubjectOMatique ));.
THERE YOU GO! It outputs a string starting <xml> with innerHTML, it outputs .

This breaks Y!Mail because they take a string of perfectly fine XML, wrap it in HTML comment tags, put it inside an <XML> tag, add it inside the BODY tag with that horrendous IE thingy called insertAdjacentHTML, serialize the DOM of said tags by reading .innerHTML, strip away the comment tags again and send the string to the DOMParser.

I can think of a few simpler, safer ways to parse XML. But apparently the "keep it simple, stupid" principle does not apply when adding compatibility layers on top of existing IE-only web apps like the Yahoo mail team has done with the webmail formerly known as Oddpost. And as usual, nothing is harder than being compatible with the workarounds against other browsers' bugs :frown: .

Kestrel and Y!Mail

, ,


So Kestrel betas are flying out and high on the list of changes is a big, black important bullet point saying "improved site compatibility".

So, I can hear you asking: why is Yahoo mail beta useless in the Kestrel preview? In fact, it is usable in 9.23 but completely broken in 9.5 alpha..

Patience, please. Precisely because some of the new features like getters and setters that are meant to improve compatibility when we finish them are work in progress, Y!Mail is currently rather broken. It won't take us many weekly builds to get things sorted.

There are two big-ish issues, one is that getters and setters don't work when defined on DOM prototype objects and the other is Mozilla's bug 45566 which we've "fixed" in Opera and it turns out Y!Mail really REALLY prefers the broken Firefox implementation. I have a feeling they'll never be able to fix that bug, they'll have to label it a feature in the end.. so we had better be bugfeature compatible. :rolleyes: Naturally, DOMParser is their non-standardised extension so they can do what they want and we have to reverse engineer and obey.

Launching OddPatch 0.1 Beta

, , , ...


Launching in browser.js tomorrow or so: OddPatch beta, solving many of the compatibility issues with the service formerly known as OddPost, now Yahoo!Mail beta.

It's a complex patch sorting out a daunting number of issues on both sides. When things get this complex, it's rarely only "their fault" or "our fault" - the testing has uncovered several bugs in Opera and several mistakes in their JavaScript. Here's a walk-through of the entire patch:

// browser sniffing workaround - walking in through the back door
if( location.href.indexOf( '/dc/system_requirements?browser=blocked' ) >-1){
location.href='/dc/launch?sysreq=ignore';
}


Y!Mail has three modes for browsers: supported, possibly working, or blocked. Opera is blocked, but luckily there is a backdoor that will bypass the sniffing.

The irony here is that when they block us, they are making their work on Opera-compatibility much harder than necessary. If we get access, we'll do our best to make things work: test, find bugs, even decide to support things we haven't supported previously (we'll have selectSingleNode soon because Y!Mail uses it heavily, and their code was also a very important reason why DOM2 Style support was prioritised for Opera 9.0!). Blocking us makes it much harder for us to make their life simpler.

if( top.location.href.indexOf('/dc/launch')>-1 ){ 
// Gecko compatibility library uses defineGetter and defineSetter. We need to fake them.
//* Patch below is required but causes trouble.. 
Object.prototype.__defineGetter__=      function(){}
Object.prototype.__defineSetter__=      function(){}


This is known as "fake it until you make it". We won't have getters and setters anytime soon, but things will work anyway if we pretend we do.

// IEism called loadXML, basically a DOMParser / DOMLS equivalent
// must handle XML fragments without root element!
Element.prototype.loadXML=function(s){  
try{
var d=new DOMParser().parseFromString(s, 'text/xml'); 


This, I think, is a bit of the strange world IE lets you into if you put an XML tag in a page. That tag takes on a life of its own and starts behaving in some contexts like a document, it aquires several methods and properties - and though I at first thought I could simply fake it with a DOMParser I had to think again because...
}catch(e){ // DOMParser could not parse fragment, probably because of missing single root element. Workaround time..
var d=document.implementation.createDocument('', this.tagName, null), el=d.createElement('el');
//?? why did I use this.tagName there?
el.innerHTML=s;
for(var i=0 ; i<el.childNodes.length;i++){
d.appendChild(el.childNodes[i].cloneNode(true));
}
}


Yes, they are not always playing with well-formed XML fragments. Oh well, we'll pull out good-old-tagsoup-parsing .innerHTML and eat their strings anyway. Then we move on and fill in some other required bits and pieces of IE's XML DOM. Right now I'm not sure if all the stuff in this block is required, but there it is.. I'll be the first to admit that both code and comments are evidence of the somewhat chaotic process of late-night patching..

// faking IE-style XML element DOM - separate documents with documentElement within the main doc's DOM
this.documentElement=d.documentElement||d.firstChild; 
//?? firstChild is probably leftover from earlier versions using documentFragment?
this.XMLDocument=d;
// address book loading checks .parseError.errorCode
this.XMLDocument.parseError={ 'errorCode':0 };
return d;
}


And then is a peculiar mystery, who would ever need a function called isSameNode when == would presumably do the job?

// some method called isSameNode is called. Not sure where it comes from but simple enough to fake..
Element.prototype.isSameNode = function(n){
return n===this;
}


Here we go, more delicacies from IE's internals: the handy XPath method selectSingleNode. I have quite some reservations against loadXML and the other XML DOM stuff above, but selectSingleNode should be written into a standard as soon as possible because document.evaluate needs too many arguments for lazy JS coders and the returned object is too fiddly too.

// selectSingleNode support
var realSelectSingleNode=function( expr, resolver ){
var result=(this.ownerDocument?this.ownerDocument:this).evaluate( expr+'[1]', this, resolver, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,null );
return ( result && result.snapshotLength ) ? result.snapshotItem(0) : null;
}
Node.prototype.selectSingleNode = function (expr, resolver)
{ 
  if (!resolver)
    if (this.nodeType == Node.DOCUMENT_NODE){
      resolver = document.createNSResolver (this.documentElement);
    }else if(this.nodeType == Node.ELEMENT_NODE && this.ownerDocument && this.ownerDocument.documentElement ){
      resolver = document.createNSResolver (this.ownerDocument.documentElement);
    }else{
resolver = document.createNSResolver (this);
    }
  return realSelectSingleNode.apply (this, [expr, resolver]);
}


Now, up till now the patching has been quite ordered. A patch is a bit of a kludge anyway, but so far it's been a nice kludge. Here come the problems that were too ugly for a nice kludge. Certain issues just required search and replace operations on the script source code, nothing else to do about them - and because Yahoo source is compressed and variable names random, search and replace must take that into account and go for seriously complicated and ugly regular expressions.

opera.addEventListener('BeforeScript', function(e){
// This is the riskiest patch
// Fixing typo: missing ' after attribute value
e.element.text=e.element.text.replace( /\):\(">"\)\),/, "):(\"'>\"))," );
e.element.text=e.element.text.replace( / id='_test_add_folder>/, " id='_test_add_folder'>" );


Yep, twice in their source code they say things like <tag class="foo> omitting the closing quote. That caused Opera to parse it as a text node instead of an element, meaning source code would appear here and there in the interface. Oops.

// WebForms2 problem: button attribute "action" is a URL in WF2
e.element.text=e.element.text.replace( /\.(action\b)/g, ".js$1" );


Specs and implementations collide again: if you set input.action to a value like 'markAsSpam' the WF2 spec means it will be resolved as a URL, so when the script reads it again it will see 'http://mail.yahoo.com/dc/markAsSpam' which is not at all what it expected.

// send button not working - attribute nodes must be in the document they will be used
e.element.text=e.element.text.replace( /(\w)\.setAttributeNode\((\w)\)/, "$1.setAttributeNode($1.ownerDocument.importNode($2, true))" );


Hm, is Firefox sloppy with exceptions on cross-document node usage again?

// workaround for getting the documentElement.xml  markup
e.element.text=e.element.text.replace( /(([\w\.]*)documentElement).xml/g, "(document.implementation.createLSSerializer()).writeToString($1)" );


IE's XML DOM rides again. Elements over there have an .xml property which is basically the equivalent of .innerHTML for an HTML element, showing the inner serialised markup.

I didn't take a long and hard look at how Y!Mail used it but some of that seemed very weird. I had the impression that they read .xml of the contacts list only to pass it around as a string and use loadXML later on. Why would they serialize markup just to parse it right into a DOM tree again? Oh well, there is probably some complex reason..

// for...in on objects run into our faked __defineGetter__ and __defineSetter__
// we try to add an exception to any for...in loops
e.element.text=e.element.text.replace( /(for\((var |)(\w*) in \w*\)\{)/g, "$1if($3.match(/^__define/))continue;" );


It turns out "fake it until you make it" wasn't such a good idea after all. The site called our bluff with code like
function foo(obj){
for( p in obj )return false; return true
}

var bar = {}; if(!foo(bar)) return;

and what exactly they meant by that I don't know either, except to check that a newly created object really REALLY had no properties. Huh?


// To: / CC: autocomplete fails
// we support IE's TEXTAREA.createTextRange but unfortunately not its boundingLeft property. Improving object detection..
if(e.element.src&&e.element.src.match(/ac\.js$/))e.element.text=e.element.text.replace( /if \( editCtrl\.createTextRange \)/, "if ( editCtrl.createTextRange  && editCtrl.createTextRange().boundingLeft )" );


This is a typical trap of piecemeal implementation of something: we support whatever of IE's stuff was deemed important to get some plaintext formatting JavaScript to work with 8.x. .boundingLeft wasn't on the list back then. Sorry. Look a bit harder when you look for something.

// Preferences not read correctly from XML attributes
// IE has an attribute node .text property. .nodeValue will work in Opera..
e.element.text=e.element.text.replace( /\.selectNodes\((\w*)\);\s*\}return\((\w*)\.length\)\?(\w*)\[0\]\.text:/g, ".selectNodes($1);}return($2.length)?$2[0].nodeValue:" );


IE again. text alias nodeValue, enough said.

// We throw an unwanted exception if both arguments to insertBefore are the same node
e.element.text=e.element.text.replace( /var (\w*)=(\w*)\?(\w*)\.nextSibling:(\w*)\.firstChild;\s*(\w*)\.insertBefore\((\w*),(\w*)\);/, "var $1=$2?$2.nextSibling:$4.firstChild;if($6!=$1) $4.insertBefore($6,$1);" );


Now this is plainly a bug. The DOM spec says we should throw an error if the node you insert is a parent of the reference child, but we also did so if the inserted node was the reference child itself.

The code doesn't make sense, mind you... Why do you want to replace an element with itself?

// Opera 9.00 and 9.01 has a bug that means createContextualFragment on table elements is unreliable
// easily the worst patch.. but then it works around a really tricky bug..
if( navigator.userAgent.indexOf('9.01')>-1 || navigator.userAgent.indexOf('9.00')>-1 ){ 
// UA detection to target specific bug in specific version is OK
e.element.text=e.element.text.replace( /(\b(\w*)\.selectNodeContents\((\w*)\);\s*var (\w*)=(\w*).createContextualFragment\((\w*)\))/, "if($3.tagName=='TBODY'||$3.tagName=='TR'){ $2.createContextualFragment=function(s){var n=s.match(/<(\\w*)/)[1]; var e=document.createElement('div');e.innerHTML='<table><tbody>'+s+'</tbody></table>';return e.getElementsByTagName(n)[0];  } }$1" ); 


Just as ugly as it looks, just an attempt to make the code work in 9.01.
Imagine if Yahoo!Mail blocked us until they one day decided that it was necessarily to start working on Opera compatibility? If we never had gotten to test their system, they would probably have to add such an ugly workaround to their application to get around this bug. Developers everywhere: please, don't sniff, just leave more of the burden of compatibility on the UA's table (and listen to feedback!).

Hey, we're done with the replacements! It wasn't pretty, and I look forward to deleting one by one while things are fixed on either side. That will also give us a nice performance lift. Their scripts are huge. At some point I did some profiling of the above replace calls and found that up to half of the time it took Opera to load Y!Mail was spent applying the above patches.

}
}, false)

// No scrollbars appear for message list..
// uses an "overflow" CSS property to control scrollbars. e.overflow="-moz-scrollbars-vertical", and some odd clipping as well..
document.addEventListener( 'load', function(){ setTimeout( function(){ var divs=document.getElementsByTagName('div');for(var i=0,div;div=divs[i];i++)if(div.className&&div.className.indexOf('fakeScrollBar')>-1){div.style.overflow='auto';div.style.clip='auto';}},500);}, false );


Yes. That problem. It uses some CSS I still haven't fully understood.. I think they wanted to show an element with a scroll bar but clip the whole element away so only the scroll bar would be visible.

// redraw problem hides To: field in compose screen
document.addEventListener( 'load', 
function(){ if(top.document.frames['newmessage']){
setTimeout( function(){try{top.document.frames['newmessage'].document.body.className+=' ';}catch(e){}},1000);
}}, true);


This is another tricky one, it has to do with timing and I still haven't quite captured the sequence of events. Basically the "To" and "Subject" fields in the compose screen disappear until you click "Show BCC".

// sluggish performance due to unintended event capture
(function(ael){ 
window.addEventListener = function(type, func, capture){ ael.call(window, type, func, false); }
})(window.addEventListener);


..and just to top if off, they had to capture events by mistake. Of course. That's from the curriculum of "How to code Opera-incompatible websites 101".

opera.postError( 'Yahoo mail patched' );
}
}


Yippee! We did it!

Now, this patch is (repeat after me) in beta! There are problems that I'm aware of but haven't fixed, and there are problems that I'm not aware of and haven't fixed. And while I was working on this stuff, code changes in Yahoo mail would break things again every few days (and even break differently on the U.S. and the U.K. sites!). So, hurry up and try it while you have a chance! I'll try to keep the patch maintained, and we sure hope that we'll get all the issues sorted out from either side for a fast, friendly, responsive experience - somewhere in the future..

Yahoooooooooooooooo!

, , ,


I have a picture of thousand words, it speaks volumes about the unjust web.

Watch the animated comparison between the code Opera gets from Yahoo Mail and the code IE gets. See the full dinner of HTML and JavaScript that is sent to IE versus the meagre crumbs and tap water they serve Opera. Yahooooo for blatant browser discrimination!