When software does not work for me, I always try to fix it. With Opera this is a little hard: it's closed source, so I can't hack around as easily as I could with Safari or Firefox.
But as the latest Opera build (10005) broke one of the most important features for me (namely XPath), I had to do something, even though I knew it wouldn't be easy. So it was time to reach for low-level tools.
Setting up a workplace...
To begin, the Run button has to be pressed. Opera starts, but we are now inside the opera.exe module, which is not really what we want. The real Opera code is inside opera.dll, and getting there is not a problem thanks to OllyDbg's "Executable modules" window. After we run Opera, this window will list our opera.dll module, which we can double-click to go to the DLL's entry point.
Now one has to think about a way to hook into the bug. This part requires some thinking (as if the other parts didn't). What do we know about the bug? Doing document.selectNodes('//div') does nothing. There is almost nothing to hook into (like a big alert box with the message "I'm broken, come see why"). Almost... there is one error message in the Console that says something about a TYPE_ERR exception. OK, this should be enough to trace back to the cause of the problem.
TYPE_ERR is a string hardcoded inside opera.dll. I've chosen quite a simple way to get near the code that generates this message. Strings and other read-only data live in the .rdata section of an executable. They can also live in the resources section (.rsrc), but as Opera is not limited to Windows, it does not really use resources for anything besides some simple forms and icons.
So we should look for our string inside the .rdata section. But how do we get to this place in memory? OllyDbg comes to the rescue again: we just need to open the "Memory map" window and look further down for our Opera_1 module (opera.dll). There we see its .rdata section, which we can double-click.
A window full of data opens. We use CTRL+B to search for our TYPE_ERR string. The first match looks like our target, but it's actually SPECIFIED_EVENT_TYPE_ERR, which can be seen by scrolling one line up. The next one (CTRL+L) is also not the one we want, but after a few more tries we eventually find what we are looking for.
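The false hits come from longer strings that merely end with TYPE_ERR. If you dump the section to a file, the filtering can be automated. Below is a minimal sketch, assuming the strings are stored as NUL-terminated ASCII (an assumption; I only looked at them in the debugger). The function name and the sample data are made up for illustration and are not taken from opera.dll.

```python
def find_standalone(data: bytes, needle: bytes) -> list:
    """Return offsets where `needle` is a complete NUL-delimited string,
    not just a fragment of a longer one."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        end = pos + len(needle)
        # A standalone string has a NUL (or the buffer edge) on both sides.
        starts_clean = pos == 0 or data[pos - 1] == 0
        ends_clean = end == len(data) or data[end] == 0
        if starts_clean and ends_clean:
            hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


# Made-up stand-in for a dump of the .rdata section.
sample = b"\x00SPECIFIED_EVENT_TYPE_ERR\x00INVALID_EXPRESSION_ERR\x00TYPE_ERR\x00"
offsets = find_standalone(sample, b"TYPE_ERR")
```

With this sample the plain search finds TYPE_ERR twice, while the filtered search keeps only the match that starts right after a NUL byte.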
We found the string that is used in the error message, but what can we do with it, and how do we find the code that generates the message? The trick is to use a hardware breakpoint, which can be set on almost any memory area and will stop program execution whenever the program tries to read or write that part of memory.
So right-click the first byte of this string and, from the menu, set a hardware breakpoint on access. BTW, we can't just search the disassembled code for references to this memory address, as Opera does not directly reference strings by their memory addresses.
Now it's necessary to trigger the TYPE_ERR error in Opera. We open some site that has divs, put the failing call from above (document.selectNodes('//div')) inside the address field and press Enter.
Debugger triggered. After a bit of thinking and looking at the code, I assumed that I was somewhere in the code that generates the XPathException object.
The current stack state seemed to confirm this assumption, as there was a very telling string (XPathException) a bit further down. When debugging unknown code you have to "assume" a lot of things and/or follow your intuition, as you probably won't understand, or even need to understand, 95% of the code you see.
Where should we go from here? We are inside code that probably builds the exception object. That means Opera has already decided (wrongly) that our call to selectNodes is invalid and is about to throw an exception. We have to look before that, to find the spot that decides whether the exception should be created.
We can use CTRL+F9 to run to the "ret" instruction, which returns from the current call and takes us back to the code that invoked it. We do that a few times until the stack window shows our XPathException string at the top. We should land right after the call that creates the (assumed) exception.
The next part got a little monotonous and I don't really remember exactly what I did. But generally I was running two Opera debugging instances (the broken build and the one before it, which was fine), stepping through the code manually and paying attention to calls/jumps that were made in one and not the other.
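I did this comparison by hand, but the idea can be sketched in a few lines: write down what happens in each build as a list of events and look for the first position where the two lists disagree. The traces below are invented placeholders, not real logs from Opera, and they record only what happened (not addresses, which differ between builds).

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, event_a, event_b) for the first differing step,
    or None when both traces are identical."""
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None


# Hypothetical, hand-written traces for illustration only.
working_trace = ["call returns", "jump not taken", "loop over nodes"]
broken_trace = ["call returns", "jump taken", "build exception"]

divergence = first_divergence(working_trace, broken_trace)
```

Everything before the reported index behaves the same in both builds, so that is the place to start looking more closely.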
Eventually I stumbled upon a call that was returning FFFFFFFF (-1, presumably signalling failure) in the EAX register in the broken build and 00000000 in the working build. This obviously was the call that needed more thorough analysis. Calls like Call [eax+45] are rather tricky to debug using a dead-listing approach (with IDA, for example), as you won't know what the eax register points to. OllyDbg FTW!!
Tracing into this function revealed some quite simple-looking code.
Again I used my intuition (not counting trace logs, lots of memory breakpoints, seeing how the other build works...) to conclude that this piece of code compares 0 with the number of elements matched by our XPath query (which should be known by now). The problem was that in the broken build the number of elements matched was "0", so the jump WAS taken, while in the working build it was not.
I had to look earlier, for the code that sets or erases this memory location. I figured out that even the broken build gets this number right earlier in the code but somehow "forgets" it along the way. Stepping line by line and setting hardware breakpoints on memory (offsets were different every time the XPath query was rerun, which made it harder to trace) finally led me to the code responsible.
So what happens here?
First line: EDI holds our "magic" number of matched elements. This value is copied to some place in memory, but we don't care about that.
Then the value in EDI is shifted left by 2 bits, which results in value*(2^2). The result is used in a call whose purpose I don't really need to know.
After the call there is a TEST ESI,ESI, which is probably checking whether our "matched elements" value is different than 0. But ESI is not really set anywhere nearby, so it does not look like the right value to be checking.
The working build had slightly different code here. It copied the value of EDI to ESI before shifting EDI, so when testing ESI it actually had the proper number of matched elements and the jump was not taken (as expected).
It then proceeds to enumerate all matched elements and probably builds a NodeList object...
So to fix the bug I chose to replace TEST ESI,ESI with TEST EDI,EDI, because EDI is bound to hold some non-zero value when elements are matched, and everything plays well. The patch could be less hackish if there were two bytes of space available for an additional instruction, but that was not an option.
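To make the three variants easier to compare, here is a small Python model of the register logic described above. It is only a sketch: registers are plain integers, the call in the middle is left out, and the stale ESI value of 0 in the broken build is an assumption based on what I saw in the debugger. Each function returns True when the jump (the "nothing matched" path) is taken.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def broken_build(matched, stale_esi=0):
    edi = matched
    esi = stale_esi            # ESI is never loaded with the count (assumed 0)
    edi = (edi << 2) & MASK32  # shift left by 2, i.e. value * 4
    return esi == 0            # TEST ESI,ESI: jump taken when ESI is zero


def working_build(matched):
    edi = matched
    esi = edi                  # count is saved in ESI before the shift
    edi = (edi << 2) & MASK32
    return esi == 0            # TEST ESI,ESI now sees the real count


def patched_build(matched):
    edi = matched
    edi = (edi << 2) & MASK32
    return edi == 0            # TEST EDI,EDI checks the shifted count


jump_taken = {
    "broken": broken_build(5),
    "working": working_build(5),
    "patched": patched_build(5),
}
```

For any realistic count the patched logic agrees with the working build. In this model they differ only when the shift overflows 32 bits (a count of 2^30 or more), which is one more reason to call the patch hackish.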
To fix the build, one has to unpack it with UPX using the command "upx -d opera.dll" and then replace these bytes:
83 C4 0C 85 F6 76 51 83 65 FC 00
with these:
83 C4 0C 85 FF 76 51 83 65 FC 00
(only one byte actually changes: F6 becomes FF)
For hex editing one can try XVI32.
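If you prefer a script over a hex editor, the same replacement can be sketched in Python. This assumes the DLL has already been unpacked and that the byte sequence occurs exactly once; the function names and the idea of refusing to patch on zero or multiple matches are my own additions, not part of the original fix.

```python
OLD = bytes.fromhex("83 C4 0C 85 F6 76 51 83 65 FC 00")
NEW = bytes.fromhex("83 C4 0C 85 FF 76 51 83 65 FC 00")


def patch_bytes(data: bytes) -> bytes:
    """Swap the one differing byte, but only if the pattern is unambiguous."""
    count = data.count(OLD)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return data.replace(OLD, NEW)


def patch_file(src_path: str, dst_path: str) -> None:
    """Read the unpacked DLL, patch it and write the result to a new file."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(patch_bytes(data))
```

Writing to a separate output file keeps the original DLL around in case something goes wrong.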