Opera Core Concerns

Official blog for Core developers at Opera

Carakan Revisited

A little more than a year has passed since we launched the Carakan project, aimed at drastically improving Opera's ECMAScript execution performance, and it's finally time for the first labs release of Opera with the Carakan ECMAScript engine.

What we set out to implement over a year ago was, as I described in a previous post about the Carakan project, a new cross-platform bytecode interpreter for a new register-based instruction set, a new internal object model with automatic classification and inline property caching, and machine code generation. All this we've done, and then some.
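To make "automatic classification" and "inline property caching" a little more concrete, here is a minimal Python sketch of the general technique. It is not Carakan's code: the names `Shape`, `Obj` and `GetSite` are hypothetical, and the single-entry cache is an assumption chosen for brevity. Objects that gain the same properties in the same order end up sharing one class, and a property-read site remembers the class and slot index it saw last so that it can skip the name lookup next time.

```python
class Shape:
    """Hypothetical 'class' shared by all objects with the same property layout."""

    def __init__(self, slots=None):
        self.slots = slots or {}   # property name -> index into Obj.values
        self.transitions = {}      # property name -> Shape reached by adding it

    def with_property(self, name):
        # Reuse the existing transition so identical layouts share one Shape.
        if name not in self.transitions:
            slots = dict(self.slots)
            slots[name] = len(slots)
            self.transitions[name] = Shape(slots)
        return self.transitions[name]


ROOT = Shape()  # the class of an object with no properties


class Obj:
    def __init__(self):
        self.shape = ROOT
        self.values = []

    def put(self, name, value):
        index = self.shape.slots.get(name)
        if index is None:
            # New property: move to the next class and append the value.
            self.shape = self.shape.with_property(name)
            self.values.append(value)
        else:
            self.values[index] = value


class GetSite:
    """One property-read site in a script, with a single-entry inline cache."""

    def __init__(self, name):
        self.name = name
        self.cached_shape = None
        self.cached_index = None
        self.hits = 0
        self.misses = 0

    def get(self, obj):
        if obj.shape is self.cached_shape:
            # Fast path: same class as last time, so the slot index is known.
            self.hits += 1
            return obj.values[self.cached_index]
        # Slow path: look the name up, then remember the result.
        self.misses += 1
        index = obj.shape.slots[self.name]
        self.cached_shape, self.cached_index = obj.shape, index
        return obj.values[index]
```

Two objects built with `put("x", ...)` followed by `put("y", ...)` share a single `Shape`, so a `GetSite("y")` that has read from the first object takes the fast path when reading from the second.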

The new bytecode interpreter and new object model are cross-platform, meaning they will work on any hardware platform Opera is ported to. On their own, they already give a significant performance boost compared to Futhark, the engine used in all current versions of Opera. Running on a regular desktop computer, Carakan's bytecode interpreter is around 3.5 times faster than Futhark on the SunSpider benchmark, and on embedded systems running less powerful CPUs, early testing shows it to have an even greater advantage over Futhark.

For optimum performance, however, machine code generation, or JIT, is the way to go, and this is where we have focused most of our optimization work. Carakan is equipped with a hot-spot detecting JIT compiler that generates machine code that performs all but the most complex operations directly without calling the bytecode interpreter. It employs a combination of compile-time static analysis of the program and runtime profiling in the bytecode engine to optimize the generated code, focusing in particular on code that does arithmetic calculations. It also performs function inlining, both of simple built-in functions such as Math.sqrt() and String.charCodeAt() and of functions implemented in the script. Currently the JIT compiler only supports generating 32-bit or 64-bit x86 machine code, but support for other architectures will be added in time, starting with the ARM architecture.
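The idea of hot-spot detection can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not how Carakan is implemented: the call-count threshold `HOT_THRESHOLD`, the `Function` class and the `compile_native` stand-in are all hypothetical, and a real engine would emit machine code rather than wrap a Python callable.

```python
HOT_THRESHOLD = 3  # hypothetical: interpreted calls before a function counts as hot


class Function:
    def __init__(self, name, body):
        self.name = name
        self.body = body      # a Python callable standing in for bytecode
        self.calls = 0        # profiling counter kept by the interpreter
        self.compiled = None  # filled in once the function becomes hot


def interpret(fn, *args):
    # Stand-in for the bytecode interpreter.
    return fn.body(*args)


def compile_native(fn):
    # Stand-in for the JIT compiler; a real one would generate machine code.
    body = fn.body

    def native(*args):
        return body(*args)

    return native


def call(fn, *args):
    # Already compiled: run the "native" version directly.
    if fn.compiled is not None:
        return fn.compiled(*args)
    # Otherwise count the call, compile once the function is hot,
    # and let the interpreter handle this call.
    fn.calls += 1
    if fn.calls >= HOT_THRESHOLD:
        fn.compiled = compile_native(fn)
    return interpret(fn, *args)
```

With this scheme a function that is called only once or twice never pays the compilation cost, while a function called in a tight loop is compiled after its third interpreted call and runs the compiled version from then on.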

But this is not all we have done in the Carakan project. I'd like to also mention two other interesting improvements that we've implemented compared to Futhark: a divided garbage collected heap, and caching of compiled scripts.

Divided garbage collected heap

The ECMAScript language assumes the presence of a garbage collector that automatically reclaims memory occupied by objects that are no longer needed. Carakan's garbage collector is very similar to the one used in Futhark, a basic mark-and-sweep design; we've only made some small, but rather effective, tweaks to its performance. We have, however, drastically changed how we use the garbage collector. In Futhark, all memory allocated by the ECMAScript engine for scripts running in any tab was allocated from a single shared heap, and anytime the garbage collector needed to run to free up memory, it would traverse all allocated memory. The more open tabs there were, the more expensive a garbage collection would become.

In Carakan, we instead use many smaller heaps. Each document loaded in a tab, or in a frame or iframe inside another document, gets its own. Since scripts running in different documents can sometimes access each other's objects, we have support for merging two heaps into one, and for detecting when this is necessary. The advantage of this design is clear: with smaller heaps, each garbage collection is cheaper. And since we only need to run the garbage collector on heaps from which memory has been allocated, we automatically only traverse the memory of active heaps, and leave all other heaps alone. The end result is that it doesn't matter if there are 1 or 100 open tabs; when a script triggers a garbage collection, the cost is the same.
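The per-document heaps and the merge step can be sketched in a few lines of Python. This is a simplified model, not Carakan's implementation: the `Heap` class, the union-find style `parent` pointer and the `allocated_since_gc` counter are assumptions made for illustration, and the actual mark-and-sweep pass is left out.

```python
class Heap:
    def __init__(self, name):
        self.name = name
        self.parent = self           # points at another heap once merged into it
        self.objects = []
        self.allocated_since_gc = 0  # only heaps with allocations need collecting

    def find(self):
        # Follow parent pointers to the heap that currently owns the objects,
        # shortening the chain along the way.
        heap = self
        while heap.parent is not heap:
            heap.parent = heap.parent.parent
            heap = heap.parent
        return heap


def allocate(heap, obj):
    root = heap.find()
    root.objects.append(obj)
    root.allocated_since_gc += 1


def merge(a, b):
    # Called when a script in one document gets hold of an object in another.
    ra, rb = a.find(), b.find()
    if ra is rb:
        return ra
    rb.parent = ra
    ra.objects.extend(rb.objects)
    rb.objects = []
    ra.allocated_since_gc += rb.allocated_since_gc
    rb.allocated_since_gc = 0
    return ra


def heaps_needing_gc(heaps):
    # Idle heaps are skipped entirely, however many of them there are.
    roots = []
    for heap in heaps:
        root = heap.find()
        if root.allocated_since_gc > 0 and root not in roots:
            roots.append(root)
    return roots
```

In this model an idle tab contributes nothing to the cost of a collection: its heap has no new allocations, so `heaps_needing_gc` never returns it.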

Cached compiled programs

An aspect of an ECMAScript engine that performance benchmarks often don't measure is the performance of the compiler. Compared to Futhark, the Carakan compiler is much more focused on analysing the program and generating code that will execute fast, and may therefore be slightly slower in some cases. This is a trade-off we've made willingly.

Instead of the very efficient compiler in Futhark, Carakan brings caching of compiled programs. In practice this means that whenever a script program is about to be compiled and its source code is identical to that of some other program that was recently compiled, we reuse the previous output from the compiler and skip the compilation step entirely. This cache is quite effective in typical browsing scenarios where one loads page after page from the same site, such as different news articles from a news service, since each page often loads the same, sometimes very large, script library.
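A cache of this kind can be sketched as a small least-recently-used map keyed by source text. The Python below is only an illustration: the `CompileCache` class, its `capacity` limit and the LRU eviction policy are assumptions, since the post only says that "recently compiled" programs are reused.

```python
from collections import OrderedDict


class CompileCache:
    """Hypothetical cache that maps script source text to compiler output."""

    def __init__(self, compile_fn, capacity=32):
        self.compile_fn = compile_fn
        self.capacity = capacity      # assumed limit on cached programs
        self.entries = OrderedDict()  # source text -> compiled program
        self.hits = 0
        self.misses = 0

    def get(self, source):
        if source in self.entries:
            # Identical source seen recently: reuse the earlier output.
            self.entries.move_to_end(source)
            self.hits += 1
            return self.entries[source]
        # Otherwise compile, remember the result, and evict the oldest entry
        # if the cache has grown past its capacity.
        self.misses += 1
        program = self.compile_fn(source)
        self.entries[source] = program
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return program
```

For example, with `cache = CompileCache(lambda src: compile(src, "<script>", "exec"))`, using Python's built-in `compile` as a stand-in for the script compiler, asking for the same library source on a second page load returns the already compiled program without compiling again.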

Plans for the future

Although we're nearing the release of the Carakan engine, we don't plan to stop development of it. We have plenty of ideas on smaller and larger improvements to make, and we will also port the JIT compiler to other CPU architectures.

One area where we believe we can improve greatly is in memory usage, by switching to a much more efficient object representation. Carakan will already today use less memory than Futhark in some cases, by sharing information between similar objects via the automatic object classification system and by sharing literal data using a copy-on-write scheme, but we have plans that would reduce the size of ECMAScript objects to as little as a tenth of their current size.

We will also be looking at improving the performance of machine code generated for non-arithmetic code such as property accesses, where our JIT compiler currently produces significantly less streamlined code than it does for arithmetic calculations.


Comments

Ken Rushiakrushia Tuesday, December 22, 2009 7:56:52 AM

Wow.

Øzikzakatak Tuesday, December 22, 2009 7:59:04 AM

thanx a lot for this info...

Tuttle Tuesday, December 22, 2009 8:10:21 AM

Sounds nice especially for the "Cached compiled programs".

d4rkn1ght Tuesday, December 22, 2009 8:31:50 AM

cool

Ian Doyletalam Tuesday, December 22, 2009 8:45:56 AM

I saw "and it's finally time for the first labs release of Opera with the Carakan ECMAScript engine." And got way too excited...thought we might get an early Christmas present. :)

RoyiDrazick Tuesday, December 22, 2009 10:16:21 AM

That is great news.
How is it compared to V8, Nitro?

Performance is the main feature I choose a browser by.
Carakan might bring me back to Opera.

Purdi Tuesday, December 22, 2009 10:20:40 AM

Originally posted by Drazick:

How is it compared to V8, Nitro?


Carakan is the fastest engine right now.

ritmocafe Tuesday, December 22, 2009 10:30:37 AM

WOW! Does Opera put you guys on steroids at work?

bachokocho Tuesday, December 22, 2009 10:32:29 AM

Originally posted by Purdi:

Carakan is the fastest engine right now.



Purdi, would you post some data! My observations show that it has the qualities but needs polishing :)

Purdi Tuesday, December 22, 2009 11:07:49 AM

Originally posted by bachokocho:

Purdi, would you post some data! My observations show that it has the qualities but needs polishing


No, it's definitely faster. Try Sunspider. Ignore the V8 benchmark as it's crap, by the way. Google loves cheating.

Daniel HendrycksDanielHendrycks Tuesday, December 22, 2009 12:46:04 PM

yes

Pallab DeIndyan Tuesday, December 22, 2009 1:51:34 PM

Most of it was greek to me, but I enjoyed reading it anyway.

Originally posted by bachokocho:

Purdi, would you post some data! My observations show that it has the qualities but needs polishing


You can find some data here. Opera is now indeed the fastest browser. It feels really nice, since Opera was always known for its speed.

Teoumbra-tenebris Tuesday, December 22, 2009 3:12:20 PM

If I remember correctly, Lua 5.0 has employed a register-based VM since 2003 and it consistently held the crown as the fastest script language since then. I was mildly shocked to learn that Opera haven't used this idea till now.

I've always liked Opera for the comfort it gives me with its thousands of little helping functions but I would not turn down a faster js engine. Keep up the good work, we expect nothing less than technical marvels from you, guys and gals!

Bruno Casanobrunitoc Tuesday, December 22, 2009 4:12:37 PM

Great article !!!! Even more incredible release !!! WOOW great job guys !!!! :D

Happy holidays !!!

Pallab DeIndyan Tuesday, December 22, 2009 6:28:16 PM

Originally posted by umbra-tenebris:

I've always liked Opera for the comfort it gives me with its thousands of little helping functions


+1

Cutting Spoonhellspork Tuesday, December 22, 2009 6:54:55 PM

@Teo: For that matter, some Perl implementations have used partial compiles from the beginning of the language. Java and JavaScript are managed-code specifications for cross-platform use. I've mostly only seen the simplified MurgaLUA, and then only under Linux. For that matter both Ruby and Python have made strides in JIT, and .sh objects continue to become ever more efficient. A custom, cross-platform JavaScript engine with on-the-fly transcoding is a big deal. Jens Lindström stated that Carakan can generate native x86 and x86-64, with plans for ARM and other platforms soon (GP-GPU?). I don't know if any other browser's JIT implementation has gotten to the matter of non-x86 platforms yet; possibly Fennec or the ChromeOS for ARM platforms?

I am very grateful for this candid disclosure of new features and limitations; looks like Carakan has a great deal of potential. If Opera can deliver on CPU savings and reduced memory requirements with JIT, then Opera Mobile with JIT will be extraordinarily capable. 3x-5x CPU efficiency and up to 10x memory efficiency would allow ANY modern JS/Gears-based site to run on a smartphone.

vikyboss Tuesday, December 22, 2009 7:49:14 PM

Keep up the work, develop the engine. Update it often and that's the way Opera can get good. Thanks for the first superb build.

Teoumbra-tenebris Tuesday, December 22, 2009 8:20:45 PM

It seems you misunderstood me. I consider Carakan to be a marvelous technical achievement. I was surprised that Futhark is that speedy without using a register-based VM.

Daniel HendrycksDanielHendrycks Tuesday, December 22, 2009 11:10:30 PM

up Fearphage?

Charles SchlossChas4 Wednesday, December 23, 2009 1:38:03 AM

up

Ice ArdorIceArdor Wednesday, December 23, 2009 9:07:15 AM

Thanks for the details. I've been kinda curious how the "new engines" are being produced.

Jimtoyotabedzrock Thursday, December 24, 2009 3:19:20 AM

@Pallab De: What kinda system are u running? The V8 bench gives me a score of 2300-2500. Also you should compare the V3 and V4 of SRWare Iron next time, they generally run a little faster than Chrome

DailyTech.com has a nice article on the recent betas

Witold Barylukmovax Friday, January 1, 2010 6:59:15 AM

JIT works beautifully. It can easily be seen on the Peacekeeper benchmark. In the "colliding balls" and "water on canvas" tests, you will see that it begins with quite good performance (compared to other browsers), but after about 3 seconds, it instantly switches to ultra-smooth animation. Carakan just JITed some JavaScript! :) Unfortunately Carakan doesn't cache this JITed code (which can be seen by restarting the benchmark).

Good work guys. It is one of the most amazing pieces of engineering in Opera. Only LuaJIT amazed me more :) Keep going.

Rafald.i.z. Friday, January 1, 2010 4:23:42 PM

but after about 3 seconds, it instantly switches to ultra-smooth animation. Carakan just JITed some JavaScript!


Not sure that's a correct analysis of what's happening. As far as I understand, Carakan tries JIT first and eventually falls back to the interpreter. Not the other way around.

Cutting Spoonhellspork Saturday, January 2, 2010 1:31:19 AM

On examination, some of those test cases primarily benefit from the new Vega system. We won't see the true benefit of Carakan until compile-support is more streamlined for complex operations.

Jens LindströmJensL Sunday, January 3, 2010 8:24:18 AM

Originally posted by d.i.z.:

As far as I understand, Carakan tries JIT first and eventually falls back to the interpreter.



No, all code is executed by the interpreter initially. Only code that we notice is executed "a lot" gets JIT:ed. This avoids JIT:ing entire large JS libraries linked in by pages if only small parts of them are used much.

But then even JIT:ed code falls back to the interpreter in complex cases. As cases get more complex, we'd need to generate more and more JIT code to handle them while typically gaining less and less performance by doing so.

Luis Miguel Gonzálezluismgz Thursday, February 4, 2010 2:22:51 AM

Originally posted by JensL:

No, all code is executed by the interpreter initially. Only code that we notice is executed "a lot" gets JIT:ed. This avoids JIT:ing entire large JS libraries linked in by pages if only small parts of them are used much.



Jens, I see that Carakan, as well as Nitro and Tracemonkey, interprets code until it gets hot before generating native code.

However, V8 generates native code upfront, with no intermediate bytecode nor interpretation. This is where my head explodes...
How can v8 generate native code without interpreting (and gathering type information) first?
I suppose it performs generic operations first and then patches the code while gathering type feedback, but can it be done without an interpreter?

Sorry to ask about this, but I'd like to know the different approaches taken by these engines to know them better.

ЕвгенийEvgenyIst Sunday, March 7, 2010 2:25:04 PM

Greetings to all developers of the "Carakan" engine!

Your JavaScript engine is now the fastest. Congratulations! But it is fast only on fast configurations.

And how do things stand on less powerful computers (and netbooks)? How is the optimisation there? Not everything is as good as one would like.

Look at how the Chrome browser (5.0 beta) does on weak computers. The V8 JavaScript engine is optimised much better.

Look here:

http://clients.futuremark.com/peacekeeper/results.action?key=35wu

http://dromaeo.com/?id=95846 - Chrome 5.0.342.2

http://dromaeo.com/?id=95808 - Opera 10.50

Configuration: Athlon 2500 + (Barton), 1Gb memory, OS Win7 Ultimate

Cutting Spoonhellspork Monday, March 8, 2010 5:30:18 PM

http://impact.crhc.illinois.edu/ece512/papers/Athlon.pdf
(PS: Holy crap, restoring a tab remembered the PDF scroll position!)

The problem is that 10.50 requires SSE2 for best performance of the new JIT engine in Carakan. The AthlonXP does not support SSE2, only SSE. As you can see in your own results, only the very newest Chrome 5.x candidates have begun to address this issue in Chrome. Given Opera has supported PPC chips with Carakan, support may eventually come to pre-SSE2 x86 CPUs also.

Looking at the extended graph in Peacekeeper, I think Opera should run about as fast as Chrome in normal browsing on your PC. The rendering and complex graphics numbers are important.

The below links are detailed articles on the leap from AthlonXP to Athlon64; even if Carakan supports plain SSE, your installation of Opera may "only" be about as fast as Chrome due to limitations in the AthlonXP's SSE throughput.

http://www.chip-architect.com/news/2001_10_02_Hammer_microarchitecture.html
http://www.chip-architect.com/news/2002_06_24_Hammers_Two_Extra_PipelineStages.html

EDIT: Even netbooks with the Intel Atom or AMD Neo support modern SSE2 instructions, and Opera 10.50 is much faster than Chrome on most of these machines. Even netbooks have improvements that allow them to be faster than your AthlonXP 2500+.

ЕвгенийEvgenyIst Monday, March 8, 2010 6:09:26 PM

Many thanks for the detailed answer! Now it is all clear to me... :)

Witold Barylukmovax Wednesday, March 10, 2010 5:06:07 AM

What is actually the reason for SSE2? As I understand it, it is used for floating-point math, right? What important functions are in SSE2 that aren't in SSE1?

And are you using SSE2's ability to perform float operations on two independent double-precision values in parallel (vectorization)?

If not, isn't there a loss of memory and memory bandwidth involved?

I don't know if there has been any research on vectorizing JITs, especially for dynamic languages, but it could be very, very interesting. (For single-precision math it can give an even bigger speed improvement for numerically intensive code.)

I understand that SSE was implemented first because it is the only FPU on amd64 (and compatible) processors, right?

So will there be SSE1, or better, just 387 support?

Also, as we now have PPC support, I have another question: are you using the normal FPU or AltiVec?

Jens LindströmJensL Wednesday, March 10, 2010 11:50:04 AM

What SSE2 adds, that we are using in Carakan, are simple add, subtract, multiply and divide operations on double precision floating point values, and 8 (or 16 in 64-bit mode) general purpose double precision floating point registers. SSE1 only supported single precision floating point values, I think. We don't use any of the SIMD components of the instruction set.

The x87 FPU is stack based, and not quite as convenient to work with (that is, generate code for.) We have support for using the x87 FPU instead, but that support was not prioritized and not tested or bug-fixed enough before the release, so it is currently not usable in practice.

On PPC we don't support JIT at all at the moment.

Cutting Spoonhellspork Wednesday, March 10, 2010 5:02:53 PM

Well, this makes a lot more sense. Not using JIT on PPC is no great surprise; even getting the engine to work at all must have been difficult initially.

Aren't x87 instructions disabled in Long Mode? I seem to recall hearing that x86-64 removed several process priority rings, and dropped x87 in favor of SSE/SSE2. It initially broke the Xen hypervisor in Long Mode, and hand-tuned x87 assembly language ran up to seven times slower in Long Mode.

EDIT: Two questions, first is whether JIT can be explicitly enabled or disabled in some config document. Second is whether lack of JIT on old CPUs produces a significant memory savings.

Jens LindströmJensL Thursday, March 11, 2010 9:02:27 AM

I'm not aware of x87 instructions being disabled in 64-bit mode. But all x86 CPUs that support the 64-bit extensions also support at least SSE2, so there's little point in using x87 math if you're compiling 64-bit code. GCC will for instance typically generate all x87 math when compiling 32-bit code, since the executable might be run on a CPU without SSE2, but all SSE2 math when compiling 64-bit code. Carakan's support for using x87 math is also 32-bit only (even when enabled.) In 64-bit builds, use of SSE2 is unconditional.

JIT can be disabled. Open opera:config and search for "JIT".

Lack of JIT on older CPUs or on PPC does save some memory, but the savings is usually not dramatic. The amount of memory used by other parts of the engine is typically significantly larger.

Witold Barylukmovax Friday, March 12, 2010 2:40:29 AM

Thanks Jens for the answers. Makes sense to me. :)

Are you thinking about ARM JIT?

AFAIK Opera Mobile is written in Java ME, so it is not using the Carakan engine, which is probably written in C++ plus platform assembly tricks, right? I'm asking because ARM is gaining momentum not only in the embedded market and in phones; some new netbooks/smartbooks have ARM chips too. Most of them are based on some Linux (or Chromium OS, which is essentially Linux, or Android, which is also Linux, but whose apps are in Java, right?)

This is an important question because most users on smartbooks will only be able to run Firefox or Chrom(ium), without Opera. That is quite disappointing given that these will be small, highly mobile devices, so Opera should support them just like Opera Mini :). Emulation of i386 is a no-go on such platforms, and emulating Java ME (it should be hard) will only bring Opera Mini, which is designed for very, very small screens, not screens on which a normal browser layout is better.

Cutting Spoonhellspork Friday, March 12, 2010 8:49:01 PM

Jens: I did some digging. Part of the issue was WoW64, which automatically converted hand-tuned x87 into preferred x64 instructions. This greatly degraded performance. The other problem was early documentation, which indicated that x87 flexibility had been crippled by CPU bugs. The current proper state of things is that 1) x87 should work fine in Long Mode if the right bits are set, and 2) SSE2 and later instructions are faster on any x64-capable chip. GCC has roughly gotten it right. Thank you for pointing me to the config, will make it easier to isolate potential crashers.

Witold: Mini is J2ME/MIDP2 spec, that's why it works on most phones. Mobile must be compiled for the platform. Opera has begun compiling Mini for certain platforms as well. As the netbook and IPTV markets continue to develop, Opera Mobile and Opera Devices will continue to find new homes on new hardware. If/when Mobile has access to both hardware-accelerated graphics AND on-device JIT, an equipped smartphone should browse about as fast as Opera 10.10 on an older PC.

/rimshot: I actually use Opera Mini on one of my PCs. It is blazingly fast and handles the large screen perfectly. It'd be nice to see Opera Devices in an Expressgate-style instant web browser.

Cutting Spoonhellspork Wednesday, March 17, 2010 5:04:12 PM

That ought to be 10.50 Final, which performs well (in the limited tests Microsoft has confidence with). Most of Opera's failures in that small set are not critical.

Though, perhaps IE9 will strike a bit closer when it hits Sputnik?

Cutting Spoonhellspork Wednesday, March 17, 2010 9:52:59 PM

http://www.codedread.com/svg-support.php

Not by any means inclusive, but Haavard brought it up. There's a thread on his blog page. Most of the tests failed by Opera are really not important, in my opinion.

Witold Barylukmovax Thursday, March 18, 2010 12:19:01 AM

Originally posted by hellspork:

Most of the tests failed by Opera are really not important, in my opinion.



I found this test http://ie.microsoft.com/testdrive/Graphics/35SVG--oids/Default.xhtml and from time to time it creates some spurious black lines on the screen. Maybe this is only a problem in the Linux 10.50 snapshot.

I know Opera has a good SVG implementation, and the IE9 test page is probably cheating by using only a few hand-picked tests, rather than being as universal and wide as other tests.

Unfortunately I still find real-world SVG in the wild on the internet that has artifacts in Opera. I don't know if these are actually badly created SVGs or Opera's fault. :/ I will try to find one and report it as a bug.

Anyway this is offtopic. Hope final release of IE9 will have acceptable support for SVG, and it will bring wider adoption of this vector language on the Web.


PS. I actually retested some tests from the SVG suite on 10.51 on Linux, and many tests fail that the table says 10.50 on Windows passes. :(

Daniel HendrycksDanielHendrycks Thursday, March 18, 2010 12:42:10 AM

Originally posted by Witold Baryluk:

Maybe this is only a problem in the Linux 10.50 snapshot.


Here on Windows too.

Originally posted by Witold Baryluk:

Hope final release of IE9 will have acceptable support for SVG


IE9 has about 30% of SVG implemented. Not 100%. I hope IE9 gets WebGL and Theora.

Cutting Spoonhellspork Thursday, March 18, 2010 2:50:51 AM

Some SVG is written for a specific browser or plugin; even if it violates the spec, they just "fix it until it works"... in that one program.

Some SVG only works with specific vector drawing programs or printing methods, and does not even belong to the true spec.

There are also several types of SVG. Just put "w SVG" [Enter]

Most of the Test Drive pages work great in 10.51 RC1 Windows, and most of the 10.5x fixes are universal core bugs.

Robert BallEbola_Influenza Tuesday, March 23, 2010 6:16:58 PM

wow, i just read this page and this: http://www.opera.com/docs/changelogs/mac/1050b1 .

you guys have been BUSY! it all sounds excellent...good job.

Johnhandsometechnews Friday, April 23, 2010 6:37:48 AM

Great article and excellent job!

Keldian.-Keldian Thursday, May 13, 2010 3:27:42 PM

Great work, indeed. It's really enjoyable to see how Opera is currently fighting for first place with Chrome on those synthetic JavaScript tests, and leaving the rest of the competition far behind. yes

I've even noticed how these days people are really starting to take Opera seriously into account, even old-seasoned Firefox users. Success to all of you!

James A Hamilton, Jr.BiggestJim Wednesday, May 19, 2010 7:34:10 PM

I am so impressed with Opera that I am giddy with excitement. I just made it my default browser 2 days ago and I am not going back! Internet Explorer is dead to me. I had upgraded to IE8 from IE6 after installing SP3 for WinXP - fully expecting IE8 to be a huge improvement - and was sorely disappointed. It was slow and cumbersome and remained so even after I applied the recommended tweaks to improve speed. There was no memory cleanup so it would eventually just crash. It would hang at certain sites while it loaded scripts. It would remain sluggish at certain sites. It would take forever to load pages. I tried Opera on these same sites and was instantly surprised. None of the problems I was having with IE. And now I know why. Impressive! And I must say that I wasn't too surprised to discover that Opera also does some site patching. What else could explain all these sites suddenly functioning properly or at least better.

Opera has clearly been carefully thought out down to surprising detail (I'm discovering) Congratulations and keep up the great work!

Thanks for the great browser - I am here to stay. yes

Pavel Stverakpavst Sunday, May 23, 2010 2:08:04 PM

Hmm, it appears I'm one of the few who found upgrading to 10.53 from 10.10 disappointing.

I have really bad results rendering JS and JA. Spending some time on the Opera site I learned that the rendering and JavaScript engines are new, but it appears there are things that still need to be heavily improved. Spending several hours changing plugins, uninstalling and reinstalling had two simple results.

1. too much coffee in my veins
2. returning to 10.10

I cross my fingers to get things sorted out soon!

Cutting Spoonhellspork Sunday, May 23, 2010 11:01:25 PM

Does your CPU support SSE2? If not, the performance gains will be less. If your performance went down, try a clean install beside your old version.

Pavel Stverakpavst Monday, May 24, 2010 11:10:19 AM

Intel Core 2 Duo - so yes, it does. The issue shows up only with Opera 10.5x. Opera 10.10 runs much faster. Firefox/Safari are OK, too. Mac OS X 10.5.8. A complete uninstall/install was part of the experiments I did ... :(
Checking the discussions, it seems I'm not the only one ...

Cutting Spoonhellspork Monday, May 24, 2010 3:37:52 PM

Then I guess there are still a few strange bugs on the mac side.

Pavel Stverakpavst Wednesday, May 26, 2010 12:24:42 PM

hmm, I'm just curious, why other browsers are ok ... I used to meet the opposite: Opera being ok, where the others failed ...
