Carakan Revisited
By Jens Lindström (JensL). Tuesday, December 22, 2009 7:50:53 AM
A little more than a year has passed since we launched the Carakan project, aimed at drastically improving Opera's ECMAScript execution performance, and it's finally time for the first labs release of Opera with the Carakan ECMAScript engine.
What we set out to implement over a year ago was, as I described in a previous post about the Carakan project, a new cross-platform bytecode interpreter for a new register-based instruction set, a new internal object model with automatic classification and inline property caching, and machine code generation. All this we've done, and then some.
The new bytecode interpreter and new object model are cross-platform, meaning they will work on any hardware platform Opera is ported to. On their own, they already give a significant performance boost compared to Futhark, the engine used in all current versions of Opera. Running on a regular desktop computer, Carakan's bytecode interpreter is around 3.5 times faster than Futhark on the SunSpider benchmark, and on embedded systems with less powerful CPUs, early testing shows an even greater advantage over Futhark.
For optimum performance, however, machine code generation, or JIT, is the way to go, and this is where we have focused most of our optimization work. Carakan is equipped with a hot-spot detecting JIT compiler that generates machine code that performs all but the most complex operations directly, without calling the bytecode interpreter. It employs a combination of compile-time static analysis of the program and runtime profiling in the bytecode engine to optimize the generated code, focusing in particular on code that does arithmetic calculations. It also performs function inlining, both of simple built-in functions such as Math.sqrt() and String.charCodeAt() and of functions implemented in the script. Currently the JIT compiler only supports generating 32-bit or 64-bit x86 machine code, but support for other architectures will be added in time, starting with the ARM architecture.
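To make the hot-spot idea concrete, here is a toy sketch in Python. It is not Carakan's code: the threshold value, the `Function` class, and the `interpret` and `compile_to_native` helpers are all hypothetical, and the "machine code" is just a Python callable standing in for generated native code. It only shows the control flow: everything starts out interpreted, and a function is compiled once it has been called often enough.

```python
from typing import Callable, List, Optional

# Hypothetical threshold; the real engine's notion of "hot" is not specified here.
HOT_THRESHOLD = 3


def interpret(body: Callable[[int], int], arg: int) -> int:
    """Stand-in for the bytecode interpreter: simply runs the body."""
    return body(arg)


def compile_to_native(body: Callable[[int], int]) -> Callable[[int], int]:
    """Stand-in for the JIT compiler; a real one would emit machine code."""
    return body


class Function:
    """A script function that is interpreted until it becomes hot."""

    def __init__(self, body: Callable[[int], int]) -> None:
        self.body = body
        self.call_count = 0
        self.native: Optional[Callable[[int], int]] = None
        self.log: List[str] = []  # records which path each call took

    def call(self, arg: int) -> int:
        if self.native is not None:
            self.log.append("native")
            return self.native(arg)
        # Profiling happens in the interpreter path.
        self.call_count += 1
        if self.call_count >= HOT_THRESHOLD:
            # Hot spot detected: later calls will use the compiled version.
            self.native = compile_to_native(self.body)
        self.log.append("interpreted")
        return interpret(self.body, arg)


square = Function(lambda x: x * x)
results = [square.call(n) for n in range(5)]
# results == [0, 1, 4, 9, 16]
# square.log == ["interpreted", "interpreted", "interpreted", "native", "native"]
```

In this sketch a function that is only called once or twice never pays the compilation cost, which is the point of detecting hot spots rather than compiling everything up front.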
But this is not all we have done in the Carakan project. I'd like to also mention two other interesting improvements that we've implemented compared to Futhark: a divided garbage collected heap, and caching of compiled scripts.
Divided garbage collected heap
The ECMAScript language assumes the presence of a garbage collector that automatically reclaims memory occupied by objects that are no longer needed. Carakan's garbage collector is very similar to the one used in Futhark, a basic mark-and-sweep design; we've only done some smaller, but rather effective, tweaking of its performance. We have, however, drastically changed how we use the garbage collector. In Futhark, all memory allocated by the ECMAScript engine for scripts running in any tab was allocated from a single shared heap, and any time the garbage collector needed to run to free up memory, it would traverse all allocated memory. The more open tabs there were, the more expensive a garbage collection became.
In Carakan, we instead use many smaller heaps. Each document loaded in a tab, or in a frame or iframe inside another document, gets its own. Since scripts running in different documents can sometimes access each other's objects, we have support for merging two heaps into one, and for detecting when this is necessary. The advantage of this design is clear: with smaller heaps, each garbage collection is cheaper. And since we only need to run the garbage collector on heaps from which memory has been allocated, we automatically only traverse the memory of active heaps, and leave all other heaps alone. The end result is that it doesn't matter if there are 1 or 100 open tabs; when a script triggers a garbage collection, the cost is the same.
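One way to picture the bookkeeping is the small Python sketch below. It is an illustration under assumptions, not a description of Carakan's internals: the `Heap` class, the `merge` and `heaps_needing_gc` helpers, and the use of a union-find structure to track merged heaps are all invented for this example.

```python
from typing import List


class Heap:
    """A per-document heap. Merged heaps share a single root heap."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: "Heap" = self      # union-find parent pointer
        self.objects: List[object] = []
        self.allocated_since_gc = 0

    def find(self) -> "Heap":
        """Return the root heap this heap has been merged into."""
        node = self
        while node.parent is not node:
            node.parent = node.parent.parent  # path halving
            node = node.parent
        return node

    def allocate(self, obj: object) -> None:
        root = self.find()
        root.objects.append(obj)
        root.allocated_since_gc += 1


def merge(a: Heap, b: Heap) -> Heap:
    """Merge two heaps, e.g. when a script in one document reaches
    an object that belongs to another document."""
    ra, rb = a.find(), b.find()
    if ra is rb:
        return ra
    ra.objects.extend(rb.objects)
    ra.allocated_since_gc += rb.allocated_since_gc
    rb.objects = []
    rb.allocated_since_gc = 0
    rb.parent = ra
    return ra


def heaps_needing_gc(heaps: List[Heap]) -> List[Heap]:
    """Only root heaps that have allocated memory are worth collecting."""
    roots = {id(h.find()): h.find() for h in heaps}
    return [r for r in roots.values() if r.allocated_since_gc > 0]


tab1, tab2, iframe = Heap("tab1"), Heap("tab2"), Heap("iframe")
tab1.allocate("a")
iframe.allocate("b")
merge(tab1, iframe)  # the iframe's script touched an object owned by tab1
active = [h.name for h in heaps_needing_gc([tab1, tab2, iframe])]
# active == ["tab1"]; the idle tab2 heap is never traversed
```

The cost of a collection in this model depends only on the heaps that actually allocated, no matter how many idle heaps exist alongside them.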
Cached compiled programs
An aspect of an ECMAScript engine that performance benchmarks often don't measure is the performance of the compiler. Compared to Futhark, the Carakan compiler is much more focused on analysing the program and generating code that will execute fast, and may therefore be slightly slower in some cases. This is a trade-off we've made willingly.
Instead of the very efficient compiler in Futhark, Carakan brings caching of compiled programs. In practice this means that whenever a script is about to be compiled and its source code is identical to that of some other program that was recently compiled, we reuse the previous output from the compiler and skip the compilation step entirely. This cache is quite effective in typical browsing scenarios where one loads page after page from the same site, such as different news articles from a news service, since each page often loads the same, sometimes very large, script library.
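The caching idea can be sketched in a few lines of Python. This is a simplified model under assumptions, not the engine's implementation: the `CompileCache` class, its `capacity` parameter, and the least-recently-used eviction policy are hypothetical choices made for the example, and Python's built-in `compile()` stands in for the script compiler.

```python
from collections import OrderedDict
from typing import Any, Callable


class CompileCache:
    """Reuses compiler output for scripts with identical source text."""

    def __init__(self, compile_fn: Callable[[str], Any], capacity: int = 64) -> None:
        self.compile_fn = compile_fn
        self.capacity = capacity  # hypothetical limit on "recently compiled"
        self.entries: "OrderedDict[str, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, source: str) -> Any:
        # The full source text is the key, so only identical scripts match.
        if source in self.entries:
            self.entries.move_to_end(source)  # mark as recently used
            self.hits += 1
            return self.entries[source]
        self.misses += 1
        compiled = self.compile_fn(source)
        self.entries[source] = compiled
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return compiled


cache = CompileCache(lambda src: compile(src, "<script>", "exec"))
library = "def helper(x):\n    return x * 2\n"
first = cache.get(library)   # compiled on the first page load
second = cache.get(library)  # reused on the next page from the same site
# first is second, cache.hits == 1, cache.misses == 1
```

Because the second page load skips compilation entirely, a somewhat slower but more thorough compiler costs little in the browsing pattern the post describes.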
Plans for the future
Although we're nearing the release of the Carakan engine, we don't plan to stop development of it. We have plenty of ideas on smaller and larger improvements to make, and we will also port the JIT compiler to other CPU architectures.
One area where we believe we can improve greatly is memory usage, by switching to a much more efficient object representation. Carakan already uses less memory than Futhark in some cases, by sharing information between similar objects via the automatic object classification system and by sharing literal data using a copy-on-write scheme, but we have plans that would reduce the size of ECMAScript objects to as little as a tenth of their current size.
We will also be looking at improving the performance of machine code generated for non-arithmetic code such as property accesses, where our JIT compiler currently produces significantly less streamlined code than it does for arithmetic calculations.

Royi (Drazick) # Tuesday, December 22, 2009 10:16:21 AM
How does it compare to V8 and Nitro?
Performance is the main feature I choose a browser by.
Carakan might bring me back to Opera.
Purdi # Tuesday, December 22, 2009 10:20:40 AM
Originally posted by Drazick:
Carakan is the fastest engine right now.
bachokocho # Tuesday, December 22, 2009 10:32:29 AM
Originally posted by Purdi:
Purdi, would you post some data? My observations show that it has the qualities but needs polishing.
Purdi # Tuesday, December 22, 2009 11:07:49 AM
Originally posted by bachokocho:
No, it's definitely faster. Try Sunspider. Ignore the V8 benchmark as it's crap, by the way. Google loves cheating.
Pallab De (Indyan) # Tuesday, December 22, 2009 1:51:34 PM
Originally posted by bachokocho:
You can find some data here. Opera is now indeed the fastest browser. It feels really nice, since Opera was always known for its speed.
Teo (umbra-tenebris) # Tuesday, December 22, 2009 3:12:20 PM
I've always liked Opera for the comfort it gives me with its thousands of little helping functions but I would not turn down a faster js engine. Keep up the good work, we expect nothing less than technical marvels from you, guys and gals!
Bruno Casano (brunitoc) # Tuesday, December 22, 2009 4:12:37 PM
Happy holidays !!!
Pallab De (Indyan) # Tuesday, December 22, 2009 6:28:16 PM
Originally posted by umbra-tenebris:
+1
Cutting Spoon (hellspork) # Tuesday, December 22, 2009 6:54:55 PM
I am very grateful for this candid disclosure of new features and limitations; looks like Carakan has a great deal of potential. If Opera can deliver on CPU savings and reduced memory requirements with JIT, then Opera Mobile with JIT will be extraordinarily capable. 3x-5x CPU efficiency and up to 10x memory efficiency would allow ANY modern JS/Gears-based site to run on a smartphone.
Jim (toyotabedzrock) # Thursday, December 24, 2009 3:19:20 AM
DailyTech.com has a nice article on the recent betas
Witold Baryluk (movax) # Friday, January 1, 2010 6:59:15 AM
Good work, guys. It is one of the most amazing pieces of engineering in Opera. Only LuaJIT amazed me more.
Rafal (d.i.z.) # Friday, January 1, 2010 4:23:42 PM
I'm not sure that's a correct analysis of what's happening. As far as I understand, Carakan tries the JIT first and eventually falls back to the interpreter, not the other way around.
Jens Lindström (JensL) # Sunday, January 3, 2010 8:24:18 AM
Originally posted by d.i.z.:
No, all code is executed by the interpreter initially. Only code that we notice is executed "a lot" gets JIT:ed. This avoids JIT:ing entire large JS libraries linked in by pages if only small parts of them are used much.
But then even JIT:ed code falls back to the interpreter in complex cases. As cases get more complex, we'd need to generate more and more JIT code to handle them while typically gaining less and less performance by doing so.
Luis Miguel González (luismgz) # Thursday, February 4, 2010 2:22:51 AM
Originally posted by JensL:
Jens, I see that Carakan, as well as Nitro and Tracemonkey, interprets code until it gets hot before generating native code.
However, V8 generates native code upfront, with no intermediate bytecode or interpretation. This is where my head explodes...
How can V8 generate native code without interpreting (and gathering type information) first?
I suppose it performs generic operations first and then patches the code while gathering type feedback, but can it be done without an interpreter?
Sorry to ask about this, but I'd like to know the different approaches taken by these engines to know them better.
Евгений (EvgenyIst) # Sunday, March 7, 2010 2:25:04 PM
Your JavaScript engine is now the fastest. Congratulations! But it is fast only on fast configurations.
How does it fare on less powerful computers (and netbooks)? How well is it optimized for them? Things are not as good there as one would like.
Look at how Chrome (5.0 beta) performs on weak computers. The V8 JavaScript engine is much better optimized for them.
Look here:
http://clients.futuremark.com/peacekeeper/results.action?key=35wu
http://dromaeo.com/?id=95846 - Chrome 5.0.342.2
http://dromaeo.com/?id=95808 - Opera 10.50
Configuration: Athlon 2500+ (Barton), 1 GB memory, OS Win7 Ultimate
Cutting Spoon (hellspork) # Monday, March 8, 2010 5:30:18 PM
(PS: Holy crap, restoring a tab remembered the PDF scroll position!)
The problem is that 10.50 requires SSE2 for best performance of the new JIT engine in Carakan. The AthlonXP does not support SSE2, only SSE. As you can see in your own results, only the very newest Chrome 5.x candidates have begun to address this issue in Chrome. Given Opera has supported PPC chips with Carakan, support may eventually come to pre-SSE2 x86 CPUs also.
Looking at the extended graph in Peacekeeper, I think Opera should run about as fast as Chrome in normal browsing on your PC. The rendering and complex graphics numbers are important.
The below links are detailed articles on the leap from AthlonXP to Athlon64; even if Carakan supports plain SSE, your installation of Opera may "only" be about as fast as Chrome due to limitations in the AthlonXP's SSE throughput.
http://www.chip-architect.com/news/2001_10_02_Hammer_microarchitecture.html
http://www.chip-architect.com/news/2002_06_24_Hammers_Two_Extra_PipelineStages.html
EDIT: Even netbooks with the Intel Atom or AMD Neo support modern SSE2 instructions, and Opera 10.50 is much faster than Chrome on most of these machines. Even netbooks have improvements that allow them to be faster than your AthlonXP 2500+.
Witold Baryluk (movax) # Wednesday, March 10, 2010 5:06:07 AM
And are you using SSE2's ability to perform floating-point operations on two independent double-precision values in parallel (vectorization)?
If not, isn't there a loss of memory and memory bandwidth involved?
I don't know if there has been any research on vectorizing JITs, especially for dynamic languages, but it could be very, very interesting. (For single-precision math it can give an even bigger speed improvement for numerically intensive code.)
I understand that SSE was implemented first because it is the only FPU on amd64 (and compatible) processors, right?
So will there be SSE1 support, or better, just 387 support?
Also, as we now have PPC support, I have another question: are you using the normal FPU or AltiVec?
Jens Lindström (JensL) # Wednesday, March 10, 2010 11:50:04 AM
The x87 FPU is stack based, and not quite as convenient to work with (that is, generate code for.) We have support for using the x87 FPU instead, but that support was not prioritized and not tested or bug-fixed enough before the release, so it is currently not usable in practice.
On PPC we don't support JIT at all at the moment.
Cutting Spoon (hellspork) # Wednesday, March 10, 2010 5:02:53 PM
Aren't x87 instructions disabled in Long Mode? I seem to recall hearing that x86-64 removed several process priority rings, and dropped x87 in favor of SSE/SSE2. It initially broke the Xen hypervisor in Long Mode, and hand-tuned x87 assembly language ran up to seven times slower in Long Mode.
EDIT: Two questions, first is whether JIT can be explicitly enabled or disabled in some config document. Second is whether lack of JIT on old CPUs produces a significant memory savings.
Jens Lindström (JensL) # Thursday, March 11, 2010 9:02:27 AM
JIT can be disabled. Open opera:config and search for "JIT".
Lack of JIT on older CPUs or on PPC does save some memory, but the savings are usually not dramatic. The amount of memory used by other parts of the engine is typically significantly larger.
Witold Baryluk (movax) # Friday, March 12, 2010 2:40:29 AM
Are you thinking about an ARM JIT?
AFAIK Opera Mobile is written in Java ME, so it is not using the Carakan engine, which is probably written in C++ plus platform assembly tricks, right? I'm asking because ARM is gaining momentum not only in the embedded market and phones; some new netbooks/smartbooks have it too. Most of them are based on some Linux (or Chromium OS, which is essentially Linux, or Android, which is also Linux, but apps are in Java, right?)
This is an important question because most users on smartbooks will only be able to run Firefox or Chrom(ium), without Opera. That is quite unfortunate given that these will be highly mobile and small devices, so Opera should support them, just like Opera Mini.
Cutting Spoon (hellspork) # Friday, March 12, 2010 8:49:01 PM
Witold: Mini is J2ME/MIDP2 spec, that's why it works on most phones. Mobile must be compiled for the platform. Opera has begun compiling Mini for certain platforms as well. As the netbook and IPTV markets continue to develop, Opera Mobile and Opera Devices will continue to find new homes on new hardware. If/when Mobile has access to both hardware-accelerated graphics AND on-device JIT, an equipped smartphone should browse about as fast as Opera 10.10 on an older PC.
/rimshot: I actually use Opera Mini on one of my PCs. It is blazingly fast and handles the large screen perfectly. It'd be nice to see Opera Devices in an Expressgate-style instant web browser.
Cutting Spoon (hellspork) # Wednesday, March 17, 2010 5:04:12 PM
Though, perhaps IE9 will strike a bit closer when it hits Sputnik?
Cutting Spoon (hellspork) # Wednesday, March 17, 2010 9:52:59 PM
Not by any means inclusive, but Haavard brought it up. There's a thread on his blog page. Most of the tests failed by Opera are really not important, in my opinion.
Witold Baryluk (movax) # Thursday, March 18, 2010 12:19:01 AM
Originally posted by hellspork:
I found this test http://ie.microsoft.com/testdrive/Graphics/35SVG--oids/Default.xhtml and from time to time it creates some spurious black lines on the screen. Maybe this is only a problem on the Linux 10.50 snapshot.
I know Opera has a good SVG implementation, and the IE9 test page is probably cheating by using only hand-picked tests rather than being as universal and wide as other tests.
Unfortunately, I still find real-world SVG in the wild that has artifacts in Opera. I don't know if these are actually badly created SVGs or Opera's fault. :/ I will try to find one and report it as a bug.
Anyway, this is off-topic. I hope the final release of IE9 will have acceptable support for SVG, and that it will bring wider adoption of this vector language on the Web.
PS. I actually retested some tests from the SVG Suite on 10.51 on Linux, and many tests fail where the table says that 10.50 on Windows passes.
Daniel Hendrycks (DanielHendrycks) # Thursday, March 18, 2010 12:42:10 AM
Originally posted by Witold Baryluk:
Here on Windows too.
Originally posted by Witold Baryluk:
IE9 has about 30% of SVG implemented. Not 100%. I hope IE9 gets WebGL and Theora.
Cutting Spoon (hellspork) # Thursday, March 18, 2010 2:50:51 AM
Some SVG only works with specific vector drawing programs or printing methods, and does not even belong to the true spec.
There are also several types of SVG. Just put "w SVG" [Enter]
Most of the Test Drive pages work great in 10.51 RC1 Windows, and most of the 10.5x fixes are universal core bugs.
Robert Ball (Ebola_Influenza) # Tuesday, March 23, 2010 6:16:58 PM
you guys have been BUSY! it all sounds excellent...good job.
Keldian.-Keldian # Thursday, May 13, 2010 3:27:42 PM
I've even noticed how these days people are really starting to take Opera seriously into account, even old-seasoned Firefox users. Success to all of you!
James A Hamilton, Jr. (BiggestJim) # Wednesday, May 19, 2010 7:34:10 PM
Opera has clearly been carefully thought out down to surprising detail (I'm discovering) Congratulations and keep up the great work!
Thanks for the great browser - I am here to stay.
Pavel Stverak (pavst) # Sunday, May 23, 2010 2:08:04 PM
I have really bad results rendering JS and JA. Spending some time on the Opera site, I learned that the rendering and JavaScript engines are new, but it appears there are things still to be heavily improved. Spending several hours changing plugins, uninstalling, and reinstalling had two simple results.
1. too much coffee in my veins
2. returning to 10.10
I cross my fingers to get things sorted out soon!
Pavel Stverak (pavst) # Monday, May 24, 2010 11:10:19 AM
Checking the discussions, it seems I'm not the only one ...