# Erik's blog

## Alpha blending

This time I thought I'd talk a little bit about alpha blending and premultiplied alpha. The whole post is really going to boil down to one thing: premultiplied alpha is good, but it's not commutative! But to get there, let's cover some of the background (hee-hee).

Integral alpha was invented back in the 1970s by Ed Catmull and Alvy Ray Smith while at New York Tech, but the paper most often referenced on the subject is the 1984 paper by Thomas Porter and Tom Duff (also known for Duff's device). In it they describe, in addition to the 3-component (red, green, blue) pixel color, the use of a fourth component to convey coverage information. It's referred to as the matte component, or alpha as it's commonly called today. The most illustrative name would really be opacity: 0 means fully transparent, 1 means fully opaque, and anything in between is some level of semi-transparency.

The standard blending operation is called "over". In OpenGL it's achieved by glBlendEquation(GL_FUNC_ADD) and glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) which equates to the following formula for calculating the result of "source" being drawn over "destination":

```
result.rgba = source.alpha * source.rgba + (1 - source.alpha) * destination.rgba
```

[In OpenGL you need to first call glEnable(GL_BLEND) as it defaults to off. GL_FUNC_ADD is also the default so you can omit the call to glBlendEquation.]

By rendering the semi-transparent objects back to front you get the expected result. So, for example, drawing semi-transparent red (1,0,0,0.5) on top of opaque white (1,1,1,1) you calculate:

```
r = 1 * 0.5 + (1 - 0.5) * 1 = 1
g = 0 * 0.5 + (1 - 0.5) * 1 = 0.5
b = 0 * 0.5 + (1 - 0.5) * 1 = 0.5
a = 0.5 * 0.5 + (1 - 0.5) * 1 = 0.75
```
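The arithmetic above is easy to sanity check in a few lines of plain JavaScript (just the formula, no GL context needed; the function name is mine):

```javascript
// Straight-alpha "over": source.alpha * source + (1 - source.alpha) * destination,
// applied to all four channels, exactly as GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA does.
function overStraight(src, dst) {
  return src.map((s, i) => src[3] * s + (1 - src[3]) * dst[i]);
}

// Semi-transparent red over opaque white:
console.log(overStraight([1, 0, 0, 0.5], [1, 1, 1, 1])); // [1, 0.5, 0.5, 0.75]
```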

The interesting bit here is that the resulting alpha goes from fully opaque to semi-transparent. This physical phenomenon is called Akin-Romney-Refraction and can be seen in this image. The white table serves as the opaque surface and by holding up the plastic sheet in front of it light gets refracted and you can see what's behind the table.

Ok, it doesn't take a genius to figure out that's a load of crock. In fact, GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA doesn't correctly calculate the resulting alpha. It may sound a bit strange and you'd be right in asking yourself "Why is it used then?". Well, when rendering to the frame buffer it doesn't matter much, as displays don't use the resulting alpha (anyone remember the Game & Watch with the see-through screen?). It does matter if you're rendering to an FBO and expect to blend it with something else (think impostors etc.), or if the framebuffer will be composited with something else, like in WebGL where the canvas is composited with the rest of the web page.

Here's a sample in WebGL showing how destination alpha gets screwed up. A WebGL canvas has destination alpha by default, so when we first fill it with opaque white and then blend a semi-transparent red quad on top of it, we can see the black text behind the canvas shine through even though it should be fully obscured. Not what you'd expect in the real world.

### The solution

Luckily though, Porter and Duff led the way to something better and introduced premultiplied alpha. Take a look at the formula again:

```
result.rgba = source.alpha * source.rgba + (1 - source.alpha) * destination.rgba
```

You can see that the source components are always multiplied by the source alpha. That means the multiplication can be done in a preprocessing step, so the image data is stored premultiplied. That changes the blending formula to:

```
result.rgba = 1 * source.rgba + (1 - source.alpha) * destination.rgba
```

So in OpenGL speak we need to use GL_ONE, GL_ONE_MINUS_SRC_ALPHA with premultiplied alpha.

If we assume we have premultiplied input data there's something important to note about it. If each component goes from 0 to 1 and the color components have been multiplied by the alpha component (also in the 0-1 range), then it follows that the red, green and blue components must be less than or equal to the alpha component. So while (1,0,0,0.5) represents a 50% transparent full red in straight alpha, its premultiplied counterpart would be (0.5,0,0,0.5).
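The conversion itself is just one multiply per color channel; a tiny sketch (the function name is my own):

```javascript
// Straight alpha -> premultiplied: scale the color channels by alpha,
// leave alpha itself untouched.
function premultiply([r, g, b, a]) {
  return [r * a, g * a, b * a, a];
}

console.log(premultiply([1, 0, 0, 0.5])); // [0.5, 0, 0, 0.5]
```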

If we're dealing with 32-bit pixels then each component ranging from 0 to 1 is represented by a byte, i.e. 256 distinct values from 0 to 255. So if full-bright red at 50% opacity is represented by 128 in premultiplied form, then there are only 128 possible levels of red available. So at 50% opacity colors are less accurately described than at 100%. In fact, the closer we get to fully transparent, or 0 alpha, the less accuracy we get in the color. When alpha is ~0.4% (byte value 1) the only reds we can describe are no red at all or full red. In practice it doesn't matter much though, as the multiplication would have to be performed as soon as the blending is calculated anyway.
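A small sketch of that precision loss, quantizing the premultiplied red channel to a byte and converting back (plain JavaScript; the helper is mine, not from any library):

```javascript
// Quantize a premultiplied red channel to a byte, then recover the
// straight-alpha red. At low alpha very few distinct reds survive.
function roundTripRed(red, alpha) {
  const alphaByte = Math.round(alpha * 255);
  const redByte = Math.round(red * alpha * 255); // premultiplied byte
  return alphaByte === 0 ? 0 : redByte / alphaByte;
}

console.log(roundTripRed(0.7, 128 / 255)); // 0.703125 -- close to 0.7
console.log(roundTripRed(0.7, 1 / 255));   // 1 -- snapped to full red
```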

So, let's look at our example in premultiplied form. Drawing semi-transparent red (0.5,0,0,0.5) on top of opaque white (1,1,1,1) we calculate:

```
r = 1 * 0.5 + (1 - 0.5) * 1 = 1
g = 1 * 0 + (1 - 0.5) * 1 = 0.5
b = 1 * 0 + (1 - 0.5) * 1 = 0.5
a = 1 * 0.5 + (1 - 0.5) * 1 = 1
```

That's more like it. That's an OPAQUE light red color. Go premultiplied!
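Spelled out as code, the premultiplied (GL_ONE, GL_ONE_MINUS_SRC_ALPHA) blend is just (plain JavaScript sketch):

```javascript
// Premultiplied "over": result = source + (1 - source.alpha) * destination,
// applied to all four channels.
function overPremultiplied(src, dst) {
  return src.map((s, i) => s + (1 - src[3]) * dst[i]);
}

// Premultiplied semi-transparent red over opaque white:
console.log(overPremultiplied([0.5, 0, 0, 0.5], [1, 1, 1, 1]));
// [1, 0.5, 0.5, 1] -- opaque light red
```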

### Filtering

Let's look at another issue with alpha blending: alpha blending in combination with texture filtering. Assume we're looking at the very edge of something in a cut-out image, perhaps the left edge of the O in the Opera logo. On the right side of the edge there's opaque red (1,0,0,1); on the left side there's fully transparent (0,0,0,0). Now, just for the sake of illustration, instead of using an Opera O we'll use just the problematic piece of the image: a 2x1 image containing a fully transparent pixel and an opaque red pixel to the right of it. If we render that pixel-aligned everything is just fine, we'll have a crisp and sharp edge. But what if we render it off pixel center, or, even better for illustrative purposes, blow it up to render as 256x100 pixels? The graphics library then needs to apply texture filtering to the image.

You can see how there seems to be a blackness to the gradient. This is because when filtering is applied to generate smooth alpha values around the edge, it also interpolates the rgb values, which go from (0,0,0) to (1,0,0). Remember, in the straight alpha model fully colored red is always represented with 1 in the red channel, regardless of its opacity.

On the other hand, in the premultiplied alpha model full-bright red will be represented by the same value in the red component as in the alpha component. Hence there will be no filtering errors when using premultiplied alpha. The WebGL sample shows the difference clearly side by side. You might think "Sure, it's because he picked transparent to be (0,0,0,0). If he had picked (1,0,0,0) as transparent then it would've looked good." That's certainly true in this particular case, but for a real-world image with several different colors there's just no way to make that work.
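To see it in numbers, here's a hand-rolled sketch of what a bilinear filter produces at the midpoint of that edge, and what each model makes of the result when composited over white (plain JavaScript, nothing GPU-specific):

```javascript
// Linear interpolation per channel, which is what texture filtering does.
const lerp = (a, b, t) => a.map((v, i) => v + (b[i] - v) * t);

// Halfway between transparent black and opaque red the filtered texel is
// (0.5, 0, 0, 0.5) in both models, but it means different things.
const mid = lerp([0, 0, 0, 0], [1, 0, 0, 1], 0.5);

// Straight alpha reads it as 50% opaque HALF-BRIGHT red; over white:
console.log(mid[3] * mid[0] + (1 - mid[3]) * 1); // 0.75 -- darkened

// Premultiplied reads it as 50% opaque FULL red; over white:
console.log(mid[0] + (1 - mid[3]) * 1); // 1 -- no darkening
```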

### Associative != Commutative

Premultiplied alpha blending is associative, meaning A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C.

This isn't a full analytical proof, see Wikipedia for that. But it's easy to verify that it is indeed associative by expanding the formulas below and checking that they actually amount to the same expression.

```
A ⊗ (B ⊗ C)
A.rgba + (1 - A.a) * (B.rgba + (1 - B.a) * C.rgba) ⇔
A.rgba + (B.rgba + C.rgba - B.a * C.rgba) - A.a * (B.rgba + C.rgba -
B.a * C.rgba) ⇔
A.rgba + B.rgba + C.rgba - B.a * C.rgba -
A.a * B.rgba - A.a * C.rgba + A.a * B.a * C.rgba
```
```
(A ⊗ B) ⊗ C
(A.rgba + (1 - A.a) * B.rgba) + (1 - (A.a + (1 - A.a) * B.a)) * C.rgba ⇔
A.rgba + B.rgba - A.a * B.rgba + (1 - (A.a + B.a - A.a * B.a)) * C.rgba ⇔
A.rgba + B.rgba - A.a * B.rgba + (C.rgba - (A.a * C.rgba + B.a * C.rgba -
A.a * B.a * C.rgba)) ⇔
A.rgba + B.rgba + C.rgba - B.a * C.rgba -
A.a * B.rgba - A.a * C.rgba + A.a * B.a * C.rgba
```

An important point to make, though, is that associativity is not the same as commutativity. If the premultiplied blending operator were commutative you could draw semi-transparent elements in any order and still get the right result, but that is not the case! The simplest possible proof (since we know premultiplied alpha behaves like the real world, with proper resulting alpha) is that if it were true, drawing an opaque black color over an opaque white color would give the same result as drawing opaque white over opaque black!

```
A ⊗ B ⊗ C ≠ C ⊗ B ⊗ A
```

Just to give another visual example, check out this WebGL sample showing a series of blends. The top row draws red, green and yellow at 50% opacity on top of opaque black. In the middle row the order of the red, green and yellow is reversed, and you can see how it clearly doesn't produce the same result. The bottom section shows how green and yellow are first blended in a separate buffer before being blended on top of the black and red. So, associativity works, commutativity doesn't.
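The opaque black/white counterexample takes only a few lines to verify (plain JavaScript sketch of the premultiplied operator):

```javascript
// Premultiplied "over": src + (1 - src.alpha) * dst, per channel.
const over = (src, dst) => src.map((s, i) => s + (1 - src[3]) * dst[i]);

const black = [0, 0, 0, 1];
const white = [1, 1, 1, 1];

console.log(over(black, white)); // [0, 0, 0, 1] -- black wins
console.log(over(white, black)); // [1, 1, 1, 1] -- white wins
```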

### Premultiplied and color > alpha

If you're a curious mind you might already have wondered what happens if you specify a color where the color components are greater than the alpha component. You get something very commonly used for particle effects, referred to as additive alpha. Check out this WebGL demo which draws 100 random squares with the color (0.15, 0.10, 0.07, 0.05) on top of each other for a sort of fiery effect. Don't do particle systems like this though!! It's just for illustrative purposes.

So, that was all I had for this time, a quick survey of alpha blending. A shout out goes to @tim_johansson for reading this post and giving me some comments on the readability. Until next time!

## Fostering a programmer part two

A while back I posted a piece called Fostering a programmer where I made a first stab at introducing my 7-year-old daughter to the wonderful world of programming. I think programming is a really valuable piece of knowledge to have in modern society, and having it as a girl makes you even more unique.

The C64 approach, amusing as it was for me, didn't really get her hooked. No surprise there I guess. The C64 just doesn't have the appeal to her that it had to me. It was the only piece of interesting technology I had available back in the early 80's. My daughter has a Chromebook, an iPhone, an iPad and a Wii... making ASCII games isn't going to blow her away anytime soon, so I've been searching for other ideas.

Raspberry Pi, LEGO Mindstorms, Meccano and board games are some of the alternatives, but just the other day I stumbled over the blog of a brilliant guy calling himself DrTechnico. He had this idea of letting kids draw small programs made up of instructions like "move left foot forward" and "turn left" that would control their parents. The goal would be to navigate some obstacles and pick up an object.

It's a truly brilliant idea, teaching your kids basic programming! Plus, it has the appeal no fancy touch-tablet can compete with. You get to play with your dad! Or.. you even get to make your dad do silly things like walk into a wall! This would be a sure thing! I was so excited about the idea of trying that on my kids that I couldn't sleep. I had to get out of bed, go downstairs and prepare for a Sunday of "Daddy-robot". I obviously needed to make an instruction sheet in Swedish so I thought it best to draw my own robot and instructions using svg and html (of course). Check out my "Daddy-robot" image, as well as the full Instruction sheet in Swedish.

The whole exercise was a complete and utter success. We both had a blast! Annika was jotting down instructions, testing the program on the daddy-bot, giggling as the daddy-bot took a wrong turn and kicked down one of the garden chairs. In the image you can see the daddy-bot has executed about 10-15 instructions and has successfully navigated past the garden chairs. It's about to make a right turn and head towards the sand-box.

Notice the clever but inadvertent product placement in this shot of Annika writing up a new program for the daddy-bot.

Here's a picture of Annika being very pleased with herself after having completed a 25-instruction program that took the daddy-bot past the garden chairs, around the sand box and right up to the plastic shovel, successfully completing the task by executing the last "pick up" instruction.

You might notice she's wearing a onepiece. Let me just take this opportunity to congratulate Norway on another awesome piece of export. Raping, axe-wielding sailors, then fish, then oil and now this... a jumpsuit with a zipper that goes all the way up so you can't even see where you're walking. And apparently every kid needs one. Congratz Norway!

Anyway, getting back to the subject. We both had a blast, she started making up new instructions like jump and say, and in fact the day ended with me having to write a program that made my daughter-bot walk from the starting position in the living room out into the hallway, walk backwards into the kitchen, head into the bathroom, turn towards the sink and execute the "brush teeth" operation... that's what I call a successful game. Try it on your kids! I guarantee you'll enjoy it yourself, and you're doing the universe a favor by making your kids smarter.

## Talking to Opera engineers - what the heck is a core-integration-point?

We have an excellent developer relations team at Opera which does a great job and often has a better overview of things than a specific engineer. That said, you should always feel free to poke us engineers directly as well. Most of us are really friendly, down-to-earth guys with a passion for our little corner of the Opera browser, happy to answer technical questions as best we can.

If you ever find yourself at a conference talking to one of us or if you're emailing with us you'll hear the term core-integration-point being mentioned quite frequently.

Just to demystify it a bit, type in opera:about in the url bar and check under the heading Browser identification where you can find our (UA) user agent string.

Since I'm running Opera Next on Windows I get the following string

```
Opera/9.80 (Windows NT 6.1; U; Edition Next; en) Presto/2.10.255 Version/12.00
```

Looking at the bit saying Presto/2.10.255 you can read out that my version of Opera uses version 2.10 of the Presto rendering engine, and that core integration points up to and including 255 are in this build.

A core integration point is, simply put, a single task that one or more engineers within core have been working on. These tasks get assigned a number (sequentially) as they get accepted and integrated onto our mainline. It can be anything from, for example, "XMLHttpRequest Level 2", which is c-i-232 and which among other things enabled CORS for XHR, to simply a package of bugfixes going into mainline.

All this is also described at length in Web specifications if you want to read more about it.

So the next time you talk to an Opera engineer, surprise us by knowing the secret handshake and say "Hey Erik, I just tried my cool WebGL app in core integration XXX and it seems like YYY is borked. You know anything about that?"

Looking forward to hearing from you...now I'm going to go check out the WebGL panel at SXSW!

## WebGL 101 video

Four score and seven days ago I set out to record a little video of some simple WebGL samples I had made. Originally I made the samples just to have a little introductory WebGL get-together at work, but when I had written them I figured "I might just as well whip a movie together for people outside of the office". It turned out to be a massive undertaking (for being a hobby project) with writing, recording and editing. I'm glad I didn't realize just how much work it would be or I probably would never have started it. I have edited movies before... minute long teasers and trailers... not feature film length videos like this one turned out to be.

## SVG is awesome, but not the tools

I've been working on a WebGL tutorial movie in my spare time and I wanted to make a slide to show some bits of the OpenGL timeline that are relevant for WebGL. I tried a few options but ended up doing it in SVG, both diagram and background. Personally I still think the tools out there are a bit awkward to use. My normal work flow is quickly hashing it out in svg-edit, then grabbing the source and hand editing it until it's the way I want it. Check out the background svg... soopah neat. That's so much nicer than slapping a jpeg in the background. 1,078 bytes uncompressed.

```<?xml version="1.0"?>

<svg viewBox="0 0 2880 1800" xmlns="http://www.w3.org/2000/svg" preserveAspectRatio="none">
<defs>
<style>
circle {
fill: black;
stroke-width: 1;
stroke: #111;
}
</style>
<pattern id="pat" patternUnits="userSpaceOnUse" viewBox="0 0 16 16" width="27" height="27">
<circle cx="8" cy="8" r="4"/>
<circle cx="16" cy="16" r="4"/>
<circle cx="0" cy="0" r="4"/>
<circle cx="0" cy="16" r="4"/>
<circle cx="16" cy="0" r="4"/>
</pattern>
<linearGradient id="grad">
<stop offset="0%" stop-color="#555"/>
<stop offset="20%" stop-color="#444"/>
<stop offset="40%" stop-color="#333"/>
<stop offset="100%" stop-color="#111"/>
</linearGradient>
</defs>
<rect fill="url(#grad)" x="0" y="0" width="2880" height="1800"/>
<rect fill="url(#pat)" x="0" y="0" width="2880" height="1800"/>
</svg>
```

The actual slide is an html with an inlined svg. Another 2,165 bytes making the whole slide land somewhere just over the 3k mark.

```<!doctype html>
<html>
<style>
html,body,svg {
width: 100%;
height: 100%;
}
svg {
display: block;
}
body {
margin: 0px;
background-image: url("background-industrial.svg");
background-size: cover;
}
circle,path {
stroke: white;
stroke-width: 5;
fill-opacity: 0;
}
text {
fill: white;
stroke-width: 0;
font-size: 80px;
font-family: tahoma;
}
</style>
<body>
<svg viewBox="0 -400 3050 200" xmlns="http://www.w3.org/2000/svg" xml:space="preserve">

<title>OpenGL timeline</title>

<text x="1000" y="-800" style="font-size:150px">OpenGL timeline</text>

<g>
<title>Developed by SGI, direct mode, fixed function pipeline.</title>
<circle r="10" cx="300"/>
<text x="200" y="100">1992</text>
<text transform="translate(350,-50) rotate(-60)" >OpenGL 1.0</text>
</g>

<g>
<title>Successive improvements like multitexturing, cubemaps, VBO.</title>
<circle r="10" cx="500" style="stroke:gray"/>
<text style="fill:gray" transform="translate(550,-50) rotate(-60)" >OpenGL 1.X</text>
</g>

<g>
<title>Programmable pipeline, GLSL</title>
<circle r="10" cx="1000"/>
<text x="900" y="100">2004</text>
<text transform="translate(1050,-50) rotate(-60)" >OpenGL 2.0</text>
</g>

<g>
<title>Programmable pipeline, OpenGL ES SL</title>
<circle r="10" cx="1600"/>
<text x="1500" y="100">2007</text>
<text transform="translate(1650,-50) rotate(-60)" >OpenGL ES 2.0</text>
</g>

<g>
<title>Direct mode and fixed function dropped</title>
<circle r="10" cx="2000" style="stroke:gray"/>
<text x="1900" y="100" style="fill:gray">2009</text>
<text style="fill:gray" transform="translate(2050,-50) rotate(-60)" >OpenGL 3.1</text>
</g>

<g>
<title>OpenGL ES 2.0 for the web</title>
<circle r="10" cx="2400"/>
<text x="2300" y="100">2011</text>
<text transform="translate(2450,-50) rotate(-60)" >WebGL 1.0</text>
</g>

<path d="m50,0 l600,0 l50,50 l50,-100 l50,50 L2850,0 M2950,0 l-100,25 l0,-50 l100,25" />
</svg>
</body>
</html>
```

I'm no html/svg whiz-kid so I'm sure there are a couple of improvements that could be made, like removing the preserveAspectRatio="none" which is hiding a bug in 11.60, and removing the overflow:hidden, which is the typical sweep-under-the-carpet fix, and really figuring out why I get a v-scrollbar. But all in all I'm always struck by how elegant and simple it gets when you do some svg stuff. It's a shame it's lacking in the tools area or I think it would be used so much more!

EDIT: So @erikdahlstrom (co-chair of W3C SVG WG and fellow Opera-gfx) came to the rescue on my scrollbar carpet sweeping note. As it turns out the svg element is an inline element and that in some way which is clearly beyond my box model skills causes a few extra pixels to be added to the body element. The solution is to either set the line-height to 0 for the body element or to do what I did which I think is the neater option:
```
svg {
display: block;
}
```

## Fostering a programmer

This might be a tad on the personal side for this blog as I'm normally only writing about work things, but I felt obliged to follow up on my last post.

This Sunday morning (no church goers here I'm afraid) I decided to see if I could expose my daughter to the wonderful world of BASIC.

I certainly didn't want to force her, so without saying anything I lifted the TV down onto the living room floor, went upstairs and literally dusted off the C64, took it downstairs, hooked it up and tuned in the right channel.

Annika, recently awoken, immediately looked up from behind her coloring book and asked "What's that? What are you doing?". I said "This is my old computer that I had when I was your age. I'm going to try it out again.". She frowned and said "That's a computer? It looks more like a keyboard!" as she sat down next to me on the carpet.

[ It's at times like these you realize that you're from a different era than your kids... Like the first time Annika wanted to watch a movie at grandma's place. Grandma gave her a VHS cassette. Annika tried to pry it open but couldn't so she eventually turned to me asking "Dad, where's the disc?"

Or like on another visit to grandma when she asked me "Why is grandma's phone tied to the wall?" ]

For about an hour we played around with the old C64, starting with `1 PRINT "ANNIKA";` `2 GOTO 1`, the infinite loop printing ANNIKA all over the screen, testing her math skills by using stuff like `PRINT(7+5)` and finally we sat down with the old game book and entered in "Death valley" together.

She played it a few times, going through the whole spectrum of emotions from "Boohoo it's too hard!", via proud smirk as she beat it the first time and all the way to "Sigh...is that all?" as she beat it the third time.

Was she bitten by the programming bug? No, I don't think so, but nonetheless I think it was a good experience and I'm not giving up that easily.

I think that when I grew up, before the time of the Internet, there wasn't as much digital entertainment readily available, so I kept coming back to my book of BASIC games and eventually learned to make my own. Perhaps the fact that the difference between a commercial game and one you typed in yourself wasn't as great back then as it is now made it more inspiring to make your own games.

Why would kids of today be entertained for long by 50-line ASCII games when there's an abundance of polished and art-directed games available for free? And why would they make their own games when whatever they come up with will look like crap when put next to any $1 game on the app store?

I still have hope though. I think a Raspberry Pi with two motors hooked up provides some appeal no flash game can combat. I will persevere. Even if my kids end up as hairdressers or construction workers, I will make sure they have at least been given a better-than-average chance to pursue a career in computer science.

Let me get back to you in 10 years when she's decided what university to attend!

## I'm teaching my kids to code this weekend! - are you?

Raspberry Pi has been getting a lot of press recently and I'm following the project with great interest. I really admire the people running the project... it's such a good cause.

Just the other day I was flicking through a photo album mom had recently given me... you know, the old-fashioned type with physical pages to turn and photos made of paper, and I came across this image.

It's me hacking away on my Commodore 64 back home in the kitchen. The C64 is hooked up to a black and white TV and you can tell it's easter from the coloured feather decorations mom always put in the windows around that time of the year.

The picture was dated as well and I started counting backwards and realized I was six years old. And really, that was nothing exceptional back then. In the early 80's most kids had a home computer. I was perhaps introduced slightly earlier than some since dad was a programmer at IBM and bought a Sinclair ZX81 for me and my brothers when I was just four.

Back in the 80's you could buy these books which contained small BASIC games, about a page or two long each, nicely illustrated and explaining the bits and pieces that made up the game. I owe a lot to one of those books. It was a book in Swedish (I presume it was translated from an English original) full of space games and it was titled plainly "SPACE GAMES for VIC, PET, SPECTRUM, ZX81, BBC, TRS-80 and APPLE".

One of the best things about it (though I didn't think so at the time) was that the games were riddled with typos, which of course meant that you couldn't just copy the source listing; you had to try to understand what it did to make it run!

The book also did an excellent job of explaining all the instructions like LET, GOTO, GOSUB etc plainly, and often posted little problems for you to solve like "How would you make the valley longer?".

I guess it's slightly ironic that the second instruction EVERYONE learned back then is the now often despised GOTO. The first one was PRINT, and that was all you needed to be entertained for hours.

One of my favourites was "Death Valley" where you steered an asterisk which was supposed to symbolize your "Single seated speed dart" space ship down a narrow valley which was simply made up of capital i's. You can see a "screenshot" on the front of the book, except they've "touched it up" by making it in color...it was all monochrome!

I guess there's no surprise I ended up working for a long time in the games industry.

So.. what's the point of all this nostalgia? Well, I realized my daughter who is now 7 years old is a full year older than what I was in that picture coding in BASIC.

She being a girl (sorry guys it's just a fact that girls are more mature for their age) should be ready to start her path down computer science lane!

Perhaps we who grew up back then were fortunate in that games weren't as readily available as they are now? Would my daughter have the patience to copy a 40 line code listing to play an ASCII game, or would she throw it aside, go get her iPhone or Chromebook (yes I know I work at Opera, but I got it for free) and fire up one of the thousands facebook-, flash- or html5-games available for free? Who knows... but I'll damn well try this weekend!

Imagine the advantage for future math studies of grasping what a variable is at age 7... or an equation... or just knowing that ArcTan is something you use to figure out angles. How could I deny my kids that head start?

I'm dusting off the C64, and as soon as they're available I'll be buying a Raspberry Pi board each for the kids (and one or two for myself). And we're going to solder our Gertboards together and hook up a motor and some LEDs. I'm sure my kids have skills I didn't have at their age, like pinch-zooming and emailing, but I'll be damned if I don't give them the same opportunity to learn programming that my parents gave me!

You have kids? What are you waiting for!

## requestAnimationFrame for smart(er) animating

Quite a while back Paul Irish wrote a good blog post entitled "requestAnimationFrame for smart animating" where he introduced the new API and also presented a shim you could start using that would be future-proof. Implementations are now catching up with the shim, but there are still a few lacking it, and considering the longevity of old IE versions there will be for quite some time. It didn't take long until I realized the shim is easily misused in a way that causes it to drift.

A simple example. Let's say you are using requestAnimFrame (the shim) to drive an animation that you're expecting to run at 60FPS. That gives you 16.7ms to draw each frame and looking at the shim that's exactly the timeout set if it's not implemented in the browser (1000 / 60). If you call that at the start of your frame and then do your rendering that's fine. But what happens if you by chance call requestAnimFrame AFTER you've done your rendering? Say for example that you're using up 15ms to render each frame. Then when it's time to draw a frame you'd first render for 15ms and then wait for 16.7ms before drawing the next frame. Each frame would then take 31.7ms and you'd only be able to get an FPS of 31 as compared to 60 if it's natively implemented.
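The arithmetic above, as a sketch (the 15 ms render cost is just an assumed number for illustration):

```javascript
// Simulating the timing arithmetic: calling a naive setTimeout-based
// shim AFTER rendering stacks the full timeout on top of the render time,
// while a drift-compensating shim only waits out the rest of the budget.
const renderCost = 15; // ms spent rendering a frame (assumed)

// Misused naive shim: render first, then wait the full 1000/60 ms.
const naiveFramePeriod = renderCost + 1000 / 60; // ~31.7 ms per frame
console.log(1000 / naiveFramePeriod);            // ~31.6 FPS

// Drift-compensated: only wait for what's left of the 16 ms budget.
const compensatedWait = Math.max(0, 16 - renderCost); // 1 ms
console.log(1000 / (renderCost + compensatedWait));   // 62.5 FPS
```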

I've seen this one time too many to let it slip. I should've done this the first time I was asked to check out some sample running slower in Opera, but procrastination is for cool kids.

EDIT: Thanks to Joel Fillmore for spotting that the time parameter to the callback was missing. While editing this I figured I might as well update it to include the cancelRequestAnimationFrame. I'm yet to see it used in a real world sample, but for completeness.

```
(function() {
    var lastTime = 0;
    var vendors = ['ms', 'moz', 'webkit', 'o'];
    for (var x = 0; x < vendors.length && !window.requestAnimationFrame; ++x) {
        window.requestAnimationFrame = window[vendors[x] + 'RequestAnimationFrame'];
        window.cancelAnimationFrame = window[vendors[x] + 'CancelAnimationFrame'] ||
            window[vendors[x] + 'CancelRequestAnimationFrame'];
    }

    if (!window.requestAnimationFrame)
        window.requestAnimationFrame = function(callback, element) {
            var currTime = new Date().getTime();
            // Only wait for what's left of the 16 ms frame budget,
            // no matter where in the frame this gets called.
            var timeToCall = Math.max(0, 16 - (currTime - lastTime));
            var id = window.setTimeout(function() { callback(currTime + timeToCall); },
                timeToCall);
            lastTime = currTime + timeToCall;
            return id;
        };

    if (!window.cancelAnimationFrame)
        window.cancelAnimationFrame = function(id) {
            clearTimeout(id);
        };
}())
```

It's certainly a little bit wordier than the original shim, and perhaps a bit trickier to read, but it does its best to account for drift and you can call it wherever you want in your update loop.

The built in timer is of course never quite high rez enough when you do stuff like this, not to mention what it's like when you're running it on battery power on a laptop, but that's a whole different story.

## All hail iOS 5

I'm not going to bash Apple so don't worry if you're a fanboi, you can read on...I have at least 7 Apple devices at home that are frequently used.

iOS 5 is here, there is much rejoicing in the streets! While I followed flowery prose comments on twitter about how blazingly fast everyone's HTML5 got with this update, I was silently muttering words a family man has to just mutter instead of saying out loud. My neat little iOS game Emberwind had gone from running perfectly smooth on any prior iOS version to a stuttering, sluggish mess on iOS 5.

The first choice for a programmer is of course always to blame the OS or gfx driver or keyboard or whatever seems remotely plausible before scrutinizing your own code... and to be perfectly honest it did seem reasonable to do so this time... I hadn't changed a thing and it went from 30 FPS to 10-ish FPS.

I think I muttered more than once "POS iOS 5... don't they do any automated regression testing on their releases at Apple!?!".

I don't know the answer to that question... I'd be interested to hear, but I'm guessing not or they would've spotted this and let people know.

I'm not going to ramble too much, let's dive into the solution... precision!

Yes, you heard right... precision. You know, the little precision qualifiers everyone who's starting out with some WebGL or GLSL copies from a sample, that little seemingly insignificant
`precision highp float;`
Cause let's face it... everyone always picks highp!
What it does is determine how much precision the GPU should use for your floats and ints etc. A relative precision of at least 2^-8 for lowp, 2^-10 for mediump and 2^-16 for highp when we're talking floats.

So how did this affect Emberwind? Well, it's a 2D game so you're mostly just moving textured clip-space quads around, so there really is no need to use anything but lowp (EDIT: Re-read my post when it was mentioned on the WebGL mailing list and I actually meant to say mediump here). Or at least that's what I thought I was doing... turns out there was one place, in the vertex shader, where I hadn't specified precision. The GLES Shading Language spec clearly says:

The vertex language has the following predeclared globally scoped default precision statements:

```
precision highp float;
precision highp int;
precision lowp sampler2D;
precision lowp samplerCube;
```

The fragment language has the following predeclared globally scoped default precision statements:

```
precision mediump int;
precision lowp sampler2D;
precision lowp samplerCube;
```

Note that there is no default float precision in the fragment language; you have to declare it yourself.

Which meant I really was using highp in the vertex shader. Now, the regression in iOS 5 appears to be that using a highp varying in the vertex shader will promote any mismatched varying in the fragment shader to highp, thereby causing a major performance hit. You can see a profile run of the game using highp struggling to reach 13-14 FPS. (Note how lovely quiet it is on the CPU except when loading.)

Using lowp in the second sample it sits quite comfortably at 33 FPS. It's quite a staggering difference. And if you say "You call 30 FPS smooth?" then my only defence is... well... it's a port... they're always second class citizens! You're so damn tired of the game by the time you get to ports that you're bound to take a few shortcuts. 30 FPS seemed like a reasonable one.

So, summing up... dropping from 33 to 13 fps because of precision! It may not matter much on desktop graphics hardware, but it apparently very much does on mobile... that goes for you WebGL-ers as well... not just native app devs!
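The fix itself really is just a line or two: qualify the varyings (or declare a default) so the vertex and fragment sides agree on something cheaper than highp. A minimal sketch of what a matched pair looks like; the attribute/varying names here are made up for illustration, not Emberwind's actual shaders:

```glsl
// vertex shader: float already defaults to highp here, so qualify
// the varying explicitly or it drags the fragment side up with it
attribute vec4 aPosition;
attribute vec2 aTexCoord;
varying mediump vec2 vTexCoord;

void main() {
    vTexCoord = aTexCoord;
    gl_Position = aPosition;
}
```

```glsl
// fragment shader: no default float precision exists here,
// so declare one and keep the varying matched with the vertex side
precision mediump float;
varying mediump vec2 vTexCoord;
uniform sampler2D uTexture;

void main() {
    gl_FragColor = texture2D(uTexture, vTexCoord);
}
```

If you're curious what a qualifier actually maps to on a given device, GLES lets you ask via glGetShaderPrecisionFormat (getShaderPrecisionFormat in WebGL).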

Now I just need to send my one line optimization off to Chillingo and get it all fixed on the appstore!

And finally... I'm speaking at the New Game Conference in SF together with heaps of other talented speakers. It'll be awesome... and you should come! Free Chromebooks for everyone if they sell out 97% of the tickets!

## Emberwind and what's next

Hi again, it's time for a long overdue update on Emberwind. As usual, real life (tm) happens in waves and after Siggraph I was swept away by a big one, and I'm up late again trying to catch up on things. My list of things I'd like to do, which I always have scribbled down in my trusty old-school notepad, just keeps getting longer and longer.

In fact, in the process of pushing the latest update to the Emberwind github pages I wrote another item down on my list. I noted that the resource JSON file for Emberwind is quite wordy, taking up about 1.7MB, and just removing all the whitespace shrinks it down to 700k. So, I did a quick google search for a JSON minifier. I found one by Crockford and a few others. I ran them all and looked with disappointment at my 7 idle cores while processing the file. One of them, written in Python, allowed me to kick it off, go take a long shower and come back only to find it still working on the file. Rightfully it got a swift Ctrl-C followed by a Shift-Del to make sure it didn't linger in the trashcan.

So, the item added to my notepad was "Write C multicore enabled JSON minifier". I haven't yet figured out how I'll manage that; the gut feeling is to partition the file into X chunks, search for the first recognizable JSON construct (or potential JSON construct) and process the segments in parallel. I think the only certain JSON construct you can find when starting at a random location is an escaped character in a string, like
`\" \\ \/ \b \f \n \r \t \uXXXX`
That may be a bit limiting though, so I guess you could also gamble on things like { and [, but that means you'll have to throw away the results if it doesn't line up (i.e. it's inside a string).
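The sequential core of such a minifier is at least simple: a little state machine that copies everything through but drops whitespace outside string literals, tracking escapes so a \" inside a string isn't mistaken for its closing quote. A sketch in C, assuming the caller hands in an output buffer; the multicore partitioning is the part left as the interesting problem:

```c
/* Strip insignificant whitespace from a JSON text. Whitespace inside
 * string literals is significant and kept; everything between tokens
 * is dropped. The escape state matters: after a backslash the next
 * character (e.g. the " in \") must not end the string.
 * out must have room for strlen(in) + 1 bytes.
 * Sequential sketch only -- no multicore partitioning yet. */
void json_minify(const char *in, char *out)
{
    int in_string = 0; /* currently inside a "..." literal? */
    int escaped = 0;   /* previous char in the string was a backslash? */

    for (; *in; in++) {
        char c = *in;
        if (in_string) {
            *out++ = c;
            if (escaped)
                escaped = 0;
            else if (c == '\\')
                escaped = 1;
            else if (c == '"')
                in_string = 0;
        } else if (c == '"') {
            in_string = 1;
            *out++ = c;
        } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
            *out++ = c;
        }
    }
    *out = '\0';
}
```

A single-threaded pass like this should chew through the 1.7MB resource file in a few milliseconds anyway, so the parallel version is mostly for sport.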

Well anyways, it'll be an interesting task. If there's already one out there that I don't know of, please let me know. And should you beat me to it then that's great... the todo list is growing a bit too long in any case.

Anyways, there's quite a large update to Emberwind with heaps of new game objects, and tutorial cards that should explain the controls. It runs great; the internal build of Opera I tested it in today runs at close to 120 fps. Not bad at all.

So what's next for Emberwind? Well, I don't expect I'll be adding much more gameplay to it, but I do plan on turning it into a full fledged canvas performance test. What I want to do is get rid of whatever javascript overhead there is in the demo by adding another render device implementation that just records all the draw-calls in a long array and dumps that to a file. I'll then write a really simple little js-app that reads the array and executes the draw-calls as quickly as it can, much like the Quake demos. It'll be a great tool to use to profile and optimize Opera with.

I'm also hopefully going to go to Austin next year to hold a talk about Emberwind and html5 game making at SXSW. Please take a moment and go check out my suggestion and give it a thumbs up if you'd like to see me there.

I'm going to be at the New Game Conference in San Francisco on November 2nd to do a talk on Emberwind (1pm). I'm really looking forward to that. There are a lot of great speakers there: Ken Russell (WebGL WG Chair) and Ben Vanik (author of WebGL Inspector) from Google are going to be there to do a talk I'm particularly keen on, and of course EA sending Richard Hilleman is a great boost for making people take html5 seriously as a game development platform. There are many, many more great speakers, so if you have the time, go check it out!

I'll probably also be at the HTML5 party in Madrid... not entirely sure what the topic is for that talk yet, but we'll see.

I think that's it for now. Don't hang around here, go make html5 games!