This time I thought I'd talk a little bit about alpha blending and premultiplied alpha. The whole post is really going to boil down to one thing: premultiplied alpha is good, but it's not commutative! But to get there, let's cover some of the background (hee-hee).
Integral alpha was invented back in the 1970s by Ed Catmull and Alvy Ray Smith while at New York Tech, but really, the paper most often referenced on this topic is the 1984 paper by Thomas Porter and Tom Duff (also known for Duff's device). In it they describe, in addition to the three-component (red, green, blue) pixel color, the use of a fourth component to convey coverage information. It's referred to as the matte component, or alpha as it's commonly called today. The most illustrative name would really be opacity, as 0 means fully transparent, 1 means fully opaque, and anything in between is some level of semi-transparency.
A load of crock
The standard blending operation is called "over". In OpenGL it's achieved by glBlendEquation(GL_FUNC_ADD) and glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) which equates to the following formula for calculating the result of "source" being drawn over "destination":
result.rgba = source.alpha * source.rgba + (1 - source.alpha) * destination.rgba
[In OpenGL you need to first call glEnable(GL_BLEND) as it defaults to off. GL_FUNC_ADD is also the default so you can omit the call to glBlendEquation.]
By rendering the semi-transparent objects back to front you get the expected result. So, for example, drawing semi-transparent red (1,0,0,0.5) on top of opaque white (1,1,1,1) you calculate:
r = 1 * 0.5 + (1 - 0.5) * 1 = 1
g = 0 * 0.5 + (1 - 0.5) * 1 = 0.5
b = 0 * 0.5 + (1 - 0.5) * 1 = 0.5
a = 0.5 * 0.5 + (1 - 0.5) * 1 = 0.75
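The arithmetic above can be sketched as a tiny helper (overStraight is a name I made up for illustration, not an API):

```javascript
// Straight-alpha "over", i.e. what GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA
// computes per channel. src and dst are [r, g, b, a] with values in 0-1.
function overStraight(src, dst) {
  return src.map((c, i) => src[3] * c + (1 - src[3]) * dst[i]);
}

// Semi-transparent red over opaque white:
overStraight([1, 0, 0, 0.5], [1, 1, 1, 1]); // → [1, 0.5, 0.5, 0.75]
```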
The interesting bit here is that the resulting alpha goes from fully opaque to semi-transparent. This physical phenomenon is called Akin-Romney-Refraction and can be seen in this image. The white table serves as the opaque surface and by holding up the plastic sheet in front of it light gets refracted and you can see what's behind the table.
Ok, it doesn't take a genius to figure out that's a load of crock. In fact GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA doesn't correctly calculate the resulting alpha. It may sound a bit strange and you'd be right in asking yourself "Why is it used then?". Well, when rendering to the frame buffer it doesn't matter much, as displays don't use the resulting alpha (anyone remember the Game & Watch with the see-through screen?). It does matter if you're rendering to an FBO and expect to blend it with something else though (think impostors etc), or if the framebuffer will be composited with something else, like in WebGL where the canvas is composited with the rest of the web page.
Here's a sample in WebGL showing how destination alpha gets screwed up. A WebGL canvas by default has destination alpha, so when we're filling it first with opaque white and blending a semi-transparent red quad on top of it we can see the black text behind the canvas shine through, even though it should be fully obscured. Not what you'd expect in the real world.
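Numerically, the problem is confined to the alpha channel: with GL_SRC_ALPHA the source alpha gets multiplied by itself. A small sketch of the two computations (variable names are mine, not WebGL API):

```javascript
// 50% red drawn over an opaque destination.
const srcA = 0.5, dstA = 1.0;

// Alpha channel as computed by GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA:
const blendedA = srcA * srcA + (1 - srcA) * dstA; // 0.75 — wrongly semi-transparent

// Physically correct coverage (Porter and Duff's "over"):
const correctA = srcA + (1 - srcA) * dstA; // 1.0 — opaque stays opaque
```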
Luckily though, Porter and Duff led the way to something better and introduced premultiplied alpha. Take a look at the formula again:
result.rgba = source.alpha * source.rgba + (1 - source.alpha) * destination.rgba
You can see that the source components are always multiplied by the source alpha. That multiplication can therefore be done in a preprocessing step, so the image data is stored premultiplied. That changes the blending formula to:
result.rgba = 1 * source.rgba + (1 - source.alpha) * destination.rgba
So in OpenGL speak we need to use glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA) with premultiplied alpha.
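As a sketch of the operation in code (overPremul is a hypothetical helper, values in the 0-1 range):

```javascript
// Premultiplied "over": what glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA)
// computes per channel. Both inputs must already be premultiplied.
function overPremul(src, dst) {
  return src.map((c, i) => c + (1 - src[3]) * dst[i]);
}
```

Note that all four channels now use the same expression, and the resulting alpha, src.a + (1 - src.a) * dst.a, matches the correct coverage computation.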
If we assume we have premultiplied input data there's something important to note about it. If each component goes from 0 to 1 and they have been multiplied by the alpha component (also in the 0-1 range), then it follows that the red, green and blue components must be less than or equal to the alpha component. So while (1,0,0,0.5) represents a 50% transparent full red in straight alpha, its premultiplied counterpart would be (0.5,0,0,0.5).
If we're dealing with 32-bit pixels then each component ranging from 0 to 1 is represented by a byte, or 256 distinct values from 0 to 255. So if full-bright red at 50% opacity is represented by 128 in premultiplied form, then there are only 128 possible levels of red available. So at 50% opacity colors are less accurately described than at 100%. In fact, the closer we get to fully transparent, or 0 alpha, the less accuracy we get in the color. When alpha is ~0.4% (or byte value 1) then the only red we can describe is no red at all or all red. In practice it doesn't matter much though, as the multiplication would have to be performed when blending anyway.
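The precision loss at low alpha is easy to see with 8-bit values (premulByte is just an illustration, not a real API):

```javascript
// Premultiply an 8-bit color channel by an 8-bit alpha, rounding to
// the nearest storable byte.
function premulByte(color, alpha) {
  return Math.round((color * alpha) / 255);
}

// At byte alpha 1 (~0.4% opacity) every red from 0..127 collapses to 0
// and every red from 128..255 collapses to 1:
premulByte(127, 1); // → 0
premulByte(255, 1); // → 1
```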
So, let's look at our example in premultiplied form. Drawing semi-transparent red (0.5,0,0,0.5) on top of opaque white (1,1,1,1) we calculate:
r = 1 * 0.5 + (1 - 0.5) * 1 = 1
g = 1 * 0 + (1 - 0.5) * 1 = 0.5
b = 1 * 0 + (1 - 0.5) * 1 = 0.5
a = 1 * 0.5 + (1 - 0.5) * 1 = 1
That's more like it. That's an OPAQUE light red color. Go premultiplied!
Let's look at another issue with alpha blending: that of alpha blending in combination with texture filtering. Let's assume we're looking at the very edge of something in a cut-out image, perhaps the left edge of the O in the Opera logo. On the right side of the edge there's opaque red (1,0,0,1); on the left side of the edge there's fully transparent (0,0,0,0). Now, just for the sake of illustration, instead of using an Opera-O, we'll use just the problematic piece of the image: a 2×1 image containing a fully transparent pixel and an opaque red pixel to the right of it. Now, if we render that pixel-aligned everything is just fine, we'll have a crisp and sharp edge. But what if we render it off pixel center, or, even better for illustrative purposes, blow it up to render as 256x100 pixels? The graphics library then needs to apply texture filtering to the image.
You can see how there seems to be a blackness to the gradient. This is because when filtering is applied to generate smooth alpha values around the edge it also interpolates the rgb values, which go from (0,0,0) to (1,0,0). Remember, in the straight alpha model full-bright red is always represented with 1 in the red channel, regardless of its opacity.
On the other hand in the premultiplied alpha model full bright red will be represented by the same value in the red component as the alpha component. Hence there will be no filtering errors when using premultiplied alpha. The WebGL sample shows the difference clearly side by side. You might think "Sure, it's because he picked transparent to be (0,0,0,0). If he would have picked (1,0,0,0) as transparent then it would've looked good." That's certainly true in this particular case, but for a real world image with several different colors there's just no way to make that work.
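The edge case can be checked numerically. Assuming simple linear filtering halfway between the two texels (all helpers below are hypothetical names for illustration):

```javascript
function lerp(p, q, t) { return p.map((c, i) => c + t * (q[i] - c)); }
function overStraight(s, d) { return s.map((c, i) => s[3] * c + (1 - s[3]) * d[i]); }
function overPremul(s, d)   { return s.map((c, i) => c + (1 - s[3]) * d[i]); }

const white = [1, 1, 1, 1];

// Straight alpha: transparent is (0,0,0,0), so the black rgb leaks
// into the filtered average (0.5, 0, 0, 0.5).
const s = lerp([0, 0, 0, 0], [1, 0, 0, 1], 0.5);
overStraight(s, white); // red channel 0.75 — the darkened edge

// Premultiplied: the same filtered texel (0.5, 0, 0, 0.5) is a *correct*
// premultiplied 50% full-bright red.
const p = lerp([0, 0, 0, 0], [1, 0, 0, 1], 0.5);
overPremul(p, white); // red channel 1 — no darkening
```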
Associative != Commutative
Premultiplied alpha blending is associative, meaning A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C.
This isn't a full analytical proof; see Wikipedia for that. But it's easy to verify that it is indeed associative by expanding the formulas below and checking that they actually amount to the same expression.
A ⊗ (B ⊗ C)
A.rgba + (1 - A.a) * (B.rgba + (1 - B.a) * C.rgba) ⇔
A.rgba + (B.rgba + C.rgba - B.a * C.rgba) - A.a * (B.rgba + C.rgba - B.a * C.rgba) ⇔
A.rgba + B.rgba + C.rgba - B.a * C.rgba - A.a * B.rgba - A.a * C.rgba + A.a * B.a * C.rgba
(A ⊗ B) ⊗ C
(A.rgba + (1 - A.a) * B.rgba) + (1 - (A.a + (1 - A.a) * B.a)) * C.rgba ⇔
A.rgba + B.rgba - A.a * B.rgba + (1 - (A.a + B.a - A.a * B.a)) * C.rgba ⇔
A.rgba + B.rgba - A.a * B.rgba + (C.rgba - (A.a * C.rgba + B.a * C.rgba - A.a * B.a * C.rgba)) ⇔
A.rgba + B.rgba + C.rgba - B.a * C.rgba - A.a * B.rgba - A.a * C.rgba + A.a * B.a * C.rgba
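The same identity can be spot-checked numerically with arbitrary premultiplied colors (over is a hypothetical helper):

```javascript
// Premultiplied "over"; the alpha channel follows the same expression
// as the color channels.
function over(s, d) { return s.map((c, i) => c + (1 - s[3]) * d[i]); }

const A = [0.3, 0.1, 0.0, 0.4];
const B = [0.0, 0.2, 0.1, 0.3];
const C = [0.5, 0.5, 0.5, 1.0];

const left  = over(A, over(B, C)); // A ⊗ (B ⊗ C)
const right = over(over(A, B), C); // (A ⊗ B) ⊗ C
// left and right agree channel for channel
```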
An important point to make though is that associativity is not the same as commutativity. If the premultiplied blending operator were commutative, then you could draw semi-transparent elements in any order and still get the right result, but that is not the case! The simplest possible proof (since we know premultiplied alpha behaves like in the real world, with proper resulting alpha) is that if it were true, then drawing an opaque black color over an opaque white color would result in the same color as drawing opaque white over opaque black!
A ⊗ B ⊗ C ≠ C ⊗ B ⊗ A
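The black-over-white counterexample, spelled out (over is a hypothetical helper):

```javascript
function over(s, d) { return s.map((c, i) => c + (1 - s[3]) * d[i]); }

const black = [0, 0, 0, 1], white = [1, 1, 1, 1];
over(black, white); // → [0, 0, 0, 1] — black covers white
over(white, black); // → [1, 1, 1, 1] — white covers black
```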
Just to give another visual example, check out this WebGL sample showing a series of blends. The top row draws red, green and yellow at 50% opacity on top of opaque black. In the middle row, the order of the red, green and yellow is reversed and you can see how it clearly doesn't produce the same result. The bottom section shows how green and yellow are first blended in one buffer before being used to blend on top of the black and red. So, associativity works, commutativity doesn't.
Premultiplied and color > alpha
If you're a curious mind you might've already wondered what would happen if you specify a color where the color components are greater than the alpha component? You get something very commonly used for particle effects, referred to as additive alpha. Check out this WebGL demo which draws 100 random squares with the color (0.15, 0.10, 0.07, 0.05) on top of each other for a sort of fiery effect. Don't do particle systems like this though!! It's just for illustrative purposes.
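With the premultiplied operator, an alpha of 0.05 only attenuates the destination by 5%, so the color is almost purely added on top. A sketch using the demo's numbers (over is a hypothetical helper):

```javascript
function over(s, d) { return s.map((c, i) => c + (1 - s[3]) * d[i]); }

const spark = [0.15, 0.10, 0.07, 0.05]; // color components > alpha
let buf = [0, 0, 0, 1];                  // opaque black

buf = over(spark, buf); // [0.15, 0.10, 0.07, 1] — alpha saturates immediately
buf = over(spark, buf); // ~[0.2925, 0.195, 0.1365, 1] — mostly additive buildup
```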
So, that was all I had for this time, a quick survey of alpha blending. A shout out goes to @tim_johansson for reading this post and giving me some comments on the readability. Until next time!