Software Development

Correcting The Future

D3D: Drawing Transparency Front to Back

In a previous article, I showed how you can draw from front to back using OpenGL. Here is the final set of equations that we used for blending. S is the source and D is the destination.

D[B] += S[B] * D[A]
D[G] += S[G] * D[A]
D[R] += S[R] * D[A]
D[A] *= S[A]


Both source and destination are in precomputed format, meaning that each pixel has had the following transformation applied.

B *= A
G *= A
R *= A
A = 1-A


A on the right-hand side is the original alpha value. All values are relative to 255, so an alpha of 128 is actually 128/255 in all the equations above. D3D handles this normalization for you automatically.
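
As a sanity check, the equations can be run on the CPU. The following is a minimal sketch, not D3D code: `Rgba`, `toPrecomputed`, `emptyDestination`, and `blendFrontToBack` are names made up for this illustration, and channels are floats in the 0 to 1 range rather than bytes.

```cpp
// Illustrative CPU model of the blending equations; all names are hypothetical.
struct Rgba { float r, g, b, a; };

// Convert a straight-alpha color to the precomputed format:
// color is scaled by alpha, and alpha is inverted (1 = fully transparent).
Rgba toPrecomputed(Rgba c) {
    return { c.r * c.a, c.g * c.a, c.b * c.a, 1.0f - c.a };
}

// An empty destination: no color yet, and fully transparent (inverted alpha = 1).
Rgba emptyDestination() {
    return { 0.0f, 0.0f, 0.0f, 1.0f };
}

// Blend one precomputed source pixel behind whatever is already in d.
void blendFrontToBack(Rgba& d, const Rgba& s) {
    d.r += s.r * d.a;
    d.g += s.g * d.a;
    d.b += s.b * d.a;
    d.a *= s.a;
}
```

Drawing a 50% red pixel first and an opaque blue pixel second leaves the destination at (0.5, 0, 0.5) with an inverted alpha of 0, which is the same color that back to front drawing gives for red over blue.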

You do not need to convert all your images beforehand to precomputed format. Instead, you can use the fixed function pipeline to precompute your image on the fly.

  // Set texture for texture stage 0
  pD3DDevice->SetTexture(0,pMyTexture);

  pD3DDevice->SetTextureStageState(0,D3DTSS_COLOROP,D3DTOP_MODULATE);
  pD3DDevice->SetTextureStageState(0,D3DTSS_COLORARG1,D3DTA_TEXTURE|D3DTA_ALPHAREPLICATE);
  pD3DDevice->SetTextureStageState(0,D3DTSS_COLORARG2,D3DTA_TEXTURE);
  pD3DDevice->SetTextureStageState(0,D3DTSS_ALPHAOP,D3DTOP_SELECTARG1);
  pD3DDevice->SetTextureStageState(0,D3DTSS_ALPHAARG1,D3DTA_TEXTURE|D3DTA_COMPLEMENT);

  pD3DDevice->SetTextureStageState(1,D3DTSS_COLOROP,D3DTOP_DISABLE);
  pD3DDevice->SetTextureStageState(1,D3DTSS_ALPHAOP,D3DTOP_DISABLE);
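
In effect, stage 0 turns each straight-alpha texel into the precomputed format before it reaches the blender. Here is a rough CPU sketch of that arithmetic on 8-bit channels. `Texel`, `mul8`, and `stage0` are hypothetical names, and the rounding in `mul8` is an assumption; real hardware may round differently.

```cpp
#include <cstdint>

// 8-bit BGRA texel; the struct and function names are made up for illustration.
struct Texel { std::uint8_t b, g, r, a; };

// Multiply two 8-bit values as fractions of 255, rounding to nearest.
// The rounding rule is an assumption; hardware may differ by one step.
std::uint8_t mul8(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>((x * y + 127) / 255);
}

// Model of texture stage 0 as configured above:
//   color = texture color * replicated texture alpha   (D3DTOP_MODULATE)
//   alpha = 255 - texture alpha                        (D3DTA_COMPLEMENT)
Texel stage0(Texel t) {
    return { mul8(t.b, t.a), mul8(t.g, t.a), mul8(t.r, t.a),
             static_cast<std::uint8_t>(255 - t.a) };
}
```

A fully transparent texel comes out as black with an alpha of 255, and a fully opaque texel keeps its color and gets an alpha of 0.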


This texture stage setup should work on ALL video cards no matter how old. Next, we apply the alpha blending equations.

  pD3DDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_DESTALPHA);
  pD3DDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);
  pD3DDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);
  pD3DDevice->SetRenderState(D3DRS_SRCBLENDALPHA, D3DBLEND_DESTALPHA);
  pD3DDevice->SetRenderState(D3DRS_DESTBLENDALPHA, D3DBLEND_ZERO);
  pD3DDevice->SetRenderState(D3DRS_SEPARATEALPHABLENDENABLE, TRUE);


At this point, we should get correct results on screen.

We have to be careful here, because there are three kinds of video cards when it comes to blending operations. First, here are the blending equations.

renderTargetColor = (color_in * srcColorBlendOp) ColorBlendOp (color_rt * destColorBlendOp)
renderTargetAlpha = (alpha_in * srcAlphaBlendOp) AlphaBlendOp (alpha_rt * destAlphaBlendOp)
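
To make the six settings concrete, here is a small CPU model of the two equations. It is only a sketch: the `Factor` and `Func` enums are stand-ins for the D3DBLEND and D3DBLENDOP values, only the handful of values needed here are modeled, and every name in it is invented for this illustration.

```cpp
// Stand-ins for a few D3DBLEND / D3DBLENDOP values; illustrative only.
enum class Factor { Zero, One, DestAlpha };
enum class Func { Add, Subtract };

// The six independent settings: two factors and one function each
// for color and for alpha.
struct BlendState {
    Factor srcColor, destColor; Func colorFunc;
    Factor srcAlpha, destAlpha; Func alphaFunc;
};

struct Color { float r, g, b, a; };

float factorValue(Factor f, const Color& dest) {
    switch (f) {
        case Factor::Zero:      return 0.0f;
        case Factor::One:       return 1.0f;
        case Factor::DestAlpha: return dest.a;
    }
    return 0.0f; // not reached for valid enum values
}

float combine(Func f, float src, float dest) {
    return f == Func::Add ? src + dest : src - dest;
}

// Evaluate both blending equations for one pixel.
Color blend(const BlendState& s, const Color& in, const Color& rt) {
    const float sc = factorValue(s.srcColor, rt), dc = factorValue(s.destColor, rt);
    const float sa = factorValue(s.srcAlpha, rt), da = factorValue(s.destAlpha, rt);
    return { combine(s.colorFunc, in.r * sc, rt.r * dc),
             combine(s.colorFunc, in.g * sc, rt.g * dc),
             combine(s.colorFunc, in.b * sc, rt.b * dc),
             combine(s.alphaFunc, in.a * sa, rt.a * da) };
}

// The front to back configuration: color uses DESTALPHA and ONE,
// alpha uses DESTALPHA and ZERO, and both functions are ADD.
const BlendState kFrontToBack = {
    Factor::DestAlpha, Factor::One,  Func::Add,
    Factor::DestAlpha, Factor::Zero, Func::Add
};
```

Note that `kFrontToBack` needs different factors for color and alpha but the same function for both, which is why the distinction between argument ops and function ops matters.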


Most cards can set all six blending ops independently. But many older cards cannot. We'll classify srcColorBlendOp, destColorBlendOp, srcAlphaBlendOp and destAlphaBlendOp as argument blending ops. ColorBlendOp and AlphaBlendOp are function blending ops.

Some of the oldest cards cannot have different blending ops for alpha. So the Alpha equation must be identical to the Color equation. With these cards, you need to use two passes. That means this method isn't so useful. But if you have one of these older cards, maybe forking over $8 wouldn't be a bad idea to get the required functionality.

OpenGL can tell the difference between cards that can set separate argument ops and those that can set separate function ops. Unfortunately, D3D only lets you check if you can set separate ops for BOTH arguments and functions. And this causes a slight problem because some cards can set separate argument ops for alpha, but they cannot set a separate alpha function op. These cards can still use our method since the function ops are the same, so it'd be nice if we could detect them. And indeed, there is a way, even if not documented.

What you do is set D3DRS_SEPARATEALPHABLENDENABLE to TRUE. Don't try to read it back; it will likely return FALSE. Then try setting different argument blending ops with D3DRS_SRCBLENDALPHA and D3DRS_DESTBLENDALPHA. Make sure you set them to something different from their COLOR counterparts. Read those values back and check whether they actually got set correctly. If they did, then you're all set to go. Otherwise, you need to use two passes: one for the color values and one for the alpha.
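
The probing steps might look like the sketch below. To keep it self-contained, it runs against a tiny fake device rather than a real D3D device; `FakeDevice`, `State`, `Blend`, and `supportsSeparateAlphaFactors` are all invented for this illustration. With a real device you would make the corresponding SetRenderState and GetRenderState calls instead.

```cpp
#include <map>

// Stand-ins for the render states and blend values involved; illustrative only.
enum class State { SrcBlend, DestBlend, SrcBlendAlpha, DestBlendAlpha };
enum class Blend { Zero, One, DestAlpha };

// A fake device that may silently ignore the separate alpha factors,
// the way the text describes some older cards behaving.
class FakeDevice {
public:
    explicit FakeDevice(bool separateFactors) : separateFactors_(separateFactors) {}

    void enableSeparateAlpha(bool) { /* never read back, so nothing is stored */ }

    void set(State s, Blend v) {
        const bool alphaFactor =
            (s == State::SrcBlendAlpha || s == State::DestBlendAlpha);
        if (alphaFactor && !separateFactors_) return; // request ignored
        values_[s] = v;
    }

    Blend get(State s) const {
        auto it = values_.find(s);
        return it == values_.end() ? Blend::One : it->second;
    }

private:
    bool separateFactors_;
    std::map<State, Blend> values_;
};

// Set alpha factors that differ from the color factors, then read them back.
template <class Device>
bool supportsSeparateAlphaFactors(Device& dev) {
    dev.set(State::SrcBlend, Blend::DestAlpha);
    dev.set(State::DestBlend, Blend::One);
    dev.enableSeparateAlpha(true); // do not read this one back
    dev.set(State::SrcBlendAlpha, Blend::Zero);
    dev.set(State::DestBlendAlpha, Blend::DestAlpha);
    return dev.get(State::SrcBlendAlpha) == Blend::Zero &&
           dev.get(State::DestBlendAlpha) == Blend::DestAlpha;
}
```

The probe leaves its test values in the alpha factors, so the real blending states have to be set again afterwards.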

The function op is ADD for both Alpha and Color, so no need to change it unless it was previously modified by your code.

BTW, I don't even check for this functionality. I just assume you have a card that at least supports separate argument ops. In the code though, I pretend that the card supports separate ops for all six values. Since the function op is the same, it'll work on all cards worth more than $5 brand new.

However, all of this is quite useless so far. It is no better than drawing back to front. All pixels get written all the time. To speed things up, we need to be able to skip writing the pixel out. Reading is cheap. Writing is expensive. So let's look at the pipeline that is important to us. The order is important here.

1. Pixel Shader or Fixed Function Pipeline
2. Alpha Test
3. Depth and Stencil Test
4. Alpha Blending
5. Output Masking (R,G,B,A)
6. Render Target Output

Output Masking is used if you want to use multiple passes with those ancient video cards that don't support separate argument ops.

So if we look at this, it'd be nice if we could skip Alpha Blending and the following stages if possible. When do we want to skip outputting a pixel? There are two cases, and that's where things get complicated.

1. Skip if source pixel is 100% transparent.
2. Skip if dest pixel is 100% opaque.

Transparent pixels don't change anything, so it's pointless to draw them. We can use the Alpha Test for this. If Alpha is 255, then we don't draw it. Remember that Alpha is inverted at this stage by the fixed function pipeline. That's why we check for 255 and not 0.

  pD3DDevice->SetRenderState(D3DRS_ALPHAREF, (DWORD)0x000000FF);
  pD3DDevice->SetRenderState(D3DRS_ALPHATESTENABLE, TRUE); 
  pD3DDevice->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_LESS);


So if the alpha is less than 255, it'll go through to the next stage. But if it's exactly 255 (transparent), we skip all following stages.
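
The test above boils down to a single comparison. A minimal sketch, with hypothetical function names, assuming 8-bit alpha:

```cpp
#include <cstdint>

// Model of the alpha test configured above: D3DCMP_LESS against a reference of 255.
// 'invertedAlpha' is the alpha produced by the texture stage (255 - texture alpha).
bool passesAlphaTest(std::uint8_t invertedAlpha, std::uint8_t ref = 0xFF) {
    return invertedAlpha < ref;
}

// Convenience wrapper that starts from the texture's original alpha.
bool texelIsDrawn(std::uint8_t textureAlpha) {
    return passesAlphaTest(static_cast<std::uint8_t>(255 - textureAlpha));
}
```

Only a texel whose original alpha is exactly 0 is rejected; anything with even a little coverage still goes on to the later stages.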

Already, we've gotten quite the speedup for transparent images.

The other test we'd like to do is check whether the destination pixel has an alpha value of 0, which means fully opaque in our inverted format. But there's no way to do that directly. The only stage that reads the destination pixel is the blending stage, and this comes after all the tests. So how do we get around this problem? For this we need pixel shaders, which means that anything you had in the fixed function pipeline will need to be rewritten in a pixel shader. However, this can provide better speed than before. In fact, all newer cards implement the fixed function pipeline within pixel shaders.

The reason we need a pixel shader is to write out a Z value that is related to the alpha value of our pixel. While there, you could also use the texkill instruction (clip in HLSL) to check for transparent pixels if you have pixel shader version 1.4 or above.

We want to set up the Z buffer so that if we write an opaque pixel, we cannot write any more to that location. To do this, we could set opaque pixels to a Z value of 1.0 and all other pixels to 0.0. We would then set the comparison op to GREATEREQUAL. So only if Z values are greater than or equal to what's already there will we overwrite the pixel. Let's write out all the possibilities.

1. Pixel Z: 0.0  Z Buffer: 0.0  (0.0>=0.0 is TRUE)
2. Pixel Z: 0.0  Z Buffer: 1.0  (0.0>=1.0 is FALSE)
3. Pixel Z: 1.0  Z Buffer: 0.0  (1.0>=0.0 is TRUE)
4. Pixel Z: 1.0  Z Buffer: 1.0  (1.0>=1.0 is TRUE)

// We can rewrite this as follows ("transparent" here means anything not fully opaque)...
1. Source: transparent  Dest: transparent  (Blend!)
2. Source: transparent  Dest: opaque       (Don't Blend)
3. Source: opaque       Dest: transparent  (Blend!)
4. Source: opaque       Dest: opaque       (Blend!  PROBLEM: we want Don't Blend)


As we can see, there's a problem with the fourth possibility. We're drawing front to back, so if the front is already opaque, we don't want to blend these pixels. It'll still look fine, because the destination alpha is 0 and the source contributes nothing, but it's wasted computation. How can we fix this? As it happens, no static Z buffer comparison can do it.

For the pixel shader technique, instead of using 1.0 for opaque pixels, we can try an incremental approach. Say we're using a 16-bit Z Buffer; that's 65536 values for the Z Buffer. We would start at 1.0 as usual. For the next texture we draw, we would not use 1.0, but rather a slightly smaller value. For a 16-bit Z Buffer, we would decrement by 1.0/65535.0 each time. This allows for drawing 65535 textures in one go. If you want more, use a 24 or 32 bit Z Buffer. If you want even more textures (crazy as that is), you can use a stencil buffer.
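
Here is a rough CPU model of the incremental approach, to show why it fixes the fourth case. It is a sketch under a few assumptions: integer depths from 0 to 65535 stand in for 0.0 to 1.0, one flag per pixel says whether the source is opaque, and `OpacityDepthBuffer` and its methods are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU model of a 16-bit Z buffer used as an "already opaque" marker.
class OpacityDepthBuffer {
public:
    explicit OpacityDepthBuffer(std::size_t pixels) : depth_(pixels, 0) {}

    // Call once before each texture, front to back. Returns false once the
    // 16-bit range is used up (after 65535 textures).
    bool beginTexture() {
        if (nextOpaqueZ_ == 0) return false;
        currentOpaqueZ_ = nextOpaqueZ_--;
        return true;
    }

    // Depth test with GREATEREQUAL and Z writes enabled.
    // Returns true if the pixel should go on to blending.
    bool testAndWrite(std::size_t pixel, bool sourceIsOpaque) {
        const std::uint16_t z = sourceIsOpaque ? currentOpaqueZ_ : 0;
        if (z < depth_[pixel]) return false; // something opaque is already there
        depth_[pixel] = z;
        return true;
    }

private:
    std::vector<std::uint16_t> depth_;
    std::uint16_t nextOpaqueZ_ = 65535;
    std::uint16_t currentOpaqueZ_ = 0;
};
```

With a fixed 1.0 for every opaque pixel, a second opaque write to the same location would pass the GREATEREQUAL test. With the decreasing value, it is rejected because its Z is smaller than what the earlier texture stored.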

To accomplish this, you'd set a ZFAIL function for the stencil buffer so that any Z Buffer value that is not zero will set a non-zero value in the stencil buffer. You would draw a texture where the pixels don't matter and where you disable writing of RGBA values and disable writing to the Z Buffer as well. The polygon used for this texture should have Z values of zero. The Z comparison op should be EQUAL. So anything in the Z Buffer that isn't zero will fail which is what we want. STENCILREF should be 0. Stencil comparison op should be EQUAL. STENCILFAIL should be KEEP. STENCILZFAIL should be INVERT. STENCILPASS should be KEEP. After you've updated the stencil buffer, you would erase the Z Buffer ONLY. Then you can draw 65535 more textures for a 16 bit Z Buffer (more for 24 bit Z Buffer). Rinse and repeat.

What this does is set up a stencil buffer that replaces the Z Buffer from the last batch of textures. Here are the steps during normal drawing.

1a. If Stencil is 0, then no opaque pixels have been drawn during the last batch. PASS!
1b. If Stencil is not 0, then the pixel is already opaque. FAIL! Skip this pixel.
2a. If pixel's Z value is greater than or equal to the Z Buffer value, then we blend this pixel. PASS!
2b. If pixel's Z value is less than the Z Buffer value, then it means there's an opaque pixel on the screen already. FAIL!

For the update, we want to merge the Z Buffer into the stencil buffer. So any value that is not 0 in the Z Buffer should set the stencil buffer to 1 (or not zero).

1. If Stencil is not 0, then this pixel is already opaque from the last batch. No need to process it. FAIL!
2. If ZBuffer is not 0, then this pixel is opaque, so we update the stencil buffer and set it to 1 (or not zero by inverting the zero that was there before).
3. If the stencil is 0 (not opaque from last batch) and Z Buffer is 0 (not opaque from this batch), we keep the 0 in the stencil buffer. This is a transparent or translucent pixel.
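
The update steps above can be modeled on the CPU as follows. This is only a sketch of the logic, assuming an 8-bit stencil and a 16-bit Z Buffer stored as plain arrays; `OpacityBuffers`, `mergeDepthIntoStencil`, and `passesStencil` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU model of the two buffers involved in the update pass.
struct OpacityBuffers {
    std::vector<std::uint16_t> depth;   // 0 = not opaque in the current batch
    std::vector<std::uint8_t>  stencil; // 0 = not opaque in earlier batches
};

// Fold the current batch's Z Buffer into the stencil buffer, then clear Z.
// Mirrors: stencil func EQUAL with ref 0, Z func EQUAL with a polygon Z of 0,
// STENCILFAIL = KEEP, STENCILZFAIL = INVERT, STENCILPASS = KEEP.
void mergeDepthIntoStencil(OpacityBuffers& b) {
    for (std::size_t i = 0; i < b.depth.size(); ++i) {
        if (b.stencil[i] != 0) continue;  // stencil test fails: KEEP
        if (b.depth[i] != 0) {            // Z test fails: INVERT (0 becomes 255)
            b.stencil[i] = static_cast<std::uint8_t>(~b.stencil[i]);
        }
        // Z test passes: KEEP, the pixel is still not opaque.
    }
    b.depth.assign(b.depth.size(), 0);    // erase the Z Buffer only
}

// Normal drawing: a pixel is skipped outright when the stencil marks it opaque.
bool passesStencil(const OpacityBuffers& b, std::size_t pixel) {
    return b.stencil[pixel] == 0;
}
```

After the merge, the Z Buffer is free for the next batch of textures while the stencil buffer remembers every pixel that has already become opaque.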

There are many ways to get around the 65K texture limit for 16-bit Z Buffers. One alternative method that uses render target functionality doesn't require a stencil buffer at all and doesn't require a pixel shader for the update. Set your render target as the source texture and use alpha testing to only allow opaque pixels. Make sure to clear the Z Buffer first and disable writing to RGBA. The entire source texture should have a Z value of 1.0, which means that any pixel that passes the alpha test will set the Z Buffer. Obviously, the Z comparison is ALWAYS. For the next batch, just make sure not to use 1.0 for pixel Z values, but instead start with something a little smaller. This method has the disadvantage that you have to find a dummy render target the same size as the original.

Yet another method would be that instead of clearing the Z Buffer, you could use a pixel shader to just output Z values of 1.0 for opaque pixels and 0.0 for the rest. You would also not use an alpha test.

Here's how to set the Z Buffer for normal use. You'll have to set the Z value for the pixel within the pixel shader.

  pD3DDevice->SetRenderState(D3DRS_ZFUNC, D3DCMP_GREATEREQUAL);
  pD3DDevice->SetRenderState(D3DRS_ZWRITEENABLE , TRUE);
  pD3DDevice->SetRenderState(D3DRS_ZENABLE, TRUE);


One last note. If you don't want to use a pixel shader, you can accomplish the same thing using two passes. The first pass writes out any opaque values (alpha test) and uses a different Z comparison op (GREATER) where the entire source texture has a Z value of 1.0. If the test passes, you'd also set the stencil buffer. The alpha test would only allow opaque pixels through for this first pass. For the second pass, alpha test would not allow 100% transparent pixels through. The Z value for the entire texture would be zero. You'd check that the stencil buffer is 0 to allow further processing of this pixel. Then you would compare the Z values (EQUAL) in order for the pixel to be written. You'd disable Z buffer writing and stencil buffer writing for this pass.

This last method basically writes opaque pixels and then writes translucent pixels.

For those who wanted to draw front to back, this should provide some useful insight into what's involved. It can be drastically better than back to front, even if you use the alternate methods and multiple passes. And if you do get the optimal version working, you'll never write to the same pixel more often than needed, which should significantly cut your rendering time, with transparency support as a bonus.
