OpenGL Depth Sorting Resolution

by James W. Walker

Depth sorting is the process by which OpenGL decides whether one triangle is in front of another. If the triangles are too close together, the process may fail. Whether depth sorting succeeds or fails depends on several variables: the distances of the triangles from the camera, the distances of the near and far clipping planes from the camera, and the size of the depth buffer. This article explores the relationship between these variables and depth sorting. See http://www.opengl.org/resources/faq/technical/depthbuffer.htm#dept0045 for a similar discussion.

Review of Coordinate Systems in OpenGL

The viewing transformation transforms object coordinates into eye coordinates, also known as view coordinates. The eye coordinate system has its origin at the camera location (the "eye"), x axis to the right, y axis up, and (to make the coordinate system right-handed) -z axis in the direction that the camera is looking.

The volume visible to a perspective camera is a frustum of a pyramid, bounded by 4 planes on the left, right, top, and bottom, and by the near and far clipping planes. The projection matrix maps this volume to a box in clip coordinates, also called frustum coordinates, in which x, y, and z each range from -w to w. Division of x, y, and z by w, called perspective division, brings us to normalized device coordinates, where x, y, and z range from -1 to 1.

The viewport transformation maps normalized device coordinates to window coordinates. In particular, with the default depth range, the z coordinate maps into the range 0 to 1. These z values are then stored in the depth buffer for depth sorting. Notice that the range 0 to 1 means that a fixed-point depth buffer does not need to store a sign bit or an exponent, unlike the usual case with floating-point numbers.
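To make the storage format concrete, here is a minimal Python sketch of how a window depth in the range 0 to 1 might be stored as an unsigned integer of a given width. The name quantize_depth and the rounding rule are assumptions made for illustration; an actual OpenGL implementation may quantize differently.

```python
def quantize_depth(z_window, bits):
    """Map a window depth in [0, 1] to an unsigned integer code (illustrative model)."""
    max_code = (1 << bits) - 1           # largest code that fits in the given width
    z = min(max(z_window, 0.0), 1.0)     # clamp to the valid depth range
    return round(z * max_code)           # nearest integer code; no sign or exponent needed

# Two window depths that differ by much less than 2**-24 land on the same code,
# so a 24-bit buffer cannot tell them apart.
same = quantize_depth(0.25, 24) == quantize_depth(0.25 + 1e-9, 24)
different = quantize_depth(0.25, 24) != quantize_depth(0.25 + 1e-6, 24)
print(same, different)
```

In this model, adjacent codes are separated by roughly $2^{-k}$ in window depth for a $k$-bit buffer.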

Window Depth as a Function of Eye Depth

The projection matrix defined by gluPerspective is

$$\left[\matrix{\it f\over aspect & 0 & 0 & 0\cr 0 & f & 0 & 0\cr 0 & 0 &\it zFar + zNear \over zNear - zFar &2\;\it zFar\ zNear \over zNear - zFar\cr 0 & 0 & -1 & 0}\right]$$

where $f = \cot(fovy/2)$. Only the third and fourth rows affect depth.

Therefore a point with eye coordinates $(\bullet,\bullet, -d, 1)^T$ is transformed to the point in clip coordinates

$$\left(\bullet,\bullet, {- d (zFar + zNear) + 2\, zFar\ zNear\over zNear - zFar}, d\right)^T$$

The perspective division brings us to normalized device coordinates

$$\left(\bullet,\bullet, {zFar + zNear\over zFar - zNear} - {2\, zFar\ zNear\over d(zFar - zNear)}, 1\right)^T.$$

When we transform normalized device coordinates to window coordinates, the depth coordinate is transformed by the linear function $x \mapsto x/2 + 1/2$, producing

$$\left(\bullet,\bullet, {zFar + zNear\over 2(zFar - zNear)} - {zFar\ zNear\over d(zFar - zNear)} + 0.5, 1\right)^T.$$

Therefore, window depth as a function of eye depth is given by the formula

$$f(d) = {zFar + zNear\over 2(zFar - zNear)} - {zFar\ zNear\over d(zFar - zNear)} + 0.5$$

You can easily check that $f( zNear ) = 0$ and $f( zFar ) = 1$.
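As a check on the derivation, the following Python sketch pushes a point at eye depth $d$ through a perspective matrix of the form documented for gluPerspective (whose x and y scale factor is the cotangent of half of fovy), then through the perspective division and the depth mapping $x \mapsto x/2 + 1/2$, and compares the result with the closed form $f(d)$. The function names and the sample values are illustrative assumptions, and the default depth range of 0 to 1 is assumed.

```python
import math

def perspective_matrix(fovy_deg, aspect, z_near, z_far):
    """Row-major 4x4 perspective matrix in the gluPerspective form."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)  # cotangent of half the field of view
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (z_far + z_near) / (z_near - z_far),
         2.0 * z_far * z_near / (z_near - z_far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def window_depth_via_matrix(d, fovy_deg, aspect, z_near, z_far):
    """Window depth of the eye-space point (0, 0, -d, 1), computed step by step."""
    m = perspective_matrix(fovy_deg, aspect, z_near, z_far)
    eye = [0.0, 0.0, -d, 1.0]
    clip = [sum(m[r][c] * eye[c] for c in range(4)) for r in range(4)]
    ndc_z = clip[2] / clip[3]      # perspective division (clip w equals d)
    return ndc_z / 2.0 + 0.5       # map [-1, 1] to [0, 1]

def window_depth(d, z_near, z_far):
    """Closed form f(d) for window depth as a function of eye depth."""
    return ((z_far + z_near) / (2.0 * (z_far - z_near))
            - z_far * z_near / (d * (z_far - z_near)) + 0.5)

# Sample values chosen only for illustration.
print(window_depth(1.0, 1.0, 100.0), window_depth(100.0, 1.0, 100.0))
print(window_depth_via_matrix(50.0, 60.0, 1.5, 1.0, 100.0), window_depth(50.0, 1.0, 100.0))
```

The two computations should agree up to floating-point rounding, and the endpoint values should come out as 0 and 1 up to the same rounding.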

Estimating Depth Resolution

Consider two triangles at eye coordinate depths $d_1$ and $d_2$, differing by $\Delta\,d$. I want to estimate how large $\Delta\,d$ can be in a situation where the two triangles cannot be reliably sorted by the depth buffer.

If the depth buffer is $k$ bits deep, then the minimum significant difference in window depth is $\Delta\,f = 2^{-k}$. To relate $\Delta\,d$ to $\Delta\,f$, we can approximate the derivative by a difference quotient,

$$f^\prime(d) \approx {\Delta\,f \over \Delta\,d} ,$$

and solve for $\Delta\,d$:

$$\Delta\,d \approx {\Delta\,f \over f^\prime(d)} .$$

The exact formula for the derivative is

$$f^\prime(d) = {zFar\ zNear \over d^2 (zFar - zNear)} ,$$

hence we obtain

$$\Delta\,d \approx 2^{-k} {d^2 (zFar - zNear) \over zFar\ zNear} .\tag1$$

Since this estimate grows as $d^2$, the worst case is at the far plane, $d = zFar$, where it reduces to

$$\max\ \Delta\,d \approx 2^{-k} {zFar(zFar - zNear) \over zNear} .$$

Typically, the ratio of $zFar$ to $zNear$ is large, so we do not lose much by using the cruder estimate

$$\max\ \Delta\,d \approx 2^{-k} {zFar^2 \over zNear} .\tag2$$
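Estimates (1) and (2) are easy to evaluate numerically. The Python sketch below does so; the function names and the sample clipping distances are assumptions chosen for illustration.

```python
def approx_resolution(d, z_near, z_far, bits):
    """Estimate (1): approximate smallest resolvable eye-depth difference at depth d."""
    return 2.0 ** -bits * d * d * (z_far - z_near) / (z_far * z_near)

def approx_worst_case(z_near, z_far, bits):
    """Estimate (2): cruder worst-case figure, taken at the far plane."""
    return 2.0 ** -bits * z_far * z_far / z_near

# Effect of the near plane on the worst case, for a 24-bit buffer and zFar = 1000.
for z_near in (0.01, 0.1, 1.0):
    print(z_near, approx_worst_case(z_near, 1000.0, 24))
```

Because estimate (2) is inversely proportional to $zNear$, each tenfold reduction of the near distance in this loop makes the worst-case figure ten times larger.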

Exact Calculation of Depth Resolution

We want to find the largest value of $\Delta d$ such that

$$f(d) - f(d - \Delta d) = \Delta f \tag3$$

and such that $d$ and $d - \Delta d$ are restricted to the interval from $zNear$ to $zFar$.

We can express the function as

$$f( x ) = a - b/x$$

where $a = zFar/(zFar - zNear)$ (its value will not matter, since it cancels below) and

$$b = {zFar\ zNear \over zFar - zNear}.$$

Now let us solve equation (3) for $\Delta d$.

$$a - b/d - (a - {b\over d - \Delta d}) = \Delta f$$

Cancel the $a$ terms.

$${b\over d - \Delta d} - {b\over d} = \Delta f$$

Divide through by $b$.

$${1\over d - \Delta d} - {1\over d} = {\Delta f \over b}$$

Solve for $d - \Delta d$.

$${1\over d - \Delta d} = {1\over d} + {\Delta f \over b} = {b + d \Delta f \over b d}$$

$$d - \Delta d = {b d \over b + d \Delta f}$$

And finally we can obtain $\Delta d$.

$$\Delta d = d - {b d \over b + d \Delta f} = {d b + d^2 \Delta f - b d \over b + d \Delta f} = {d^2 \Delta f \over b + d \Delta f}\tag4$$

If we write this as

$$\Delta d = {d \Delta f\over b/d + \Delta f}$$

then it is clear that the numerator is an increasing function of $d$ and the denominator is a positive, decreasing function of $d$, so $\Delta d$ increases with $d$ and its maximum occurs at the right-hand end of the interval, $d = zFar$. If we substitute in $d = zFar$, $\Delta f = 2^{-k}$, and the defined value of $b$, we have:

$$\max\ \Delta d = {zFar\ 2^{-k}\over {zNear\over zFar - zNear}+2^{-k}}$$

or equivalently

$$\max\ \Delta d = {zFar (zFar - zNear) 2^{-k}\over zNear + (zFar - zNear) 2^{-k}}.\tag5$$

If we assume that the ratio $zFar/zNear$ is much larger than 1 but much smaller than $2^k$, then $zFar - zNear \approx zFar$ in the numerator and the denominator is approximately $zNear$, so estimate (2) is a reasonable approximation to equation (5).
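The exact formulas can be checked numerically. The Python sketch below evaluates equations (4) and (5), confirms that the $\Delta d$ from equation (4) really produces a window-depth difference of $\Delta f$, and compares the exact worst case with estimate (2). The function names and sample values are assumptions made for illustration.

```python
def window_depth(d, z_near, z_far):
    """Window depth f(d) as a function of eye depth d."""
    return ((z_far + z_near) / (2.0 * (z_far - z_near))
            - z_far * z_near / (d * (z_far - z_near)) + 0.5)

def exact_resolution(d, z_near, z_far, bits):
    """Equation (4): eye-depth step whose window-depth difference is exactly 2**-bits."""
    delta_f = 2.0 ** -bits
    b = z_far * z_near / (z_far - z_near)
    return d * d * delta_f / (b + d * delta_f)

def exact_worst_case(z_near, z_far, bits):
    """Equation (5): equation (4) evaluated at d = zFar."""
    delta_f = 2.0 ** -bits
    span = z_far - z_near
    return z_far * span * delta_f / (z_near + span * delta_f)

z_near, z_far, bits, d = 1.0, 100.0, 24, 50.0   # illustrative values
step = exact_resolution(d, z_near, z_far, bits)
# This difference should match 2**-bits up to floating-point rounding.
print(window_depth(d, z_near, z_far) - window_depth(d - step, z_near, z_far), 2.0 ** -bits)
# Exact worst case next to the crude estimate (2).
print(exact_worst_case(z_near, z_far, bits), 2.0 ** -bits * z_far ** 2 / z_near)
```

With these sample values the ratio $zFar/zNear$ is only 100, so estimate (2) overshoots the exact worst case by about one percent.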

Infinite Far Plane

When using the stencil shadow volume technique to render shadows, it is useful to set the far plane at infinity. In that case, $\max\ \Delta d$ clearly becomes infinite, but let's see what happens to depth resolution at a fixed depth. As $zFar$ increases toward infinity, our sub-formula $b$ approaches $zNear$. Therefore, equation (4) approaches a finite limiting value:

$$\Delta d_\infty = {d^2 \Delta f \over zNear + d \Delta f}$$

Now let's consider a concrete example. Suppose we have a 24-bit depth buffer, so $\Delta f = 2^{-24}$. If $zNear = 1$, $zFar = 100$, and $d = 50$, then equation (4) gives $\Delta d \approx 0.0001475$. If we change $zFar$ to infinity, then $\Delta d \approx 0.0001490$, a rather small change. If instead we leave $zFar$ at 100 and decrease $zNear$ to 0.5, then at $d = 50$ we find that $\Delta d$ increases to about 0.0002965, roughly double. I would suggest that this is typical in computer graphics: moving the far plane farther away has little effect on depth resolution at a given depth, while the near plane has much more impact.
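The three figures in this example can be reproduced with a few lines of Python. This is a sketch under the assumptions of the article (a 24-bit buffer and $\Delta f = 2^{-24}$); the function name depth_resolution is hypothetical, and an infinite far plane is handled by using the limiting value $b = zNear$.

```python
import math

def depth_resolution(d, z_near, z_far, bits=24):
    """Equation (4), with the limit b = zNear used when the far plane is at infinity."""
    delta_f = 2.0 ** -bits
    if math.isinf(z_far):
        b = z_near
    else:
        b = z_far * z_near / (z_far - z_near)
    return d * d * delta_f / (b + d * delta_f)

# The three cases discussed in the text, all at eye depth d = 50.
for z_near, z_far in [(1.0, 100.0), (1.0, math.inf), (0.5, 100.0)]:
    print(f"zNear={z_near}, zFar={z_far}: {depth_resolution(50.0, z_near, z_far):.7f}")
```

Moving the far plane from 100 to infinity changes the result by about one percent, while halving the near distance roughly doubles it.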