Shadow Generation Hardware

Tomas Akenine-Möller

The shadow volume algorithm is a popular technique for real-time shadow generation using graphics hardware. Its major disadvantage is that it is inherently fillrate-limited, as the performance is inversely proportional to the area of the projected shadow volumes. We present a new algorithm that significantly reduces the shadow volume rasterization work. With our algorithm, the amount of per-pixel processing becomes proportional to the screen-space length of the visible shadow boundary instead of the projected area. The first stage of the algorithm finds 8x8 pixel tiles whose 3D bounding boxes are either completely inside or completely outside the shadow volume. The second stage then performs per-pixel computations only for the remaining tiles, which potentially contain a shadow boundary. We outline a two-pass implementation, and also describe an efficient single-pass hardware architecture in which the two stages are separated using a delay stream. The only modification required in applications is a new pair of calls for marking the beginning and end of a shadow volume. In our test scenes, the algorithm processes up to 11.5 times fewer pixels than current state-of-the-art methods, while reducing the external video memory bandwidth by a factor of up to 17.5.
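The two-stage idea can be illustrated with a small software sketch. It is not the hardware design described here: as a simplifying assumption, the shadow volume is modeled as a single axis-aligned box in screen space (x, y, depth), and the names `classify_tile` and `shadow_mask` are hypothetical. The sketch only shows how classifying 8x8 tiles first confines per-pixel tests to the tiles that may contain a shadow boundary.

```python
TILE = 8  # tile edge length in pixels, as in the text


def classify_tile(tile_box, volume_box):
    """Stage 1: classify a tile's 3D bounding box against the volume.

    Boxes are ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    The axis-aligned volume is a toy stand-in for a real shadow volume.
    """
    (tlo, thi), (vlo, vhi) = tile_box, volume_box
    # Tile box fully contained in the volume: every pixel is in shadow.
    if all(vlo[i] <= tlo[i] and thi[i] <= vhi[i] for i in range(3)):
        return "inside"
    # Boxes separated along some axis: no pixel is in shadow.
    if any(thi[i] < vlo[i] or tlo[i] > vhi[i] for i in range(3)):
        return "outside"
    return "boundary"


def shadow_mask(depth, volume_box):
    """Return (mask, per_pixel_tests) for a depth buffer depth[y][x].

    Width and height are assumed to be multiples of TILE.
    """
    h, w = len(depth), len(depth[0])
    mask = [[False] * w for _ in range(h)]
    per_pixel_tests = 0
    vlo, vhi = volume_box
    for ty in range(0, h, TILE):
        for tx in range(0, w, TILE):
            ys = range(ty, ty + TILE)
            xs = range(tx, tx + TILE)
            zs = [depth[y][x] for y in ys for x in xs]
            # 3D bounding box: screen extent plus the tile's depth range.
            tile_box = ((tx, ty, min(zs)), (tx + TILE, ty + TILE, max(zs)))
            kind = classify_tile(tile_box, volume_box)
            if kind == "outside":
                continue
            for y in ys:
                for x in xs:
                    if kind == "inside":
                        # Whole tile resolved at once; no per-pixel test.
                        mask[y][x] = True
                    else:
                        # Stage 2: per-pixel test, boundary tiles only.
                        per_pixel_tests += 1
                        p = (x + 0.5, y + 0.5, depth[y][x])
                        mask[y][x] = all(
                            vlo[i] <= p[i] <= vhi[i] for i in range(3)
                        )
    return mask, per_pixel_tests


if __name__ == "__main__":
    depth = [[0.5] * 32 for _ in range(32)]       # flat receiver
    volume = ((4, 4, 0.0), (28, 28, 1.0))          # toy shadow volume
    mask, tests = shadow_mask(depth, volume)
    print(tests, "per-pixel tests for", 32 * 32, "pixels")
```

In this toy scene the four interior tiles are classified as fully inside, so only the twelve tiles crossed by the volume's edge need per-pixel work (768 tests for 1024 pixels). The saving grows with resolution, since the number of boundary tiles scales with the boundary length rather than with the shadowed area.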