Tuesday 4 November 2014

Depth Buffer Fusion for Real Time Effects

TL;DR - it's possible to use the Z-buffer to incrementally build a signed distance field representation of a 3D scene as the camera moves around, which you can then trace rays through for approximate collision detection, reflections, ambient occlusion, GI etc.

The idea for this came from Bart Wronski's nice article "The Future of Screenspace Reflections". At the end, he mentions the idea of somehow caching geometric information between frames to improve the screen space ray marching.

The obvious surface representation to use seemed to be signed distance fields (SDFs), which have many nice properties, as discussed at length elsewhere. They can be traversed quickly using so-called "Sphere Tracing", which is basically distance-enhanced ray marching.
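For reference, the core of a sphere tracer is only a few lines. Here's a minimal HLSL sketch (the texture, sampler and volume parameters are placeholder names I've made up, not anything from a real project):

    // Minimal sphere-tracing sketch. _SDF is assumed to store distances in world units.
    Texture3D<float> _SDF;
    SamplerState sampler_SDF;
    float3 _VolumeOrigin;   // world-space corner of the volume (assumed parameter)
    float3 _VolumeSize;     // world-space extent of the volume (assumed parameter)

    float SampleDistance(float3 worldPos)
    {
        float3 uvw = (worldPos - _VolumeOrigin) / _VolumeSize;
        return _SDF.SampleLevel(sampler_SDF, uvw, 0);
    }

    // Step along the ray by the sampled distance each iteration - it's always
    // safe to move that far without crossing a surface.
    bool SphereTrace(float3 origin, float3 dir, float maxDist, out float hitT)
    {
        float t = 0.0;
        for (int i = 0; i < 64 && t < maxDist; i++)
        {
            float d = SampleDistance(origin + dir * t);
            if (d < 0.01) { hitT = t; return true; }   // close enough to the surface
            t += d;
        }
        hitT = maxDist;
        return false;
    }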

This reminded me of KinectFusion, a clever system that fuses together noisy depth images from the Kinect sensor to incrementally build a smooth 3D model of a scene as you move the camera around. It uses a so-called "truncated" signed distance field (stored in a 3D texture) to represent the scene. This technique was actually first described in a much older paper from 1996 about off-line merging of depth images from 3D scanners: "A Volumetric Method for Building Complex Models from Range Images".

In real-time graphics, of course, we already have a high-quality depth image, and we know the exact camera position, so something like this ought to be a lot easier for us, right?

It's pretty simple to implement - you create a 3D SDF texture, initialise it to the maximum distance, and then each frame execute a compute shader with one thread per voxel in the SDF (there's a rough HLSL sketch after the list). For each voxel:
  • calculate the world space position of the voxel
  • project this position into screen space using the camera transform
  • read the depth from the Z buffer at this position
  • calculate the distance from the voxel to the depth buffer sample
  • convert this to a "truncated" distance, or just do some math to convert the surface into a thin 3D slab (this is what I do currently)
  • potentially do some kind of clever averaging with the existing distance value
  • profit.
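Roughly, the per-voxel update could look something like the compute shader below. This is an untested sketch to show the shape of the idea - the resource names, matrix conventions and the blend at the end are all my own assumptions, not the actual Unity prototype code:

    // Per-voxel fusion pass (sketch). Assumes the depth buffer has already been
    // linearised to eye-space distance, and that the volume covers a fixed
    // world-space box.
    Texture2D<float>   _SceneDepth;
    RWTexture3D<float> _SDF;

    float4x4 _ViewProj;        // camera view-projection matrix
    float3   _CameraPos;
    float3   _CameraForward;   // normalised view direction
    float3   _VolumeOrigin;    // world-space corner of the volume
    float    _VoxelSize;       // world-space size of one voxel
    float    _Truncation;      // half-thickness of the "slab" around surfaces
    float2   _DepthBufferSize;

    [numthreads(8, 8, 8)]
    void FuseDepth(uint3 id : SV_DispatchThreadID)
    {
        // World-space position of this voxel's centre.
        float3 worldPos = _VolumeOrigin + (float3(id) + 0.5) * _VoxelSize;

        // Project into the camera; skip voxels outside the view frustum.
        float4 clip = mul(_ViewProj, float4(worldPos, 1.0));
        if (clip.w <= 0.0) return;
        float2 ndc = clip.xy / clip.w;
        if (any(abs(ndc) > 1.0)) return;

        // Read the depth buffer at that pixel (D3D-style y flip).
        float2 uv = float2(ndc.x * 0.5 + 0.5, 0.5 - ndc.y * 0.5);
        float sceneDepth = _SceneDepth[uint2(uv * _DepthBufferSize)];

        // Signed distance along the view direction: positive in front of the
        // surface, negative behind it.
        float voxelDepth = dot(worldPos - _CameraPos, _CameraForward);
        float signedDist = sceneDepth - voxelDepth;

        // Voxels far behind the surface are occluded - we learn nothing about them.
        if (signedDist < -_Truncation) return;

        // Truncate to a thin slab around the surface.
        float d = min(signedDist, _Truncation);

        // Crude stand-in for the "clever averaging" step.
        _SDF[id] = lerp(_SDF[id], d, 0.5);
    }

The early-out for voxels far behind the surface is what stops occluded space from being stamped over, which is essentially the job the truncation does in KinectFusion.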
It turns out this works pretty well.

After a single frame, it will give similar results to any other screen space technique, but as you walk around the scene, the details and occluded parts of the SDF get filled in, and they are still maintained outside the view (at least to a certain extent). 


The video shows a simple prototype done in Unity DX11. The grey geometry is the original scene, the coloured image is the ray-marched SDF (visualized as the normal calculated from the SDF gradient). At the beginning, it shows a few individual depth frames being added to the SDF. Then it switches to continuous updates as the player moves through the arch. Then continuous updates are switched off so you can see the SDF quality as the player walks back through the arch.

You can move the volume to follow the player as they move around the scene, scrolling the contents of the 3D texture. You probably don't want to do this every frame, because the repeated filtering of the volume will cause the details to blur out, but you can do it in blocks, every time the player has moved a certain distance.
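One simple way to do the block scroll (again just a sketch under the same assumptions, not the prototype code) is a second compute pass that copies the old volume into a new one with an integer voxel offset, marking anything that scrolls in from outside as unknown:

    // Block-scroll sketch: shift the volume by a whole number of voxels when the
    // player has moved far enough, instead of resampling it every frame.
    Texture3D<float>   _SrcSDF;
    RWTexture3D<float> _DstSDF;
    int3  _ShiftVoxels;    // how far the volume origin moved, in voxels
    int3  _VolumeRes;      // e.g. 128 x 128 x 128
    float _MaxDistance;    // value used for "unknown" voxels

    [numthreads(8, 8, 8)]
    void ScrollVolume(uint3 id : SV_DispatchThreadID)
    {
        int3 src = int3(id) + _ShiftVoxels;
        bool inside = all(src >= 0) && all(src < _VolumeRes);
        // Voxels that scroll in from outside the old volume start as "unknown".
        _DstSDF[id] = inside ? _SrcSDF[uint3(src)] : _MaxDistance;
    }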

For reflections, this technique would only show reflections of nearby surfaces that you have looked at previously, which is kind of wacky. You could initialize the SDF around the player by rendering in the 6 cube directions to avoid this problem.

People have used the Z-buffer for approximate particle collisions (see, for example, Halo Reach Effects Tech). The problem with this is that the particles only collide against visible surfaces, so if you look away and then turn back, the particles will have fallen through the floor. With this technique, the particles will still be there. Maybe.
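A particle collision test against the fused SDF can be very simple - something like the sketch below, which reuses the assumed SampleDistance() helper from the sphere tracing example above (the push-out and bounce response are just illustrative):

    // Particle-vs-SDF collision sketch. Central differences on the SDF give an
    // approximate surface normal to push the particle out along.
    float3 SdfGradient(float3 p)
    {
        const float e = 0.05;   // assumed to be around one voxel in size
        return normalize(float3(
            SampleDistance(p + float3(e, 0, 0)) - SampleDistance(p - float3(e, 0, 0)),
            SampleDistance(p + float3(0, e, 0)) - SampleDistance(p - float3(0, e, 0)),
            SampleDistance(p + float3(0, 0, e)) - SampleDistance(p - float3(0, 0, e))));
    }

    void CollideParticle(inout float3 pos, inout float3 vel, float radius, float bounce)
    {
        float d = SampleDistance(pos);
        if (d < radius)
        {
            float3 n = SdfGradient(pos);
            pos += n * (radius - d);                // push out of the surface
            if (dot(vel, n) < 0.0)
                vel = reflect(vel, n) * bounce;     // crude bounce response
        }
    }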

Problems
  • large 3D textures use quite a lot of memory, although we only need a single channel fp16 texture here, so it's not too bad. 128*128*128*2 = 4MB.
  • the limited resolution of the SDF means the scene isn't represented exactly. There are artifacts (e.g. see edge of tunnel arch in the video).
  • the mismatch between the resolution of the volume and the depth buffer means the depth buffer is usually under-sampled. Could generate mipmaps to help with this?
Ideas for improvements
  • Surfaces that are facing away from the camera (which cover a large depth range) cause problems / holes - weight these lower?
  • Use sparse (tiled) 3D textures (DX12) to avoid storing empty regions
  • Capture more than one depth value per pixel - second depth layer (possibly with a minimum separation like "Fast Global Illumination Approximations on Deep G-Buffers").
  • Use the SDF for improving/accelerating screen space ray marching
  • Store shaded color in separate 3D texture and use for blurry reflections.
Anyway, I hadn't seen this idea described anywhere before, so I thought it was worth recording here before I forgot it! I'd be interested in hearing what other people think.

Tuesday 22 October 2013


Hi, it's been a while! Let's try out ShaderToy.com's new embedding feature:

Wednesday 30 May 2012

Rendering Galaxies


I haven't posted here much recently, so here are a few details on a fun project I worked on recently.

I spent the three weeks before NVIDIA's GPU Technology Conference working on the graphics for the galaxy simulation demo shown above, which was presented during the keynote. You can watch the video here. The (ambitious) goal was to achieve something that looked like these classic Hubble telescope images.

Much of the look of these images is due to bright star light being scattered by the surrounding dust, so we spent a lot of time trying to get the dust to look right. For efficiency, the dust particles are simulated separately - they are affected by the stars' gravity, but don't interact with each other. The colour of the stars is mainly due to variance in temperature / age, but we took some artistic license here.

The rendering was done in OpenGL, using a variant of my favourite technique from the old CUDA smoke particles demo (originally due to Joe Kniss). The particles are sorted from back-to-front and rendered in slices, first to an off-screen light buffer, and then to the screen (sampling the indirect lighting from the light buffer). The light buffer is blurred after each slice to simulate scattering. Obviously this only simulates light scattering towards the camera, but this isn't a bad approximation in practice. The dust particles are drawn larger and less opaque than the stars.

I also added a lot of post-process glow (which makes everything look better), and a cheesy star filter (see below), which they made me remove in the end!




Anyway, most of the credit for the demo should go to Jeroen Bédorf and Evghenii Gaburov, who wrote the Bonsai simulation code, which you can read about here. Props also to Mark Harris (who did a lot of the optimization), and Stephen Jones, who did the CUDA dynamic parallelism implementation (which is pretty cool, by the way).

The biggest regret I have is not doing proper anti-aliasing for the stars that were smaller than a pixel. On a 60 x 20 foot screen, each pixel was about the size of a sugar cube and you could see them crawling from pixel to pixel!