Ray Tracing with Vulkan

Introduction

For the specialization course at The Game Assembly I wanted to work with a subject at the cutting edge of real-time graphics: ray tracing.

My goal was to have a renderer utilizing physically based shading for opaque objects and driven by ray tracing in order to get accurate reflections and shadows. I chose Vulkan over DirectX 12 solely because I had already built a barebones yet functional Vulkan engine (featuring simple 3D rasterization), which gave me the much-needed framework for exploring more complex techniques.

VK_NV_ray_tracing

Vulkan doesn’t support ray tracing by default; instead (prior to 17 March 2020) it must be extended with the VK_NV_ray_tracing extension from Nvidia.

As a starting point I followed the tutorial made available by Nvidia on their GitHub page. This tutorial uses the Vulkan C++ bindings, which I am not familiar with. Although it provided some good guidelines and directions, I found the helper function specifications more helpful, as they detail more of the raw C functions (despite some inconsistencies). An honorable mention goes to the official Vulkan specification.

Descriptors

One of the things that sets ray tracing apart from raster graphics is the necessity to have every object in a scene directly accessible on the GPU. For my implementation this meant every model’s index and vertex buffers as well as their textures. This is done using descriptors, Vulkan’s way of connecting GPU-based objects to its shader programs.

Acceleration Structures

To reduce the number of ray-against-geometry intersection tests it has to perform, the ray tracing pipeline uses an object called an Acceleration Structure.

An acceleration structure can either hold the optimized data for a given geometry or describe the entirety of a scene. To fulfil these purposes, the structure takes one of two forms: the Top Level and the Bottom Level Acceleration Structure (TLAS and BLAS).

A BLAS describes one or several sets of geometry, whilst the TLAS describes multiple instances of one or more BLASes.

In my engine I have one BLAS for every mesh and one TLAS describing the scene for a given frame. I create and build the BLAS when loading the mesh.

For the TLAS I calculate the maximum memory it might need, based on the maximum number of instances that may be in a scene. Then, during each frame, I rebuild the structure.

An instance is described by a 64-byte struct.
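
As a minimal sketch of such an instance struct, the layout below follows the 64-byte instance format that VK_NV_ray_tracing expects in the instance buffer. The names `GeometryInstance` and `makeInstance` are hypothetical and not taken from my engine; the identity transform and the mask of 0xFF are likewise assumptions made for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical name; mirrors the 64-byte instance layout read by the TLAS build.
struct GeometryInstance {
    float    transform[12];               // row-major 3x4 object-to-world matrix
    uint32_t instanceCustomIndex : 24;    // developer-defined id (gl_InstanceCustomIndexNV in shaders)
    uint32_t mask : 8;                    // visibility mask, ANDed with the ray's cull mask
    uint32_t instanceOffset : 24;         // hit group offset into the shader binding table
    uint32_t flags : 8;                   // VkGeometryInstanceFlagBitsNV
    uint64_t accelerationStructureHandle; // BLAS handle from vkGetAccelerationStructureHandleNV
};
static_assert(sizeof(GeometryInstance) == 64, "instance must be exactly 64 bytes");

// Hypothetical helper: one untransformed instance of a BLAS, tagged with an object id.
GeometryInstance makeInstance(uint32_t objectId, uint64_t blasHandle) {
    GeometryInstance inst{};
    const float identity[12] = {1, 0, 0, 0,
                                0, 1, 0, 0,
                                0, 0, 1, 0};
    std::memcpy(inst.transform, identity, sizeof(identity));
    inst.instanceCustomIndex = objectId & 0xFFFFFFu; // only 24 bits are available
    inst.mask = 0xFF;                                // assumed: visible to every ray
    inst.instanceOffset = 0;                         // assumed: a single hit group
    inst.flags = 0;
    inst.accelerationStructureHandle = blasHandle;
    return inst;
}
```

The `static_assert` catches accidental padding at compile time, which matters because the driver reads the instance buffer as tightly packed 64-byte records.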

The instance’s custom id is up to the developer; in my case I used it as an object id with which I indexed the arrays of textures and mesh data.

Tracing Rays

Tracing rays involves several kinds of shaders, which are called conditionally during the trace. The entry point is the Ray Generation shader, whose purpose is to define the initial rays.

Depending on whether a ray hits or not, a Hit or Miss shader is called. There are multiple types of hit shaders, but for my implementation I only use the Closest Hit shader.

The Closest Hit and Miss shaders can in turn generate their own rays; this can be used to create recursive reflections or to trace towards a light and test for shadowing.

The work itself is done for every pixel in a specified viewport.

Ray Payloads

Payloads are the inputs and outputs passed between shaders. They can be used both to pass values forward and to return them. They are also completely defined by the developer, allowing for a lot of control.

My payload first holds a color output, which is the shaded value of the hit surface. It then stores some attributes of that surface: position, normal and roughness. The ray generation shader uses these in case new rays need to be generated. I also store whether a hit has occurred, because the same payload is shared across a single trace, i.e. by both the hit and the miss shader.

Shaders

The Ray Generation shader starts by determining the current screen coordinates and camera position. Then it un-projects the screen coordinates at the far limit of the view frustum, in order to determine the ray’s direction.
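
The un-projection step can be sketched on the CPU as below. This is an illustration under stated assumptions, not the shader itself: `Vec3`, `Mat4` and `rayDirection` are hypothetical names, the matrix is assumed to be the inverse of the combined view-projection matrix in column-major order, and depth 1 is assumed to be the far plane (no reversed depth).

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Column-major 4x4 matrix, as in GLSL: element (row, col) is m[col * 4 + row].
struct Mat4 { float m[16]; };

// Hypothetical helper: direction of the primary ray through a pixel.
// invViewProj is assumed to be inverse(projection * view).
Vec3 rayDirection(const Mat4& invViewProj, Vec3 cameraPos,
                  float pixelX, float pixelY, float width, float height) {
    // Pixel centre -> normalized device coordinates in [-1, 1].
    const float ndcX = (pixelX + 0.5f) / width * 2.0f - 1.0f;
    const float ndcY = (pixelY + 0.5f) / height * 2.0f - 1.0f;

    // Point on the far plane (assumed to be depth 1) in clip space.
    const float clip[4] = {ndcX, ndcY, 1.0f, 1.0f};

    // world = invViewProj * clip
    float world[4];
    for (int row = 0; row < 4; ++row) {
        world[row] = 0.0f;
        for (int col = 0; col < 4; ++col) {
            world[row] += invViewProj.m[col * 4 + row] * clip[col];
        }
    }

    // Perspective divide, then the normalized vector from the camera to that point.
    const Vec3 d{world[0] / world[3] - cameraPos.x,
                 world[1] / world[3] - cameraPos.y,
                 world[2] / world[3] - cameraPos.z};
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return {d.x / len, d.y / len, d.z / len};
}
```

With an identity matrix and the camera at the origin, the centre pixel yields the direction (0, 0, 1), which is a quick sanity check before plugging in a real camera.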

Rays are then shot in a loop. If something is hit and it is reflective (determined in this case by a roughness of 0), the loop continues with the next iteration, shooting a ray from the hit object’s surface.

This continues for a maximum of 8 iterations; for more accurate reflections we could increase this limit.
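
The loop can be sketched in C++ as follows. This is a CPU-side illustration rather than the actual shader: `TraceFn` stands in for the trace call, the `Payload` layout is assumed from the description in the previous section, and keeping only the color of the last surface hit is a simplifying assumption about how the result is combined.

```cpp
#include <functional>

struct Vec3 { float x, y, z; };

// Assumed payload layout, following the description in the text.
struct Payload {
    Vec3  color;     // shaded value of the hit surface (or the miss color)
    Vec3  position;  // hit position
    Vec3  normal;    // unit surface normal at the hit
    float roughness; // 0 means perfectly reflective
    bool  hit;       // false when the miss shader ran
};

// Hypothetical stand-in for the shader's trace call.
using TraceFn = std::function<Payload(Vec3 origin, Vec3 direction)>;

// Mirror direction d around unit normal n (same formula as GLSL's reflect).
Vec3 reflect(Vec3 d, Vec3 n) {
    const float k = 2.0f * (d.x * n.x + d.y * n.y + d.z * n.z);
    return {d.x - k * n.x, d.y - k * n.y, d.z - k * n.z};
}

constexpr int kMaxBounces = 8;

// Follows reflections until a miss, a non-reflective hit, or the bounce limit.
// raysShot (optional) receives the number of rays that were traced.
Vec3 shootRays(const TraceFn& trace, Vec3 origin, Vec3 direction,
               int* raysShot = nullptr) {
    Vec3 color{0.0f, 0.0f, 0.0f};
    int count = 0;
    for (int i = 0; i < kMaxBounces; ++i) {
        const Payload p = trace(origin, direction);
        ++count;
        color = p.color; // assumption: the last surface reached decides the color
        if (!p.hit || p.roughness != 0.0f) {
            break; // nothing to reflect, so the chain ends here
        }
        // A real shader would also use a small minimum ray distance
        // to avoid re-hitting the surface it starts from.
        origin = p.position;
        direction = reflect(direction, p.normal);
    }
    if (raysShot != nullptr) {
        *raysShot = count;
    }
    return color;
}
```

Using a loop in the ray generation shader, rather than tracing from inside the hit shader, keeps the pipeline’s recursion depth low while still allowing several bounces.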

Once the loop is done, the shaded value is stored in an intermediate texture, since the ray tracing pipeline runs outside of any render pass and therefore has no access to a framebuffer.

Inside the Closest Hit shader, the surface data at the hit point is interpolated from the triangle’s vertices using barycentric coordinates.
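
As a small illustration of that interpolation, the sketch below blends one vertex attribute across a triangle. The names are hypothetical; it assumes the hit attributes deliver two barycentric weights (u, v) belonging to the second and third vertex, with the first vertex receiving the remainder.

```cpp
struct Vec3 { float x, y, z; };

// Interpolates a per-vertex attribute (position, normal, ...) at a hit point.
// u and v weight vertices b and c; vertex a gets the remaining 1 - u - v.
Vec3 interpolate(Vec3 a, Vec3 b, Vec3 c, float u, float v) {
    const float w = 1.0f - u - v;
    return {a.x * w + b.x * u + c.x * v,
            a.y * w + b.y * u + c.y * v,
            a.z * w + b.z * u + c.z * v};
}
```

The same function serves positions, normals and texture coordinates alike; interpolated normals should be re-normalized before they are used for shading.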

Then a ray is shot from the surface towards the directional light to check for shadowing.

We’re not interested in any shading that a hit shader would do, so we skip it and assume we are shadowed, allowing the miss shader to mark this as false.

Note that this ray uses payload location 1: a simple bool (shadow) serves as its payload rather than the payload struct, since we are not interested in the hit surface.

Then simple Lambert shading is performed.
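
A minimal sketch of Lambert shading combined with the shadow test is shown below. The function and parameter names are hypothetical, and returning pure black for shadowed points (no ambient term) is an assumption made for brevity.

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// normal and toLight are assumed to be unit length;
// toLight points from the surface towards the directional light.
Vec3 lambert(Vec3 albedo, Vec3 normal, Vec3 toLight, Vec3 lightColor, bool shadowed) {
    if (shadowed) {
        return {0.0f, 0.0f, 0.0f}; // assumption: no ambient contribution
    }
    // Surfaces facing away from the light receive nothing.
    const float nDotL = std::max(dot(normal, toLight), 0.0f);
    return {albedo.x * lightColor.x * nDotL,
            albedo.y * lightColor.y * nDotL,
            albedo.z * lightColor.z * nDotL};
}
```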

The miss shader is the simplest, outputting only a gradient.
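
One common way to produce such a gradient is to blend two colors by the vertical component of the ray direction, sketched below. The two colors, the choice of the y axis as “up” and the name `missColor` are all assumptions for illustration, not the values used in my shader.

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

// Blends from a bottom color to a top color based on how far the
// (unit length) ray direction points upwards. Colors are placeholders.
Vec3 missColor(Vec3 rayDir) {
    const Vec3 bottom{1.0f, 1.0f, 1.0f};
    const Vec3 top{0.5f, 0.7f, 1.0f};
    // Map y from [-1, 1] to [0, 1] and clamp against slight overshoot.
    const float t = std::min(std::max(0.5f * (rayDir.y + 1.0f), 0.0f), 1.0f);
    return {bottom.x * (1.0f - t) + top.x * t,
            bottom.y * (1.0f - t) + top.y * t,
            bottom.z * (1.0f - t) + top.z * t};
}
```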

Result

The scene shows Lambert-shaded models with shadows cast from a directional light as well as reflections.

Conclusion

Ray tracing is a very powerful technique, and seeing it become more and more feasible for real-time applications is exciting. I enjoy the simplicity of the pipeline provided by Nvidia, which needs very little setup. The technique is also something I prefer over the usual raster pipeline; it feels more natural to have access to all parts of the scene.

Core things that are able to bring a rendered image to life, such as shadows and reflections, are much more straightforward.

Not having to render images from multiple perspectives to get reflections and shadows, as in a raster pipeline, feels incredibly freeing. And although it is a heavier technique, it puts the weight where it really matters: the final pixels that are presented on screen.

I would have liked to spend more time on PBR but ran out of time. I would also love to explore ways of integrating ray tracing with rasterization, in order to implement skinned animations or to make an application’s performance more consumer-viable.