Journey into rust #2: Compute Shaders
I realized after my first post of this series that it’s not just a journey into Rust but also into OpenGL. I’ve used other graphics APIs before but never actually got my hands dirty with OpenGL. Someone on the Rust user forums (they are awesome, go check them out!) suggested using compute shaders instead. At the time I had never used compute shaders for a project, so I decided to take some time to refactor the program to use one. This post is a follow-up on that remark and explores the possibilities of using Rust together with OpenGL to run compute shaders.
Below is a small video of the end result using compute shaders. There are colors, and the cells have a lifetime!
What is a compute shader?
Remember that graphics pipeline I briefly mentioned in the previous post? A compute shader is a shader stage, but one that is separate from all the other stages and needs to be run explicitly with a gl function call. It can be used to compute any information you want. It can even do rendering, but its main use is to compute data that later gets used by the rendering.
More information can be found at compute shader.
The reason a compute shader can be useful is that it’s independent of any drawing commands. With the fragment shader approach we are limited to using an image/texture to sample from and render to. A compute shader takes arbitrary inputs and can also have arbitrary outputs.
A basic compute shader would look like this:
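A minimal sketch of such a shader, written as a Rust string constant in the form it could be handed to glShaderSource. The constant name, the image name img_output and the rgba8 format are assumptions made for illustration, not the original listing.

```rust
/// GLSL source for a compute shader that writes opaque black to every texel
/// of the bound image. `BLACK_TEXTURE_COMPUTE_SRC` and `img_output` are
/// hypothetical names chosen for this sketch.
pub const BLACK_TEXTURE_COMPUTE_SRC: &str = r#"#version 430
layout(local_size_x = 1, local_size_y = 1) in;
layout(rgba8, binding = 0) uniform image2D img_output;

void main() {
    // One invocation per texel: the global ID doubles as the texel coordinate.
    ivec2 texel = ivec2(gl_GlobalInvocationID.xy);
    imageStore(img_output, texel, vec4(0.0, 0.0, 0.0, 1.0));
}
"#;
```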
This is just an example and won’t do anything meaningful except for writing out a black texture.
Notice the second line: layout(local_size_x = 1, local_size_y = 1) in;
This is the part where I would explain the abstract “space” compute shaders run in, but instead I’m going to be lazy and link to a wiki article for you to read.
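As a rough CPU-side illustration of that space: every invocation gets a global ID computed as work group ID times local size plus local ID, which is what GLSL exposes as gl_GlobalInvocationID. The function names below are made up for this sketch, and the loops only enumerate the IDs; on the GPU the invocations run in parallel and in no guaranteed order.

```rust
/// Computes what GLSL exposes as `gl_GlobalInvocationID.xy` for one invocation:
/// work group ID * local size + local invocation ID.
fn global_invocation_id(
    work_group_id: (u32, u32),
    local_size: (u32, u32),
    local_id: (u32, u32),
) -> (u32, u32) {
    (
        work_group_id.0 * local_size.0 + local_id.0,
        work_group_id.1 * local_size.1 + local_id.1,
    )
}

/// Lists every global ID produced by dispatching `groups.0 x groups.1` work
/// groups of `local_size` invocations each. Sequential here, parallel on a GPU.
fn enumerate_invocations(groups: (u32, u32), local_size: (u32, u32)) -> Vec<(u32, u32)> {
    let mut ids = Vec::new();
    for gy in 0..groups.1 {
        for gx in 0..groups.0 {
            for ly in 0..local_size.1 {
                for lx in 0..local_size.0 {
                    ids.push(global_invocation_id((gx, gy), local_size, (lx, ly)));
                }
            }
        }
    }
    ids
}
```

Dispatching 2x2 groups with a local size of 8x8 therefore covers a 16x16 grid, one invocation per cell.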
Sending/Receiving data from the Compute Shader
Sending data to and receiving data from a compute shader works much the same as with any other shader. You can still use uniforms to send data to the shader. The biggest difference is that compute shaders can write data to buffers and textures directly.
Using textures
The first thing I tried was to implement the same logic, but in a compute shader. For this I had to change a couple of things. Instead of returning a color value, the compute shader has to write directly to an image2D. The image2D can be defined as a uniform like this:
Note the layout(rgba8, binding = 0) here. These are called layout qualifiers and affect where the storage of a certain variable comes from. More info about this can be found here.
All it boils down to for us is:
- rgba8: The image format. This is the format the resource will be converted into for read and write operations.
- binding = 0: The binding point. This sets the image unit the uniform refers to, here unit 0. This is important when trying to bind resources to a shader program.
Now, to read from a texture we will use texelFetch, and to write to one we will use imageStore (texelFetch reads through a sampler; imageLoad is the equivalent for an image2D). I won’t be using these in the final shader as we won’t be using textures anymore, but I thought they were worth mentioning because they matter whenever you do work with textures.
A simplified version of the compute shader would look like this now:
Using Shader Storage Buffer Objects (SSBOs)
Using textures is a bit boring, we want to be part of the cool kid SSBO gang.
A Shader Storage Buffer Object (SSBO) is just like a Uniform Buffer Object (UBO), except that shaders can also write to SSBOs. Another great link to a wiki article:
I have abstracted this away into something called a StructuredBuffer<T>, where T is the type of the structure we will be using.
When creating this I ran into a couple of oddities. I wanted a buffer that was strongly typed, but when you don’t use your T the Rust compiler complains. To get around this it suggests using std::marker::PhantomData<T>, which works but seems weird and annoying to me coming from C++. As this structured buffer is not supposed to be accessed from the CPU, I won’t be keeping track of any CPU data. Once the data is submitted to the GPU we don’t care about it anymore, as all the calculating happens on the GPU.
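A minimal sketch of what such a typed handle could look like. The field names and the from_raw constructor are assumptions for illustration, and no GL calls are made here; the point is only how PhantomData<T> lets the struct carry a type parameter without storing any T on the CPU.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Typed handle to a GPU-side buffer. Only the GL buffer name and the element
/// count live on the CPU; the elements themselves are never stored here.
pub struct StructuredBuffer<T> {
    /// Buffer name as it would be returned by `glGenBuffers`.
    id: u32,
    /// Number of `T` elements the buffer was sized for.
    len: usize,
    /// Zero-sized marker that "uses" `T` so the compiler accepts the parameter.
    _marker: PhantomData<T>,
}

impl<T> StructuredBuffer<T> {
    /// Wraps an existing buffer name (hypothetical constructor for this sketch).
    pub fn from_raw(id: u32, len: usize) -> Self {
        Self { id, len, _marker: PhantomData }
    }

    /// The GL buffer name this handle refers to.
    pub fn id(&self) -> u32 {
        self.id
    }

    /// Number of elements in the buffer.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Byte size of the buffer, e.g. the size argument for `glBufferData`.
    pub fn size_in_bytes(&self) -> usize {
        self.len * size_of::<T>()
    }
}
```

Because PhantomData<T> is zero-sized, the marker adds no runtime cost; it only stops a StructuredBuffer<A> from being passed where a StructuredBuffer<B> is expected.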
In the Game of Life we want to fill our structured buffer with a predefined state. This is quite easy in OpenGL and works as follows:
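In raw OpenGL the upload typically goes through glGenBuffers, glBindBuffer with GL_SHADER_STORAGE_BUFFER, glBufferData and glBindBufferBase. Those calls need a live GL context, so the sketch below only covers the CPU-side step that precedes them: viewing the initial state as raw bytes. The helper name as_bytes is an assumption, not part of the original code.

```rust
/// Reinterprets a slice of plain-old-data values as raw bytes, the form in
/// which the initial state would be passed to an upload call such as
/// `glBufferData`.
///
/// `T: Copy` is only a rough stand-in for "plain old data": the type should
/// also be `#[repr(C)]` and free of padding so that every byte is initialized.
fn as_bytes<T: Copy>(data: &[T]) -> &[u8] {
    // SAFETY: pointer and length describe exactly the memory borrowed by
    // `data`, and `u8` has no alignment requirement.
    unsafe {
        std::slice::from_raw_parts(data.as_ptr() as *const u8, std::mem::size_of_val(data))
    }
}
```

The resulting pointer and length would then serve as the data and size arguments of the upload call.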
Assembling the blocks
Now that we’ve got our final building blocks we can put this all together into our actual program. The first thing I changed was our shader programs. Instead of having 2 programs that both had a vertex and fragment shader, I’ve now got 1 program that only does compute shader things and 1 that takes the output of this compute shader and renders pretty colours to the screen.
Setting up our CPU data
Our cell data is described like this in Rust. There’s a similar definition in GLSL (see below) with matching elements.
I added the lifetime and creation parameters so we can use them to nicely visualize the simulation over time. Eventually I think I will try to implement a rudimentary form of GPU fluid simulation (in 2D, of course).
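A sketch of how such a structure might be declared. Only lifetime and creation are named in the text; the alive field, the field types and the units are assumptions.

```rust
/// CPU-side mirror of the cell structure used by the shader.
/// `#[repr(C)]` keeps the declared field order and gives a predictable layout.
///
/// A matching GLSL declaration could look like:
///     struct CellData { uint alive; float lifetime; float creation; };
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct CellData {
    /// Assumed field: 1 when the cell is alive, 0 when it is dead.
    /// A `u32` maps directly onto a GLSL `uint`.
    pub alive: u32,
    /// How long the cell has been alive (assumed unit: seconds).
    pub lifetime: f32,
    /// Time at which the cell came alive (assumed unit: seconds).
    pub creation: f32,
}
```

With three 4-byte scalars the struct is 12 bytes with 4-byte alignment on the Rust side. Whether the GPU agrees depends on the memory layout of the buffer block, so it is worth checking the actual contents in a tool such as RenderDoc.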
The generate_field method returns a Vec<CellData>, which we then use to create a StructuredBuffer<CellData>. The current state will be initialized to 0 because it gets overwritten on the first cycle anyway.
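A possible shape for generate_field, as a sketch only: the signature, the seed parameter, the alive field and the small xorshift generator (used so the example needs no external crate) are all assumptions. The empty_field helper is likewise hypothetical and stands in for the zero-initialized current state.

```rust
/// Same assumed cell layout as elsewhere in this post.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct CellData {
    pub alive: u32,
    pub lifetime: f32,
    pub creation: f32,
}

/// Tiny xorshift32 generator so the sketch stays dependency-free.
struct XorShift32(u32);

impl XorShift32 {
    fn next(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Builds a pseudo-random starting field of `width * height` cells, row by row.
pub fn generate_field(width: usize, height: usize, seed: u32) -> Vec<CellData> {
    // xorshift must not start from 0, otherwise it only ever yields 0.
    let mut rng = XorShift32(seed.max(1));
    (0..width * height)
        .map(|_| CellData {
            alive: (rng.next() >> 16) & 1,
            lifetime: 0.0,
            creation: 0.0,
        })
        .collect()
}

/// Builds an all-zero field, used for the state that gets overwritten first.
pub fn empty_field(width: usize, height: usize) -> Vec<CellData> {
    vec![CellData::default(); width * height]
}
```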
For every cycle we need to dispatch a compute call to the GPU.
Hint: Not sure if your shader gets the right inputs? Check RenderDoc in the “Compute shader” tab. It allows you to view the contents of each SSBO that the shader receives!
Notice how the dispatch_compute call divides our field size by 8. This is because each work group in our compute shader (local_size_x and local_size_y) processes a block of 8x8x1 cells, so we only need one group per 8x8 block.
See OpenGL compute shader dispatching for more info.
Also see Compute shader limitations. In this sample program I do not check any of the limits, to keep the code simpler.
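The group count calculation can be sketched as below. The function names are made up for this example. It rounds up so that field sizes that are not a multiple of 8 are still fully covered, and it checks against 65535, the minimum the OpenGL specification guarantees for each component of GL_MAX_COMPUTE_WORK_GROUP_COUNT. A real program would query the actual limit instead.

```rust
/// Work group size, mirroring `local_size_x = 8, local_size_y = 8` in the shader.
const LOCAL_SIZE: u32 = 8;

/// Minimum value the OpenGL spec guarantees for each component of
/// `GL_MAX_COMPUTE_WORK_GROUP_COUNT`; used instead of querying the driver.
const MIN_GUARANTEED_GROUP_COUNT: u32 = 65_535;

/// Number of work groups needed to cover `cells` cells along one axis,
/// rounded up so a partial block at the edge still gets a group.
fn group_count(cells: u32) -> u32 {
    cells / LOCAL_SIZE + u32::from(cells % LOCAL_SIZE != 0)
}

/// Arguments for a dispatch call covering the whole field, or `None` when the
/// field is empty or would exceed the guaranteed work group count.
fn dispatch_size(field_width: u32, field_height: u32) -> Option<(u32, u32, u32)> {
    let (gx, gy) = (group_count(field_width), group_count(field_height));
    let in_range = |g: u32| g >= 1 && g <= MIN_GUARANTEED_GROUP_COUNT;
    if in_range(gx) && in_range(gy) {
        Some((gx, gy, 1))
    } else {
        None
    }
}
```

When the count is rounded up, the shader should skip invocations whose global ID falls outside the field, for example by comparing against u_field_size.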
The Shader
Now that we’ve set up our CPU data we need to read this data in our compute shader.
The code snippet below is the first part of my shader. It defines:
- a copy of our data structures (needed because our shader doesn’t know about Rust structures).
- other uniform inputs such as u_time, u_dt and u_field_size
  - These are used to determine the cell’s state and its rendering in later stages.
- an OutputData and an InputData interface block, each of which has 1 variable array of CellData
  - This is where the magic happens.
  - Remark: Both input and output data blocks have a “shared” memory layout. This makes sure that all our variables are marked “active” and not ignored.
The actual compute happens in the main function of shader.compute. It’s not very different from before: instead of reading from textures and storing the values, we now read directly from variable arrays.
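To make the per-cell logic concrete, here is a CPU reference of one simulation step in Rust. This is a sketch, not the shader from the repo: the alive field, treating cells outside the field as dead, and the way lifetime and creation are updated are all assumptions. The two loops stand in for the invocations that run in parallel on the GPU, each one reading from the input array and writing only its own slot in the output array.

```rust
/// Same assumed cell layout as elsewhere in this post.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct CellData {
    pub alive: u32,
    pub lifetime: f32,
    pub creation: f32,
}

/// Counts the live neighbours of (x, y). Cells outside the field count as
/// dead (an assumption; wrapping around the edges is another common choice).
fn live_neighbours(input: &[CellData], width: i32, height: i32, x: i32, y: i32) -> u32 {
    let mut count = 0;
    for dy in -1..=1 {
        for dx in -1..=1 {
            let (nx, ny) = (x + dx, y + dy);
            if (dx, dy) != (0, 0) && nx >= 0 && ny >= 0 && nx < width && ny < height {
                count += input[(ny * width + nx) as usize].alive;
            }
        }
    }
    count
}

/// One simulation step. Each (x, y) pair corresponds to one shader invocation.
pub fn step(
    input: &[CellData],
    output: &mut [CellData],
    width: i32,
    height: i32,
    time: f32,
    dt: f32,
) {
    for y in 0..height {
        for x in 0..width {
            let index = (y * width + x) as usize;
            let cell = input[index];
            let neighbours = live_neighbours(input, width, height, x, y);
            // Conway's rules: survive on 2 or 3 neighbours, be born on exactly 3.
            let alive_next = matches!((cell.alive, neighbours), (1, 2) | (1, 3) | (0, 3));
            output[index] = match (cell.alive == 1, alive_next) {
                // Survivor: keeps its creation time and ages by dt.
                (true, true) => CellData { alive: 1, lifetime: cell.lifetime + dt, creation: cell.creation },
                // Newborn: starts with zero lifetime, created "now".
                (false, true) => CellData { alive: 1, lifetime: 0.0, creation: time },
                // Dead or dying cells are reset.
                _ => CellData::default(),
            };
        }
    }
}
```

After each step the two buffers swap roles, so the output of one cycle becomes the input of the next.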
I also had to adjust my actual fragment shader to render using the new data. I won’t be posting that code here, but you can see it on the GitHub repo!
The final result looks like this: