Improve performance by using GPU transform feedback for zero initialization of buffers #12744

Open
@RobertBColton

Describe the project you are working on

Large scale terrain editing with lots of dense foliage and vegetation using particle systems.

Describe the problem or limitation you are having in your project

Zero-initializing GPU buffers is slow because of inefficient memory reallocation. This is especially noticeable in particle systems: changing the amount of particles emitted restarts the system, and its buffers get mapped to the CPU twice.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Zero initialization should be done on the GPU in a generic transform feedback utility to avoid stalling the pipeline.

I noticed this while studying the particle systems in depth for other optimizations. The generic allocation method maps the buffer to the CPU only to zero-initialize it, which stalls the pipeline and makes it much slower than it should be.
https://github.com/godotengine/godot/blob/0ed1c192e87e107f507d41dae00f9859bcc45ef1/drivers/gles3/storage/particles_storage.cpp#L883

  • buffer_allocate_data needs a gpu_buffer_allocate_data twin (Compatibility)
  • storage_buffer_create needs a gpu_storage_buffer_create twin (Forward+)

This is doubly bad because the process code runs right afterwards and updates the buffer a second time anyway. So the whole first mapping to the CPU is redundant, yet it has a huge performance impact.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

GLES3/Compatibility will use a generic transform feedback pass whenever we only want to zero-initialize a buffer. Forward+ can do the same, or implement a zero-init utility function in a compute shader.
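To make the shape of the proposed twin function concrete, here is a minimal C++ sketch. It is not Godot code: `FakeGpuBuffer`, `buffer_allocate_zeroed_via_cpu`, and `gpu_buffer_allocate_zeroed` are hypothetical names, and a `std::vector` stands in for device memory so the example runs without a GL context. It only contrasts the two paths by counting CPU mappings. In a real GLES3 backend the GPU path would presumably be a transform feedback draw under rasterizer discard, with a vertex shader that writes zero to its captured output.

```cpp
// Sketch only: every name here is hypothetical and none of it is Godot's API.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a GPU buffer. `storage` plays the role of device memory.
// `cpu_maps` counts how often the CPU had to touch the buffer; on a real
// driver each of those is a potential pipeline stall.
struct FakeGpuBuffer {
    std::vector<std::uint8_t> storage;
    int cpu_maps = 0;
    int gpu_fill_passes = 0;
};

// Pattern described in the issue: allocate, then map the buffer to the CPU
// only to write zeros into it.
inline void buffer_allocate_zeroed_via_cpu(FakeGpuBuffer &buf, std::size_t size) {
    buf.storage.assign(size, 0xCD); // fresh allocation, contents treated as garbage
    buf.cpu_maps += 1;              // map -> clear -> unmap
    for (std::uint8_t &byte : buf.storage) {
        byte = 0;
    }
}

// Proposed twin: allocate, then queue a GPU-side fill instead of mapping.
// The loop below merely simulates what that GPU pass would write.
inline void gpu_buffer_allocate_zeroed(FakeGpuBuffer &buf, std::size_t size) {
    assert(size % 4 == 0 && "sketch assumes the buffer holds whole 32-bit words");
    buf.storage.assign(size, 0xCD); // fresh allocation, contents treated as garbage
    buf.gpu_fill_passes += 1;       // one zero-fill pass, no CPU mapping
    for (std::uint8_t &byte : buf.storage) {
        byte = 0;
    }
}
```

Calling both functions on a 64-byte buffer leaves identical all-zero contents, but only the first one increments `cpu_maps`. That removed mapping is the stall this proposal targets.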

If this enhancement will not be used often, can it be worked around with a few lines of script?

No, because it's abstracted from the end user, and almost any buffer, multimesh, or particle system will exhibit the problem in some way.

Is there a reason why this should be core and not an add-on in the asset library?

We should definitely do this. It's pretty standard to have async GPU memory initialization that doesn't stall, and it's very simple for us to implement. It will also make loading faster for the editor, games, buffers, and particles in general, not just for dynamic changes.

It's a win-win all around.
