DataTexture: Proposal to support partial update #30184
Nice to keep this idea on the table. So essentially, adding x, y, width, height parameters to `[compressed]texSubImage(2D/3D)`.

By the way, for your specific case, you could take this a step further by avoiding CPU-to-GPU stalls entirely through the use of Pixel Buffer Objects (PBOs). A new API in three.js could be designed to enable asynchronous data transfers by leveraging them. For example:

```js
// Bind the PBO for writing (created on the first upload with `texImage2D`, for example)
gl.bindBuffer(gl.PIXEL_UNPACK_BUFFER, pbo);

// Write pixel data to the PBO
const pixelData = new Uint8Array(bufferSize); // Your pixel data
gl.bufferSubData(gl.PIXEL_UNPACK_BUFFER, 0, pixelData);

// Upload the data from the PBO to the texture
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.texSubImage2D(
  gl.TEXTURE_2D,
  0,                // Mipmap level
  0, 0,             // x and y offsets
  textureWidth,     // Width of the sub-image
  textureHeight,    // Height of the sub-image
  gl.RGBA,          // Format
  gl.UNSIGNED_BYTE, // Type
  0                 // Byte offset into the bound PBO
);

// Unbind the PBO
gl.bindBuffer(gl.PIXEL_UNPACK_BUFFER, null);
```

Basically:

```
Direct transfer: CPU --> GPU
                 [GPU busy → CPU stalls]

PBO:             CPU --> PBO (asynchronous transfer)
                 PBO --> GPU (GPU reads later when ready)
                 [CPU free → no stall]
```

The PBO serves as a staging area in GPU-accessible memory: pixel data is first copied into the PBO, and the GPU reads from it at a later time. Spamming `DataTexture` updates wouldn't stall the CPU anymore; each update would be queued through the Pixel Buffer Object instead and processed asynchronously.
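As an aside on data layout: `texSubImage2D` with nonzero x/y offsets reads tightly packed rows from the source by default (unless `gl.UNPACK_ROW_LENGTH` is set), so a sub-rectangle taken out of a full-size pixel buffer usually has to be repacked before the upload. A minimal sketch of that repacking step (hypothetical helper, not part of three.js):

```js
// Hypothetical helper: copy an x/y/w/h sub-rectangle out of a full-size
// pixel buffer into a tightly packed staging buffer, so it can be handed
// to gl.bufferSubData / gl.texSubImage2D without touching UNPACK_ROW_LENGTH.
function extractSubImage(fullPixels, textureWidth, x, y, w, h, bytesPerPixel = 4) {
  const out = new Uint8Array(w * h * bytesPerPixel);
  for (let row = 0; row < h; row++) {
    const srcStart = ((y + row) * textureWidth + x) * bytesPerPixel;
    const srcEnd = srcStart + w * bytesPerPixel;
    out.set(fullPixels.subarray(srcStart, srcEnd), row * w * bytesPerPixel);
  }
  return out;
}
```

Alternatively, `UNPACK_ROW_LENGTH` / `UNPACK_SKIP_PIXELS` / `UNPACK_SKIP_ROWS` (WebGL2) let the GL read the sub-rectangle directly from the full buffer, trading the copy for extra pixel-store state.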
I think this would be a great addition, even for any texture. Uploading the whole image for just a single pixel is a waste. A couple of comments: I think the signature can be just a single box, right? Either way, I'm very supportive of this addition.
This is interesting. Is there a reason to not always do this when uploading DataTextures? It seems like a strict improvement without drawbacks? Does it need a new three.js API?
This implies an additional array buffer (or two: one for writing and another for reading, to prevent blocking in lockstep) per DataTexture at initialization:

```js
// Create a PBO
const pbo = gl.createBuffer();

// Bind it to PIXEL_UNPACK_BUFFER
gl.bindBuffer(gl.PIXEL_UNPACK_BUFFER, pbo);

// Allocate storage (size in bytes)
const bufferSize = textureWidth * textureHeight * 4; // Assuming RGBA, 4 bytes per pixel
gl.bufferData(gl.PIXEL_UNPACK_BUFFER, bufferSize, gl.STREAM_DRAW);

// Unbind the PBO
gl.bindBuffer(gl.PIXEL_UNPACK_BUFFER, null);
```

So I guess we would need to weigh the performance benefit against this cost before thinking about adding it to the core. But I agree that it sounds like a direct improvement.
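To avoid writing into a PBO the GPU may still be reading (the lockstep issue mentioned above), the two buffers would typically be used in a ping-pong fashion. A minimal sketch of the bookkeeping only (hypothetical class; the actual `gl` calls are left as comments):

```js
// Hypothetical bookkeeping for double-buffered PBO uploads: the CPU
// writes into one slot per frame while the GPU may still read the other.
class PboRing {
  constructor(count = 2) {
    this.count = count; // number of PBOs created up front
    this.index = 0;     // slot to write into next
  }
  // Returns the slot to write into this frame, then advances the ring.
  // The caller would then do: gl.bindBuffer(gl.PIXEL_UNPACK_BUFFER, pbos[slot]);
  next() {
    const current = this.index;
    this.index = (this.index + 1) % this.count;
    return current;
  }
}
```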
It's true that this would cause unnecessary resource creation. I think PBO support for faster uploads can be considered in a separate issue; perhaps a flag can be added to enable / disable this behavior. Before any work is done on this, I think it would be good to get opinions from @Mugen87 or @mrdoob. But I think this is a good change.
Yes, sorry. I edited the post. @RenaudRohlinger thanks for the information!
Just to let you know, a remark: I remember that WebGL calls don't directly invoke the OS graphics API. Instead, they are added to a ring buffer that is processed by a separate thread in the browser's renderer process. So, whether a PBO copy actually takes a fast path also depends on the backend; for reference, ANGLE's D3D11 renderer gates it like this:

```cpp
bool Renderer11::supportsFastCopyBufferToTexture(GLenum internalFormat) const
{
    const gl::InternalFormat &internalFormatInfo = gl::GetSizedInternalFormatInfo(internalFormat);
    const d3d11::Format &d3d11FormatInfo =
        d3d11::Format::Get(internalFormat, mRenderer11DeviceCaps);

    // sRGB formats do not work with D3D11 buffer SRVs
    if (internalFormatInfo.colorEncoding == GL_SRGB)
    {
        return false;
    }

    // We cannot support direct copies to non-color-renderable formats
    if (d3d11FormatInfo.rtvFormat == DXGI_FORMAT_UNKNOWN)
    {
        return false;
    }

    // We skip all 3-channel formats since sometimes format support is missing
    if (internalFormatInfo.componentCount == 3)
    {
        return false;
    }

    // We don't support formats which we can't represent without conversion
    if (d3d11FormatInfo.format().glInternalFormat != internalFormat)
    {
        return false;
    }

    // Buffer SRV creation for this format was not working on Windows 10.
    if (d3d11FormatInfo.texFormat == DXGI_FORMAT_B5G5R5A1_UNORM)
    {
        return false;
    }

    // This format is not supported as a buffer SRV.
    if (d3d11FormatInfo.texFormat == DXGI_FORMAT_A8_UNORM)
    {
        return false;
    }

    return true;
}
```
@agargaro I can't assess to what extent that's an option for you. With WebGPU, you can use SBOs (storage buffer objects) instead of DataTextures, and they don't have the limitation you mentioned. I use them extensively. If it has to be WebGL, I can't help; but if you want to use SBOs with three.webgpu.js, I can. I recently converted my ocean repo to SBOs. They make the code significantly leaner and much more efficient. With SBOs, you can read and write without any problems.
Hi @Spiri0, thank you 😄 Currently I am still using `WebGLRenderer`, and for those who are still using it, a partial update API would still be valuable. If someone would like to provide some technical details, I could try to do a PR for only the `WebGLRenderer` part. Can I write to you privately (maybe on the forum?) to ask you some information about what you were suggesting?
Hey, I ran into a similar need. I've been working on an LOD system using `BatchedMesh`, toggling per-instance visibility like this:

```js
// enable lod0 on mesh 42 (assuming 3 lod levels)
batch0.setVisibleAt(42, true);
batch1.setVisibleAt(42, false);
batch2.setVisibleAt(42, false);
```

But it turns out that this actually hurts performance compared to having no LOD at all, as the constant full re-uploads of the visibility data add up.
Use a large dst texture for the shader and a small src texture that only contains the data that needs to be updated in the dst. Then use `copyTextureToTexture` to transfer the data from the src to the dst. This works wonderfully and is very efficient for exactly what you have in mind; I speak from experience. It will update only the necessary parts of the dst texture, exactly as you want. My LOD systems run at 120 fps. However, I don't use batched meshes for them because I don't consider them suitable for the intended purpose.

@RenaudRohlinger Do you remember? You created this example at my request. It does exactly what this topic is about, and it works great: "replace parts of an existing texture with all data of another texture"
https://threejs.org/examples/?q=part#webgpu_textures_partialupdate

This works the same with WebGL.
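For the chunk-texture technique above, the copy destination has to be derived from the instance id. Assuming the common layout of one texel per instance in a fixed-width data texture (an assumption, not something three.js prescribes), the mapping might look like:

```js
// Hypothetical mapping from an instance id to its texel coordinate in a
// row-major data texture with one texel per instance.
function instanceToTexel(instanceId, textureWidth) {
  return {
    x: instanceId % textureWidth,
    y: Math.floor(instanceId / textureWidth),
  };
}
```

The resulting `{ x, y }` would be the dst position handed to the partial copy.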
Hey @Spiri0, thanks for sharing your approach! Will definitely keep it in mind, though I'm still hoping to make this work with `BatchedMesh`.
If your approach with batched meshes works well for you, then it's a good solution; to each his own. The technique of loading data into a chunk DataTexture, updating it to send the data to the GPU, and then copying it to the target texture using `copyTextureToTexture` is exactly what you're looking for. I'm glad if it helps.

@RenaudRohlinger @Mugen87 The example by @RenaudRohlinger is essentially the answer to this topic. This works equally well in WebGL and WebGPU.
That doesn't work as well as you think it does. Think of how you commit memory to the GPU in the first place and then use it. three.js forces a stall in the worst case, since it has no scheduler to juggle in-flight memory yet creates and uses memory in the same frame. This gets dangerous as memory throughput increases (and/or update frequency, in dynamic cases). Thereafter, a GPU-to-GPU copy is fast, but it's the worst case we are concerned about, and that does not improve otherwise. I already started work on partial updates to textures and applied it to `BatchedMesh`.
The projects I use it in are very complex and work very well. But if you have a more advanced technique, then I welcome it.
If you think your projects could lend themselves to future testing, it would be greatly appreciated if you could try them there as work comes in. Otherwise, you should not expect a regression here, but rather an improvement in all renderers. For fully dynamic use cases like games, there is unfortunately no perfect solution, but this is already a big step. I think a larger discussion is warranted if three.js wants to open up to an inter-frame lifecycle for smooth streaming at the expense of single-frame determinism, but this would be a prerequisite.
I understand that there are workarounds, but my feeling is that this issue should be about the ergonomics of updating a portion of a texture (as you can already do with geometry data, to a meaningful benefit) and how `BatchedMesh` can (and should) benefit from any performance gains for "free". Due to the architecture of three.js, `BatchedMesh` cannot use workarounds like `copyTextureToTexture`, since it sets the whole-texture `needsUpdate` flag itself.

I know @agargaro has done a lot of work and testing in this area already, and it would be great if we can get some more concrete use cases and numbers relating to the benefits of a feature like this. I know it can be hard to measure these kinds of upload times, but a small demo showing framerate differences when updating one pixel in the matrix / color texture with and without partial texture updates would be great.
I already have work for this in #30998, applied to `BatchedMesh`. Some prior art would also be spite/THREE.UpdatableTexture, but note I am using WebGL2 features. I've left it as a draft until I figure out a good way to leverage it for `BatchedMesh`'s indirection texture. The slowdown reported there when methods like `setVisibleAt` are called every frame is exactly what this targets.

It would be worthwhile if we could arrive at a solution that benefits @Ctrlmonster's case, where they want to use `BatchedMesh` for LOD switching.

Regarding the API, it would have to be quite different to support a third dimension for 3D textures, but I struggle to see how this is useful or desirable in practice. When are you ever going to update a LUT from host memory? It should be compressed anyway, or better, use a fitted function for mobile. I have mirrored the API from `BufferAttribute` and implemented it for 2D textures only.
Hey, just wanted to share that I've also done some more tests with a lot of help from @agargaro. We were wondering why the frame time takes so long, even though the texture was getting updated each frame anyway (regardless of visibility changes).

So my preliminary conclusion (take this with a grain of salt) was that on low-end devices (I'm testing on a ThinkPad T480s), the texture uploads due to visibility changes really start adding up once you add multiple BatchedMeshes to the scene (around 15-20 in my case). My LOD case should probably be re-organized so that every LOD geometry gets stored within a single `BatchedMesh`.

I'm a bit short on time right now, but maybe I can make a demo for this in the coming days, as getting some harder numbers to look at would certainly be helpful. I think @agargaro has already seen performance gains by using partial texture updates in their `InstancedMesh2` library.
I think it's fair to say that even just addressing color and matrix textures should be a meaningful improvement. I don't necessarily want to block what would be a good API and performance boost because it's not perfect. That said, it would also be good to have some concrete numbers, using something like framerate, on the improvements in basic cases (e.g. no sorting) to see how this plays out. I'm curious how the post-processing range "compression" in #30998 impacts update time per frame compared to the data upload-time benefits. But here are some of the different problems that are clearly suboptimal and I think can be tackled separately:
This would be great.
You can easily swap the geometry an instance is rendering by using the `setGeometryIdAt` function. Regarding separate materials and textures: this should hopefully not be considered a long-term issue. @agargaro and I have done a number of multi-material demos with BatchedMesh (see here), which should only get easier with node materials.
I split away part of the changes; that should allow us to continue with no. 1 and no. 2 as we like. I have had help from @agargaro, and they expressed interest in continuing this work. The review in #30998 (comment) might open the door to supporting 3D textures. I want to be careful here, since this API already exists in `BufferAttribute`.
Description

Hi!

I am using `DataTexture` quite a lot to handle data with `BatchedMesh` and `InstancedMesh2` (`InstancedMesh` + indirection).

In my case, I would like to update the color of only one instance (on mouse over), but sending the GPU the whole texture is expensive because it's very large.

I tried using the `WebGLRenderer.copyTextureToTexture` method, but it doesn't work when `src` and `dest` are the same texture (I might open a separate bug for this). Anyway, this method is useless if `BatchedMesh` automatically updates the `.needsUpdate` flag, which will make the whole texture update anyway.

It would be fantastic to have a partial update system like `BufferAttribute`'s. I know that it's an important change, but if you want I can help.

Thank you for all the work you do. 😄

Solution

Implementing an `addUpdateRange` method similar to that of `BufferAttribute`:

```js
.addUpdateRange( region : Box2 ) : this
```

Alternatives

Fix and use `WebGLRenderer.copyTextureToTexture`, but should we then remove `.needsUpdate = true` from `BatchedMesh`?

Additional context

No response
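The proposed `.addUpdateRange( region : Box2 ) : this` could mirror `BufferAttribute`'s update-range bookkeeping. A rough sketch of the intended semantics (hypothetical, not the actual three.js implementation):

```js
// Sketch of the proposed API, mirroring BufferAttribute's addUpdateRange
// but with 2D regions. On upload, each stored region would become one
// gl.texSubImage2D call, and the list would then be cleared (as
// clearUpdateRanges does for attributes).
class PartialUpdateQueue {
  constructor() { this.updateRanges = []; }
  // region: { x, y, width, height }, i.e. the extent of a Box2 in texels.
  addUpdateRange(region) {
    this.updateRanges.push(region);
    return this; // chainable, like the proposed signature
  }
  clearUpdateRanges() { this.updateRanges.length = 0; }
}
```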