examples/runners/wgpu: avoid holding onto multiple surfaces at the same time. #181


Merged: 1 commit merged into Rust-GPU:main from push-zqulmxwskwvp on Dec 18, 2024

Conversation

eddyb (Collaborator) commented on Dec 17, 2024

This unbreaks Wayland (I had been using WAYLAND_DISPLAY= cargo run ... for ages instead of investigating it; the cause turns out to have been something very silly).

This is what the bug looked like:

wp_linux_drm_syncobj_manager_v1#63: error 0: surface already exists
thread 'main' panicked at /home/eddy/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-core-22.1.0/src/device/global.rs:1930:25:
internal error: entered unreachable code: Fallback system failed to choose present mode. This is a bug. Mode: AutoVsync, Options: []
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I kept thinking this might be a Wayland protocol mismatch or something, but no, Mesa (the open-source GPU driver stack for Linux) has a bug:

We accidentally ended up with this broken scenario on Wayland (a sketch of the fix follows the list):

  • two wgpu::Surfaces for the same wl_surface (the Wayland window object)
    • (technically we had this issue on other platforms too, but they seem to care less)
  • both surfaces had .configure(...) called on them
    • AIUI, this is where vkCreateSwapchainKHR gets called
  • the second vkCreateSwapchainKHR fails to acquire an exclusive resource
    • i.e. wp_linux_drm_syncobj_manager_v1#63: error 0: surface already exists
    • however, due to that Mesa bug, this error isn't propagated to the caller
    • wgpu now thinks it has a valid swapchain for the second surface, too
  • the second Vulkan surface/swapchain is, however, partially broken
    • this makes various operations on that Vulkan surface/swapchain fail
    • in particular, wgpu fails to query various surface properties
    • somewhat indirectly, it finally panics when it fails to find a present mode
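
The fix is simply to never hold two wgpu::Surfaces for the same window. A minimal sketch of the pattern (hypothetical names and structure, not the actual runner code; assuming winit and wgpu 22):

```rust
// Hypothetical sketch of the fix pattern (illustrative names, not the
// actual runner code): drop the previous wgpu::Surface *before* creating
// a new one, so at most one swapchain exists per wl_surface at any time.
struct GraphicsState<'w> {
    surface: wgpu::Surface<'w>,
}

fn recreate_surface<'w>(
    instance: &wgpu::Instance,
    window: &'w winit::window::Window,
    state: &mut Option<GraphicsState<'w>>,
) -> Result<(), wgpu::CreateSurfaceError> {
    // Holding two surfaces for the same window is exactly what produced
    // "surface already exists" above, so tear the old one down first.
    *state = None;
    let surface = instance.create_surface(window)?;
    *state = Some(GraphicsState { surface });
    Ok(())
}
```

The idea is that dropping the old surface tears down its Vulkan surface/swapchain before the new vkCreateSwapchainKHR can race it for the wl_surface.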

With RUST_LOG=wgpu_hal=error I was able to see these VK_ERROR_SURFACE_LOST_KHR errors (which wgpu largely ignores, leading to 0 supported modes/formats):

[2024-12-16T03:25:00Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_present_modes: ERROR_SURFACE_LOST_KHR
[2024-12-16T03:25:00Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_formats: ERROR_SURFACE_LOST_KHR
[2024-12-16T03:25:00Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_present_modes: ERROR_SURFACE_LOST_KHR
[2024-12-16T03:25:00Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_formats: ERROR_SURFACE_LOST_KHR

(Maybe we should run with at least the equivalent of RUST_LOG=error by default? I remember being frustrated that warn!/error! were silent while working on rustc self-profiling code: it didn't necessarily need nice user-facing diagnostics, but it also didn't have a good way to emit them from the separate measureme library anyway.)
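
For what it's worth, a sketch of what that default could look like (assuming the env_logger crate, not necessarily what the runner uses today): honor RUST_LOG when it's set, but fall back to the error level instead of total silence.

```rust
// Sketch only (assumes the env_logger crate): keep RUST_LOG configurable,
// but default to showing error!-level output instead of nothing.
fn main() {
    env_logger::Builder::from_env(
        env_logger::Env::default().default_filter_or("error"),
    )
    .init();

    // wgpu-hal's error! lines (e.g. ERROR_SURFACE_LOST_KHR above) would
    // then be visible even without RUST_LOG in the environment.
    log::error!("visible by default");
}
```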


Even with the Mesa bug fixed, nothing would prevent the second wgpu::Surface from being created (via instance.create_surface(&window)), but swapchain creation could at least fail with a better error (e.g. VK_ERROR_NATIVE_WINDOW_IN_USE_KHR), which would make the situation less confusing.
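
For illustration, roughly what that better failure mode could look like from the Rust side (a hypothetical sketch using the ash bindings, not actual wgpu-hal code):

```rust
use ash::vk;

// Hypothetical error handling around vkCreateSwapchainKHR: with the Mesa
// bug fixed, the second swapchain on the same wl_surface should fail
// loudly here instead of "succeeding" as a half-broken swapchain.
fn check_swapchain_creation(result: vk::Result) -> Result<(), String> {
    match result {
        vk::Result::SUCCESS => Ok(()),
        vk::Result::ERROR_NATIVE_WINDOW_IN_USE_KHR => {
            Err("window already in use (a second wgpu::Surface?)".into())
        }
        other => Err(format!("vkCreateSwapchainKHR failed: {other:?}")),
    }
}
```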

I've mentioned some of these interactions in this wgpu issue:

eddyb force-pushed the push-zqulmxwskwvp branch from f84f69b to edd713e on December 18, 2024 12:08
eddyb added this pull request to the merge queue on Dec 18, 2024
Merged via the queue into Rust-GPU:main with commit f069c58 on Dec 18, 2024 (7 checks passed)
eddyb deleted the push-zqulmxwskwvp branch on December 18, 2024 18:01