Camera Driven Rendering #19704
-
Anything that makes compositing more intentional and well thought out has my full support. It interacts with a lot of areas (rendering, UI, picking), and many people don't understand how it works, how powerful it is, or that key features like picking support it. Wearing my picking hat, the thing we will need to maintain through these changes is the ability for a pointer to be on some rectangular composited surface, with the layers composited onto that surface explicitly ordered and able to be hooked into by picking backends. This is what makes it possible today to composite cameras onto a surface and to have a single pointer over multiple cameras, with their …
-
I really like this. I think anything graph-structured like this really begs for editor tooling, so it's great to see this all being driven by ECS state.
-
I see the motivation here from a "graphics internals" developer perspective. The current camera system was built to be a high-ish level API that "just works" for developers, even those who don't have graphics programming experience (obviously "just works" is a matter of perspective and scenario). Before committing to a path like this I'd like to see a proposal for how we will make this approachable and ergonomic for non-graphics-devs for the "high traffic" scenarios like:
As in: what will the user-facing code look like (and how will it behave under the hood)? Needing to manually unwire and rewire cameras from a compositor feels very "mid-level API" to me, and is not what most developers will expect coming from other engines. These problems seem surmountable. But I'd want solid high-level UX sorted out (as in ... competitive with the current impl) prior to committing to this path.
-
I really like the direction here. I'm probably down to go this way instead of with my draft, though I think there's still some good ideas there we can adapt. Bunch of assorted thoughts incoming:
- I think we should separate the physical metaphor parts of …
- IIUC, I agree doing ordering/composition authoritatively from the top down is the way to go. In my draft that takes the form of a tree, but a graph works too. Besides the tooling benefits, that should let us simplify a lot of the implicit ordering stuff we have on cameras now, like the atomics on …
- Speaking of …
- Regarding encouraging a shift from render graphs to camera graphs, it'd be good to figure out how we want users to interact with each / what we want them to be able to do. Things like: should we constrain what an individual camera graph can/should do so that we can improve their ergonomics? Does the ability for any plugin to mess with/add to engine render graphs still matter as much if they can edit the camera graph instead?
-
I'm not much of a render pipeline guy, but I would like to speak from the perspective of the needs of compositing in UI, and hopefully this won't be too tangential. This is coming from someone who has both long experience on the web and who has actually written a browser.

A subtle yet critical feature of CSS which many people aren't aware of is "implicit compositing". CSS normally composites elements onto a single surface, but will create additional surfaces when it needs to, based on styles. If an element has certain style properties, such as transforms (scaling and rotation) or post-processing effects (such as blur), CSS brings into play a more complex rendering scheme in which the element and its children are first rendered onto a separate surface and then composited onto the parent's surface. This all happens in a way which is transparent to the webdev, although there are various known recipes that can intentionally force this behavior.

A very simple use case for this is animating opacity: if you have a dialog or popup (such as a character inventory screen or settings mode) you may want it to "fade in". But individually setting the opacity of the popup's root entity and each child entity gives you the wrong answer, and looks rather ugly. Instead, what you want in this case is to opaquely composite the popup and all its children, and then animate the opacity of the composited result.

Now, I don't think we need to have this sort of thing be completely automatic and invisible as it is in CSS, but it would be nice if it were easy to do, perhaps by inserting the right components at some point in the entity hierarchy. Ideally, it should be little effort for the developer to say "this sub-tree of the UI is composited onto a buffer", and that includes having picking work as expected. One challenge with this is that Bevy doesn't know how large a buffer we'll actually need, but I'm OK with the developer having to supply this information as a hint up front, since (for all the use cases I can think of) the value is quite predictable. See also: #6956
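A purely illustrative sketch of what that could look like (the `CompositeSurface` component here is hypothetical, not an existing Bevy API):

```rust
use bevy::prelude::*;

/// Hypothetical marker: "composite this UI sub-tree onto its own buffer".
/// Neither the component nor the behavior exists in Bevy today.
#[derive(Component)]
struct CompositeSurface {
    /// Buffer size, supplied up front as a hint since the engine can't predict it.
    size: UVec2,
    /// Opacity applied to the composited result as a whole.
    opacity: f32,
}

fn spawn_inventory_popup(mut commands: Commands) {
    commands
        .spawn((
            Node::default(),
            CompositeSurface { size: UVec2::new(512, 512), opacity: 0.0 },
        ))
        .with_children(|popup| {
            // ... inventory slots, icons, text: all composited as one group ...
            popup.spawn(Node::default());
        });
}
```

A fade-in would then animate the single `opacity` value instead of the opacity of every descendant, and picking would treat the composited buffer like any other surface.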
-
Bevy's rendering APIs have been described as "camera driven" several times, but it's not always clear exactly what that means. In Bevy's rendering system, the "camera" entity has a privileged position and provides the following behaviors/data:
- an intermediate texture to render into (the `ViewTarget`)
- filtering of which entities it renders (`RenderLayers`)
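To ground this, here is a hedged sketch of what a single camera entity carries with today's API (component names as of recent Bevy releases; exact fields and their locations have shifted between versions):

```rust
use bevy::prelude::*;
use bevy::render::view::RenderLayers;

fn setup(mut commands: Commands) {
    // One entity configures the view, its compositing order, its output
    // settings, and which entities it renders.
    commands.spawn((
        Camera3d::default(),
        Camera {
            order: 0,  // position in the implicit compositing order across cameras
            hdr: true, // actually configures the *internal* texture (see below)
            ..default()
        },
        RenderLayers::layer(0), // which entities this camera "sees"
    ));
}
```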
Mixed metaphors
Some discussion in #16248 brings up a number of metaphors for what a camera is. One framing is that the `RenderTarget` is the film, though this is conceptually imprecise because of the `hdr` field, which hints at the presence of the internal texture; that discussion also introduces the idea that the camera is split into two parts, the camera and the lens.

UI presents some other challenges to the metaphor. While UI does have an implicit orthographic projection and is superficially similar to 2d rendering in many ways, it raises questions as to what a UI camera is "looking at" and being rendered on.
Currently, a UI camera is a virtual view ("subview") that is tied to a 2d/3d camera. In this sense, if the lens determines "how" the scene looks, the UI camera is like an additional filter or color gel that is placed in front of the camera, i.e. not really a camera at all.
In #15256, when discussing world-space UI, aevyrie argues that the camera metaphor obscures possible implementations, where it could make sense to parent a render surface to some other entity already in worldspace and have things "just work."
Other proposals, such as a hypothetical `CameraFullscreen`, which would be a way to run a render graph with no geometry (i.e. a simple way for users to write fullscreen shaders), continue to stretch the metaphor.

The problem with compositing
I'd like to argue that the idea of camera driven rendering is fundamentally sound, but suffers from a critical conceptual ambiguity with respect to what the film medium is. More precisely, the fact that the camera both captures and composites is a significant problem for the API, particularly when using multi-camera setups.
The hidden "internal texture"
Importantly, `RenderTarget` is not the film; it is something more like the print the film is developed onto. `CameraOutputMode` is the developer/fixer. The film is, in our current API, not directly exposed to the user.

Every `ExtractedView` has a `ViewTarget`, which contains two logical textures: the "main" texture, which is used as the color attachment for most render passes, and the "out" texture, which is the `RenderTarget`, typically a swapchain texture. Importantly, the out texture is only used in the final step of the render graph, where the upscaling node blits (i.e. composites) the main texture to the out texture. In other words, the user never sees the main texture itself, which is why it can be said to be "internal."

Jasmine notes in #16248 that this is particularly confusing because, for example, the `hdr` field on `Camera` actually has nothing to do with the `RenderTarget`. This has also led to the proliferation of more niche components like `CameraMainTextureUsages` that allow configuring the internal texture for other uses in the render graph.
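As a rough mental model of the internal texture and the final blit (an illustration only; these are not Bevy's actual types):

```rust
/// Illustrative stand-in for a GPU texture, not a real Bevy or wgpu type.
struct GpuTexture;

/// Conceptual shape of a view's two logical textures.
struct ViewTargetSketch {
    /// The hidden "film": the color attachment most render passes draw into.
    main: GpuTexture,
    /// The "print": the camera's RenderTarget, typically the swapchain texture.
    out: GpuTexture,
}

/// The out texture is only written at the very end of the render graph,
/// where the upscaling node blits (i.e. composites) main onto out.
fn upscaling_node(view: &ViewTargetSketch) {
    blit(&view.main, &view.out);
}

/// Placeholder for the blit pass itself.
fn blit(_src: &GpuTexture, _dst: &GpuTexture) {}
```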
The sharp edges of multi-cam
Users consistently run into issues when using multiple cameras. When two cameras share the same HDR and MSAA settings, the renderer will "helpfully" re-use the same cached texture for both of them, and will disable clearing of that texture for every camera after the first. This is potentially a performance win, and in many cases it results in the behavior users expect, where one camera can easily draw on top of another camera's output, but it has a number of unfortunate consequences:
Additionally, this texture is generally not configurable, which poses issues for more niche uses that require different texture formats or would like to use the texture in other contexts.
Proposal: Camera Graph
My proposal is that we embrace camera driven rendering by understanding compositing as another kind of camera. More specifically, I want to argue that we should understand cameras as forming a kind of graph that has both inputs and outputs.
Another way to put this is that a camera should be considered a logical render pass. This is the `CameraSubGraph` component / the "lens" of the camera. Rather than imagining that the user should configure a single monolithic render graph that accomplishes all their needs in a single camera, we should encourage users to create multiple cameras.

Making the relationship between cameras itself a graph can help define how textures (and potentially other resources) should flow through rendering at a more coarse-grained level, and it makes creative decisions with respect to compositing explicit. Users who want fine-grained control for maximum performance and resource efficiency can still configure a single camera/render graph.

By having cameras accept texture inputs and making compositing a separate step, we can drastically simplify the conceptual model: cameras have film, and they can also accept film from another camera to do a double exposure. By making the actual render texture explicit, I think it will be easier to teach patterns for multi-camera rendering. And, while configuring multiple cameras may be a bit of a pain today, this kind of pattern is well suited for asset-driven configuration (BSN) and editor tooling.
API Sketch
This isn't intended as a concrete proposal but just a sketch of what an API might look like:
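For example, with entirely hypothetical component names (`Film`, `CameraInputs`, and `CompositingCamera` below are invented for illustration, not existing or proposed Bevy APIs):

```rust
use bevy::prelude::*;

/// Hypothetical: the explicit texture ("film") a camera renders onto.
#[derive(Component)]
struct Film(Handle<Image>);

/// Hypothetical: textures this camera consumes from other cameras.
#[derive(Component)]
struct CameraInputs(Vec<Handle<Image>>);

/// Hypothetical marker for a camera whose "lens" is just compositing.
#[derive(Component)]
struct CompositingCamera;

fn setup(mut commands: Commands, mut images: ResMut<Assets<Image>>) {
    // Films for the scene and the UI (size/usage configuration omitted).
    let scene_film = images.add(Image::default());
    let ui_film = images.add(Image::default());

    // Each "lens" camera is a logical render pass writing onto its own film.
    commands.spawn((Camera3d::default(), Film(scene_film.clone())));
    commands.spawn((Camera2d, Film(ui_film.clone())));

    // The compositing camera accepts the films as inputs and writes its own
    // output to the window (its RenderTarget).
    commands.spawn((
        CompositingCamera,
        Camera::default(),
        CameraInputs(vec![scene_film, ui_film]),
    ));
}
```

The plain single-camera case would then amount to one camera with no explicit inputs, possibly relying on a default compositing camera for the window output (see Drawbacks below).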
As a logical graph (one possible flow for the hypothetical setup above):
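```text
Camera3d "scene" ──film──┐
                         ├──> CompositingCamera ──out──> Window (RenderTarget)
Camera2d "UI"    ──film──┘
```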
Drawbacks
- … a `CompositingCamera` in the scene (by default a camera's output goes to that input), but it makes the default case a bit more complicated.
- … `CameraSubGraph` and passes storage buffers into the next camera, i.e. making cameras also accept buffers as inputs/outputs.