-
-
Notifications
You must be signed in to change notification settings - Fork 22.8k
Core: Add Node::iterate_children
as a fast way to iterate a node's children
#107369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
c38fdc9
to
09f6355
Compare
BenchmarkDebugbefore : 32-35 fps Releasebefore : 55-58 fps We did suspect this might be slower in debug but it also seems slower in release than the previous version (accessing the child cache directly via |
Here's a profile. Notably I'm finding that a lot of small functions aren't getting inlined .. turns out it is likely because of thread guards, which I created an issue for #107393 . First obvious thing is it seems to be branching on |
90c61dd
to
3087b18
Compare
Thank you for profiling! Anyway, I agree that |
It's definitely worth repeating on your end (maybe with different benchmarks, loops). I do a lot of profiling, but not in master, and the scons options are not super familiar to me, so it's possible I've messed something up in the build. It does seem to be inlining some functions (confirming release build) but it's not doing a very good job with this one. |
It seems most of the slowdown is due to the thread guards. We have discussed removing them for this high performance iterator, and I'll check the performance with current master. |
3087b18
to
a327a95
Compare
Node::iter_children
as a fast way to iterate a node's childrenNode::iterate_children
as a fast way to iterate a node's children
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a lot of testing we pinned down the speed loss to the thread guards, and reasoned that they can be removed for this performance function.
Now this seems to perform as fast (or slightly faster) than before in release, and slightly slower in dev builds, but that should be fine.
@Ivorforce also noticed that I'd missed changing one of the slow Godot 4 get_child()
functions to use the span in the original, so that may explain some of the speedup.
Sorry @Ivorforce this will need a slight rebase as I fixed the |
…children, without needing allocations or `get_child`. Adds `Iterable` class to templates.
a327a95
to
175c38d
Compare
SceneTreeFTI
faster access toNode
children #106224This adds a performant way to iterate a node's children.
This was discussed on RocketChat.
Currently, you either iterate
get_child_count
usingget_child
, adding immense overhead, or you iterateget_children()
, which allocates a whole new array. Both are very slow.ChildrenIterator
is technically not needed (Node **
could be iterated instead), but is added to hide the nature of thechildren
storage from the caller. In the future, we may want to change storage without needing to change caller code.Additionally, I'm adding an
Iterable
template, such that children are iterated onnode.iter_children()
instead of onnode
. Ifnode
was iterated, it is not immediately clear that the children are the iteratees. Also, checks would have to be run twice otherwise (forbegin()
andend()
separately), making it unnecessarily slow.