-
Notifications
You must be signed in to change notification settings - Fork 25.2k
DRAFT: Consume fieldcaps node response while deserializing it #120010
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
DRAFT: Consume fieldcaps node response while deserializing it #120010
Conversation
The API implemented here could definitely use some extra love. But the idea seems to work quite well according to some quick testing and this approach can potentially save a massive amount of memory not just in field caps. Instead of deserializing a full list of per-index responses before getting to processing them, we should consume each response right after deserializing it without any intermediary list. This should be safe to do since we deserialize responses on the same thread that we consume the full response on today. Not only does this save a lot of heap by merging responses directly, it also might save quite a bit of runtime from better cache locality.
This field is only used (by security) for requests, having it in responses is redundant. Also, we have a couple of responses that are singletons/quasi-enums where setting the value needlessly might introduce some strange contention even though it's a plain store. This isn't just a cosmetic change. It makes it clear at compile time that each response instance is exclusively defined by the bytes that it is read from. This makes it easier to reason about the validity of suggested optimizations like elastic#120010
This field is only used (by security) for requests, having it in responses is redundant. Also, we have a couple of responses that are singletons/quasi-enums where setting the value needlessly might introduce some strange contention even though it's a plain store. This isn't just a cosmetic change. It makes it clear at compile time that each response instance is exclusively defined by the bytes that it is read from. This makes it easier to reason about the validity of suggested optimizations like #120010
This field is only used (by security) for requests, having it in responses is redundant. Also, we have a couple of responses that are singletons/quasi-enums where setting the value needlessly might introduce some strange contention even though it's a plain store. This isn't just a cosmetic change. It makes it clear at compile time that each response instance is exclusively defined by the bytes that it is read from. This makes it easier to reason about the validity of suggested optimizations like elastic#120010
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kinda cute but also a bit of a hack :) Especially since the exception handling in TransportResponse#read
is now a little tricky because of having to consume the whole response.
Makes you wonder tho: does TransportResponseHandler
really need separate read
and handleResponse
methods if we just call them in sequence?
That's why I started with
:P It feels like the answer is clearly no, these should not be different methods. In fact I'd love to go even further and make it so the I just figured I'd start with this to illustrate the point :) Also, note that the difference between serialized and deserialized heap use for this kind of message and similar stuff like aggregations is somewhere between 2x and mroe than an order of magnitude. Streaming bytes is nice (and simplifies reasoning about heap consumption and stability) but practically this kind of "pretend streaming" is where most of the win is for messages that can be incrementally reduced on the receiver. Personally, I'd still go for the hack in step 0 and then just do the next step of moving to an actual byte streaming API by selectively short-circuiting the |
Just to illustrate the point of where we can save memory here, the concrete API could be made nicer but functionally this thing seems fine and comes with a measurable performance boost from some quick and dirty benchmarking.
The API implemented here could definitely use some extra love. But the idea seems to work quite well according to some quick testing and this approach can potentially save a massive amount of memory not just in field caps.
Instead of deserializing a full list of per-index responses before getting to processing them, we should consume each response right after deserializing it without any intermediary list. This should be safe to do since we deserialize responses on the same thread that we consume the full response on today.
Not only does this save a lot of heap by merging responses directly, it also might save quite a bit of runtime from better cache locality. Also, having a stateful deserializer allows for a couple helpful tricks in around deduplication across multiple responses for fan-out style APIs as well.