Remove reference counting from InboundMessage and make it Releasable #126138


Merged

Conversation

@original-brownbear (Member, Author) commented on Apr 2, 2025:

There is no actual need to reference-count InboundMessage instances; their lifecycle is completely linear, so we can simplify the ref counting away. This saves a little work directly, but more importantly it enables more eager releasing of the underlying buffers in a follow-up.

This is pretty much the mirror issue to elastic#122981, solving the same problem on the inbound side. We should also eagerly release buffers after deserialization; especially for bulk requests this could become quite helpful.
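
For illustration, here is a minimal sketch of the shape of the change. The class below and its fields are placeholders, not the real InboundMessage; it only assumes the org.elasticsearch.core.Releasable interface. Instead of implementing RefCounted, the message becomes a one-shot Releasable that the inbound pipeline closes exactly once.

```java
import org.elasticsearch.core.Releasable;

// Simplified placeholder for InboundMessage: a one-shot Releasable rather
// than a RefCounted (incRef/decRef) object.
final class SketchInboundMessage implements Releasable {

    private final Releasable underlyingBuffers; // e.g. pooled network buffers
    private boolean closed;

    SketchInboundMessage(Releasable underlyingBuffers) {
        this.underlyingBuffers = underlyingBuffers;
    }

    @Override
    public void close() {
        // The lifecycle is linear: exactly one owner releases the message,
        // so no atomic reference count is needed.
        assert closed == false : "released more than once";
        closed = true;
        underlyingBuffers.close();
    }
}

// The handler side then owns and releases the message via try-with-resources,
// schematically:
//
//     try (message) {
//         handleRequest(channel, message);
//     }
```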
@elasticsearchmachine added the Team:Distributed Coordination label on Apr 2, 2025
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

```diff
@@ -351,7 +352,7 @@ private void handleHandshakeRequest(TcpChannel channel, InboundMessage message)
             true,
             Releasables.assertOnce(message.takeBreakerReleaseControl())
         );
-        try {
+        try (message) {
```
@original-brownbear (Member, Author):

Admittedly it's annoying to spread the release logic across so many spots, but in the end that is precisely what we need/want: release right where we deserialize. (If we want this cleaner we need to deserialise in fewer spots, I guess?)

Contributor:

Yeah this is the same thing as refcount-negative methods: we need to find a way to make the fact that the method closes one of its arguments explicit at the call site.

@original-brownbear (Member, Author):

Or alternatively (and I'm starting to think this is more practical), any method that consumes an object like that must either pass it on elsewhere (and do so exactly once) or release it.
Just documenting that something closes the argument still leaves a lot of complexity around having to check that the last receiver of the instance is one of those closing-enabled methods, and that other methods don't start holding on to instances, doesn't it?

Contributor:

The trouble is I don't think this is (or could be made) true of every method that takes an argument which represents nontrivial resources. We need to support both.

Contributor:

> ... still leaves a lot of complexity around having to check...

Sure, it's not a fully automated answer (i.e. a proper borrow-checker), but at least it reduces those checks to method-local things rather than having to reason about things up and down the stack some arbitrary distance.
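
A hypothetical sketch of that convention (the class and method names here are made up for illustration): a method that consumes a Releasable argument either hands it off exactly once or releases it itself on every path, which keeps the ownership check method-local.

```java
import java.util.concurrent.Executor;
import org.elasticsearch.core.Releasable;

final class OwnershipConventionSketch {

    // Consumes `message`: after this call the caller must not touch it again.
    static void handleConsuming(Releasable message, boolean fork, Executor executor) {
        if (fork) {
            // Ownership is handed off exactly once; the forked task releases it.
            // (Real code would also need to release on rejected execution.)
            executor.execute(() -> {
                try (message) {
                    process(message);
                }
            });
        } else {
            // Otherwise this method releases the argument itself, on every path.
            try (message) {
                process(message);
            }
        }
    }

    private static void process(Releasable message) {
        // deserialize / dispatch ... (placeholder)
    }
}
```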

@DaveCTurner (Contributor) left a comment:

I pushed a commit with the comments at the call sites that I'd like to see, wdyt?

```diff
         } finally {
-            aggregated.decRef();
+            if (aggregated != null) {
+                // TODO doesn't messageHandler auto-close always?
```
Contributor:

Looking at messageHandler (i.e. TcpTransport#inboundMessage) I think we close on all paths already, and catch everything anyway, so this seems redundant?

@original-brownbear (Member, Author):

Right ... let me fix that right up :)

```diff
@@ -260,6 +281,7 @@ private <T extends TransportRequest> void handleRequest(TcpChannel channel, InboundMessage message)
         final T request;
         try {
             request = reg.newRequest(stream);
+            message.close(); // eager release message to save heap
```
Contributor:

Could we drop this one line from this PR and do it in an immediate follow-up? That way we have one commit that replaces refcounting with one-shot closing without changing semantics and then another that reduces the lifecycle.

@original-brownbear (Member, Author) commented on Apr 17, 2025:

++ makes sense, dropped it here

@DaveCTurner (Contributor) left a comment:

LGTM with suitable fixes to the title and OP.

@original-brownbear added the auto-backport label on Apr 17, 2025
@original-brownbear changed the title from "Release InboundMessage directly after deserialization" to "Remove reference counting from InboundMessage and make it Releasable" on Apr 17, 2025
@original-brownbear (Member, Author):

Thanks David! Follow-up incoming :)!

@original-brownbear merged commit 149ff93 into elastic:main on Apr 17, 2025 (16 of 17 checks passed)
@original-brownbear deleted the release-inbound-quicker branch on April 17, 2025 at 13:10
@elasticsearchmachine (Collaborator) commented:

💔 Backport failed

Branch 8.x: commit could not be cherry-picked due to conflicts.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 126138

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 17, 2025
Follow-up to elastic#126138. We can now release request bytes directly after deserialization. Also, request instances need not go through a ref-counting cycle when forking, removing some contention from transport threads.
@original-brownbear (Member, Author):

Follow-up in #126998 :)

original-brownbear added a commit that referenced this pull request Apr 17, 2025
Follow-up to #126138. We can now release request bytes directly after deserialization. Also, request instances need not go through a ref-counting cycle when forking, removing some contention from transport threads.
Labels: auto-backport, backport pending, :Distributed Coordination/Network, >non-issue, Team:Distributed Coordination, v8.19.0, v9.1.0