Skip to content

Denser in-memory representation of ShardBlobsToDelete #109848

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

Today each blob to be deleted at the end of a snapshot delete costs us
~80B of heap, and sometimes that can add up to GiBs of temporary heap
usage in total. This commit changes the in-memory representation to use
a compressed stream of pure bytes, which should be more than 4x denser.

Partially mitigates #108278

Today each blob to be deleted at the end of a snapshot delete costs us
~80B of heap, and sometimes that can add up to GiBs of temporary heap
usage in total. This commit changes the in-memory representation to use
a compressed stream of pure bytes, which should be more than 4x denser.

Partially mitigates elastic#108278
@DaveCTurner DaveCTurner added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.15.0 labels Jun 18, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jun 18, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@DaveCTurner

This comment was marked as outdated.

@DaveCTurner DaveCTurner marked this pull request as draft June 18, 2024 10:49
@DaveCTurner DaveCTurner marked this pull request as ready for review June 18, 2024 18:46
@DaveCTurner DaveCTurner requested a review from ywangd June 18, 2024 18:47
shardDeleteResults.add(new ShardSnapshotMetaDeleteResult(indexId, shardId, newGeneration, blobsToDelete));
try {
shardGenerationsBuilder.put(indexId, shardId, newGeneration);
new ShardSnapshotMetaDeleteResult(Objects.requireNonNull(indexId.getId()), shardId, blobsToDelete).writeTo(compressed);
Copy link
Contributor

@mhl-b mhl-b Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand blobsToDelete is already occupy memory we want to optimize. Adding in-memory compression will temporary increase heap usage by 20%? How do you know that blobsToDelete will be GC-ed after compression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing else retaining it apart from the reference in ShardBlobsToDelete#shardDeleteResults.

Copy link
Member

@pxsalehi pxsalehi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM.

@DaveCTurner DaveCTurner added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jun 20, 2024
@elasticsearchmachine elasticsearchmachine merged commit ad1f405 into elastic:main Jun 20, 2024
15 checks passed
@DaveCTurner DaveCTurner deleted the 2024/06/18/compressed-ShardBlobsToDelete branch June 20, 2024 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants