Skip to content

Optimise shared-blob-cache evictions #126581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from 21 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
c86859c
Evict from the shared blob cache asynchronously
nicktindall Apr 10, 2025
c143dba
Only evict from shared cache when index is partial (SharedSnapshotInd…
nicktindall Apr 10, 2025
a62ac00
Update docs/changelog/126581.yaml
nicktindall Apr 10, 2025
6ae1e7c
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall Apr 10, 2025
ff3a25d
Fix changelog
nicktindall Apr 10, 2025
2ef16c9
Update x-pack/plugin/blob-cache/src/main/java/org/elasticsearch/blobc…
nicktindall Apr 14, 2025
d3ce506
Fix indenting
nicktindall Apr 14, 2025
bd35686
evictionsRunner -> asyncEvictionsRunner
nicktindall Apr 14, 2025
83c1bda
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall Apr 15, 2025
632afbc
Only evict asynchronously for shards we know are not coming back
nicktindall Apr 14, 2025
3274c1c
Merge remote-tracking branch 'origin/main' into evict_from_the_shared…
nicktindall Apr 15, 2025
253dba1
Merge remote-tracking branch 'origin/main' into evict_from_the_shared…
nicktindall Apr 22, 2025
f035f25
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall Apr 23, 2025
7ac3220
Propagate IndexRemovalReason to deletion listeners
nicktindall Apr 23, 2025
8e18644
Fix naming (reasonMessage/reason)
nicktindall Apr 23, 2025
410fb35
Fix naming (reasonText/reason)
nicktindall Apr 23, 2025
2372056
Naming
nicktindall Apr 23, 2025
8c91b45
[CI] Auto commit changes from spotless
elasticsearchmachine Apr 23, 2025
87d1ba4
Naming/javadoc
nicktindall Apr 23, 2025
ea43b2d
randomReason()
nicktindall Apr 23, 2025
c6e7a05
Don't evict shards when IndexRemovalReason is FAILURE
nicktindall Apr 23, 2025
7eebc42
javadoc/naming
nicktindall Apr 23, 2025
250df4c
Merge remote-tracking branch 'origin/main' into evict_from_the_shared…
nicktindall May 20, 2025
d3cd806
Make IndexRemovalReason a top-level enum for sharing
nicktindall May 21, 2025
5eabc0f
Fix eviction logic
nicktindall May 21, 2025
53ad877
Comment
nicktindall May 21, 2025
fa17c66
Fix eviction logic
nicktindall May 21, 2025
69e748f
Improve change summary
nicktindall May 21, 2025
f498216
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall May 21, 2025
185b390
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall May 22, 2025
c2f3b0f
Add tests
nicktindall May 23, 2025
63f4b1f
Work with any number of nodes
nicktindall May 23, 2025
cd09f7d
Randomise number of docs
nicktindall May 23, 2025
69eb2e5
Merge branch 'main' into evict_from_the_shared_blob_cache_asynchronously
nicktindall May 23, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/changelog/126581.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 126581
summary: Evict from the shared blob cache asynchronously
area: Snapshot/Restore
type: enhancement
issues: []
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.shard.ShardId;
import org.elasticsearch.indices.IndicesService;
import org.elasticsearch.indices.cluster.IndicesClusterStateService.AllocatedIndices.IndexRemovalReason;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.test.junit.annotations.TestLogging;

Expand Down Expand Up @@ -345,12 +346,22 @@ public static class IndexFoldersDeletionListenerPlugin extends Plugin implements
public List<IndexFoldersDeletionListener> getIndexFoldersDeletionListeners() {
return List.of(new IndexFoldersDeletionListener() {
@Override
public void beforeIndexFoldersDeleted(Index index, IndexSettings indexSettings, Path[] indexPaths) {
public void beforeIndexFoldersDeleted(
Index index,
IndexSettings indexSettings,
Path[] indexPaths,
IndexRemovalReason reason
) {
deletedIndices.add(index);
}

@Override
public void beforeShardFoldersDeleted(ShardId shardId, IndexSettings indexSettings, Path[] shardPaths) {
public void beforeShardFoldersDeleted(
ShardId shardId,
IndexSettings indexSettings,
Path[] shardPaths,
IndexRemovalReason reason
) {
deletedShards.computeIfAbsent(shardId.getIndex(), i -> Collections.synchronizedList(new ArrayList<>())).add(shardId);
}
});
Expand Down
16 changes: 11 additions & 5 deletions server/src/main/java/org/elasticsearch/index/IndexService.java
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@
import org.elasticsearch.index.translog.Translog;
import org.elasticsearch.indices.breaker.CircuitBreakerService;
import org.elasticsearch.indices.cluster.IndicesClusterStateService;
import org.elasticsearch.indices.cluster.IndicesClusterStateService.AllocatedIndices.IndexRemovalReason;
import org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache;
import org.elasticsearch.indices.recovery.RecoveryState;
import org.elasticsearch.plugins.IndexStorePlugin;
Expand Down Expand Up @@ -494,7 +495,12 @@ public synchronized IndexShard createShard(
nodeEnv,
lock,
this.indexSettings,
shardPaths -> indexFoldersDeletionListener.beforeShardFoldersDeleted(shardId, this.indexSettings, shardPaths)
shardPaths -> indexFoldersDeletionListener.beforeShardFoldersDeleted(
shardId,
this.indexSettings,
shardPaths,
IndexRemovalReason.FAILURE
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a mis-categorisation as FAILURE. The javadoc seems to suggest it's deleting remnants of a different shard rather than the shard being created, due to a name collision. So we're deleting not because the shard failed to start, but to clear old state from a shard that used to have the same name as the one being started.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's OK to use FAILURE, but maybe worth a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the more I look at it I think it's vanilla enough to use without comment. It's just clearing some bad state which is the same as all the other cases. The fact that bad state came from an earlier event is kind of irrelevant.

);
path = ShardPath.loadShardPath(logger, nodeEnv, shardId, this.indexSettings.customDataPath());
} catch (Exception inner) {
Expand Down Expand Up @@ -704,11 +710,11 @@ private void onShardClose(ShardLock lock) {
try {
eventListener.beforeIndexShardDeleted(lock.getShardId(), indexSettings.getSettings());
} finally {
shardStoreDeleter.deleteShardStore("delete index", lock, indexSettings);
shardStoreDeleter.deleteShardStore("delete index", lock, indexSettings, IndexRemovalReason.DELETED);
eventListener.afterIndexShardDeleted(lock.getShardId(), indexSettings.getSettings());
}
} catch (IOException e) {
shardStoreDeleter.addPendingDelete(lock.getShardId(), indexSettings);
shardStoreDeleter.addPendingDelete(lock.getShardId(), indexSettings, IndexRemovalReason.DELETED);
logger.debug(() -> "[" + lock.getShardId().id() + "] failed to delete shard content - scheduled a retry", e);
}
}
Expand Down Expand Up @@ -1062,9 +1068,9 @@ public static Function<String, String> dateMathExpressionResolverAt(long instant
}

public interface ShardStoreDeleter {
void deleteShardStore(String reason, ShardLock lock, IndexSettings indexSettings) throws IOException;
void deleteShardStore(String reasonText, ShardLock lock, IndexSettings indexSettings, IndexRemovalReason reason) throws IOException;

void addPendingDelete(ShardId shardId, IndexSettings indexSettings);
void addPendingDelete(ShardId shardId, IndexSettings indexSettings, IndexRemovalReason reason);
}

public final EngineFactory getEngineFactory() {
Expand Down
72 changes: 42 additions & 30 deletions server/src/main/java/org/elasticsearch/indices/IndicesService.java
Original file line number Diff line number Diff line change
Expand Up @@ -1005,7 +1005,7 @@ public void removeIndex(
listener.afterIndexRemoved(indexService.index(), indexSettings, reason);
if (reason == IndexRemovalReason.DELETED) {
// now we are done - try to wipe data on disk if possible
deleteIndexStore(extraInfo, indexService.index(), indexSettings);
deleteIndexStore(extraInfo, indexService.index(), indexSettings, reason);
}
}));
});
Expand Down Expand Up @@ -1077,7 +1077,7 @@ public void deleteUnassignedIndex(String reason, IndexMetadata oldIndexMetadata,
+ "]"
);
}
deleteIndexStore(reason, oldIndexMetadata);
deleteIndexStore(reason, oldIndexMetadata, IndexRemovalReason.DELETED);
} catch (Exception e) {
logger.warn(() -> format("[%s] failed to delete unassigned index (reason [%s])", oldIndexMetadata.getIndex(), reason), e);
}
Expand All @@ -1090,7 +1090,7 @@ public void deleteUnassignedIndex(String reason, IndexMetadata oldIndexMetadata,
*
* Package private for testing
*/
void deleteIndexStore(String reason, IndexMetadata metadata) throws IOException {
void deleteIndexStore(String reasonText, IndexMetadata metadata, IndexRemovalReason reason) throws IOException {
if (nodeEnv.hasNodeFile()) {
synchronized (this) {
Index index = metadata.getIndex();
Expand All @@ -1108,33 +1108,35 @@ void deleteIndexStore(String reason, IndexMetadata metadata) throws IOException
}
}
final IndexSettings indexSettings = buildIndexSettings(metadata);
deleteIndexStore(reason, indexSettings.getIndex(), indexSettings);
deleteIndexStore(reasonText, indexSettings.getIndex(), indexSettings, reason);
}
}

private void deleteIndexStore(String reason, Index index, IndexSettings indexSettings) throws IOException {
deleteIndexStoreIfDeletionAllowed(reason, index, indexSettings, DEFAULT_INDEX_DELETION_PREDICATE);
private void deleteIndexStore(String reasonText, Index index, IndexSettings indexSettings, IndexRemovalReason reason)
throws IOException {
deleteIndexStoreIfDeletionAllowed(reasonText, index, indexSettings, DEFAULT_INDEX_DELETION_PREDICATE, reason);
}

private void deleteIndexStoreIfDeletionAllowed(
final String reason,
final String reasonText,
final Index index,
final IndexSettings indexSettings,
final IndexDeletionAllowedPredicate predicate
final IndexDeletionAllowedPredicate predicate,
final IndexRemovalReason reason
) throws IOException {
boolean success = false;
try {
// we are trying to delete the index store here - not a big deal if the lock can't be obtained
// the store metadata gets wiped anyway even without the lock this is just best effort since
// every shards deletes its content under the shard lock it owns.
logger.debug("{} deleting index store reason [{}]", index, reason);
logger.debug("{} deleting index store reason [{}]", index, reasonText);
if (predicate.apply(index, indexSettings)) {
// its safe to delete all index metadata and shard data
nodeEnv.deleteIndexDirectorySafe(
index,
0,
indexSettings,
paths -> indexFoldersDeletionListeners.beforeIndexFoldersDeleted(index, indexSettings, paths)
paths -> indexFoldersDeletionListeners.beforeIndexFoldersDeleted(index, indexSettings, paths, reason)
);
}
success = true;
Expand All @@ -1144,7 +1146,7 @@ private void deleteIndexStoreIfDeletionAllowed(
logger.warn(() -> format("%s failed to delete index", index), ex);
} finally {
if (success == false) {
addPendingDelete(index, indexSettings);
addPendingDelete(index, indexSettings, reason);
}
// this is a pure protection to make sure this index doesn't get re-imported as a dangling index.
// we should in the future rather write a tombstone rather than wiping the metadata.
Expand All @@ -1154,19 +1156,20 @@ private void deleteIndexStoreIfDeletionAllowed(

/**
* Deletes the shard with an already acquired shard lock.
* @param reason the reason for the shard deletion
* @param reasonText the reason for the shard deletion
* @param lock the lock of the shard to delete
* @param indexSettings the shards index settings.
* @throws IOException if an IOException occurs
*/
@Override
public void deleteShardStore(String reason, ShardLock lock, IndexSettings indexSettings) throws IOException {
public void deleteShardStore(String reasonText, ShardLock lock, IndexSettings indexSettings, IndexRemovalReason reason)
throws IOException {
ShardId shardId = lock.getShardId();
logger.trace("{} deleting shard reason [{}]", shardId, reason);
logger.trace("{} deleting shard reason [{}]", shardId, reasonText);
nodeEnv.deleteShardDirectoryUnderLock(
lock,
indexSettings,
paths -> indexFoldersDeletionListeners.beforeShardFoldersDeleted(shardId, indexSettings, paths)
paths -> indexFoldersDeletionListeners.beforeShardFoldersDeleted(shardId, indexSettings, paths, reason)
);
}

Expand All @@ -1178,13 +1181,14 @@ public void deleteShardStore(String reason, ShardLock lock, IndexSettings indexS
* On data nodes, if the deleted shard is the last shard folder in its index, the method will attempt to remove
* the index folder as well.
*
* @param reason the reason for the shard deletion
* @param reasonText the reason for the shard deletion
* @param shardId the shards ID to delete
* @param clusterState . This is required to access the indexes settings etc.
* @param reason The reason for the deletion (as an enum)
* @throws IOException if an IOException occurs
*/
public void deleteShardStore(String reason, ShardId shardId, ClusterState clusterState) throws IOException,
ShardLockObtainFailedException {
public void deleteShardStore(String reasonText, ShardId shardId, ClusterState clusterState, IndexRemovalReason reason)
throws IOException, ShardLockObtainFailedException {
final IndexMetadata metadata = clusterState.getMetadata().getProject().indices().get(shardId.getIndexName());

final IndexSettings indexSettings = buildIndexSettings(metadata);
Expand All @@ -1195,15 +1199,15 @@ public void deleteShardStore(String reason, ShardId shardId, ClusterState cluste
nodeEnv.deleteShardDirectorySafe(
shardId,
indexSettings,
paths -> indexFoldersDeletionListeners.beforeShardFoldersDeleted(shardId, indexSettings, paths)
paths -> indexFoldersDeletionListeners.beforeShardFoldersDeleted(shardId, indexSettings, paths, reason)
);
logger.debug("{} deleted shard reason [{}]", shardId, reason);
logger.debug("{} deleted shard reason [{}]", shardId, reasonText);

if (canDeleteIndexContents(shardId.getIndex())) {
if (nodeEnv.findAllShardIds(shardId.getIndex()).isEmpty()) {
try {
// note that deleteIndexStore have more safety checks and may throw an exception if index was concurrently created.
deleteIndexStore("no longer used", metadata);
deleteIndexStore("no longer used", metadata, reason);
} catch (Exception e) {
// wrap the exception to indicate we already deleted the shard
throw new ElasticsearchException("failed to delete unused index after deleting its last shard (" + shardId + ")", e);
Expand Down Expand Up @@ -1259,7 +1263,7 @@ public IndexMetadata verifyIndexIsDeleted(final Index index, final ClusterState
}
final IndexSettings indexSettings = buildIndexSettings(metadata);
try {
deleteIndexStoreIfDeletionAllowed("stale deleted index", index, indexSettings, ALWAYS_TRUE);
deleteIndexStoreIfDeletionAllowed("stale deleted index", index, indexSettings, ALWAYS_TRUE, IndexRemovalReason.DELETED);
} catch (Exception e) {
// we just warn about the exception here because if deleteIndexStoreIfDeletionAllowed
// throws an exception, it gets added to the list of pending deletes to be tried again
Expand Down Expand Up @@ -1317,22 +1321,22 @@ private IndexSettings buildIndexSettings(IndexMetadata metadata) {
* Adds a pending delete for the given index shard.
*/
@Override
public void addPendingDelete(ShardId shardId, IndexSettings settings) {
public void addPendingDelete(ShardId shardId, IndexSettings settings, IndexRemovalReason reason) {
if (shardId == null) {
throw new IllegalArgumentException("shardId must not be null");
}
if (settings == null) {
throw new IllegalArgumentException("settings must not be null");
}
PendingDelete pendingDelete = new PendingDelete(shardId, settings);
PendingDelete pendingDelete = new PendingDelete(shardId, settings, reason);
addPendingDelete(shardId.getIndex(), pendingDelete);
}

/**
* Adds a pending delete for the given index.
*/
public void addPendingDelete(Index index, IndexSettings settings) {
PendingDelete pendingDelete = new PendingDelete(index, settings);
public void addPendingDelete(Index index, IndexSettings settings, IndexRemovalReason reason) {
PendingDelete pendingDelete = new PendingDelete(index, settings, reason);
addPendingDelete(index, pendingDelete);
}

Expand All @@ -1348,25 +1352,28 @@ private static final class PendingDelete implements Comparable<PendingDelete> {
final int shardId;
final IndexSettings settings;
final boolean deleteIndex;
final IndexRemovalReason reason;

/**
* Creates a new pending delete of an index
*/
PendingDelete(ShardId shardId, IndexSettings settings) {
PendingDelete(ShardId shardId, IndexSettings settings, IndexRemovalReason reason) {
this.index = shardId.getIndex();
this.shardId = shardId.getId();
this.settings = settings;
this.deleteIndex = false;
this.reason = reason;
}

/**
* Creates a new pending delete of a shard
*/
PendingDelete(Index index, IndexSettings settings) {
PendingDelete(Index index, IndexSettings settings, IndexRemovalReason reason) {
this.index = index;
this.shardId = -1;
this.settings = settings;
this.deleteIndex = true;
this.reason = reason;
}

@Override
Expand Down Expand Up @@ -1430,7 +1437,12 @@ public void processPendingDeletes(Index index, IndexSettings indexSettings, Time
nodeEnv.deleteIndexDirectoryUnderLock(
index,
indexSettings,
paths -> indexFoldersDeletionListeners.beforeIndexFoldersDeleted(index, indexSettings, paths)
paths -> indexFoldersDeletionListeners.beforeIndexFoldersDeleted(
index,
indexSettings,
paths,
delete.reason
)
);
iterator.remove();
} catch (IOException ex) {
Expand All @@ -1442,7 +1454,7 @@ public void processPendingDeletes(Index index, IndexSettings indexSettings, Time
final ShardLock shardLock = locks.get(shardId);
if (shardLock != null) {
try {
deleteShardStore("pending delete", shardLock, delete.settings);
deleteShardStore("pending delete", shardLock, delete.settings, delete.reason);
iterator.remove();
} catch (IOException ex) {
logger.debug(() -> format("%s retry pending delete", shardLock.getShardId()), ex);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
import org.elasticsearch.index.Index;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.shard.ShardId;
import org.elasticsearch.indices.cluster.IndicesClusterStateService.AllocatedIndices.IndexRemovalReason;
import org.elasticsearch.plugins.IndexStorePlugin;

import java.nio.file.Path;
Expand All @@ -31,10 +32,10 @@ public CompositeIndexFoldersDeletionListener(List<IndexStorePlugin.IndexFoldersD
}

@Override
public void beforeIndexFoldersDeleted(Index index, IndexSettings indexSettings, Path[] indexPaths) {
public void beforeIndexFoldersDeleted(Index index, IndexSettings indexSettings, Path[] indexPaths, IndexRemovalReason reason) {
for (IndexStorePlugin.IndexFoldersDeletionListener listener : listeners) {
try {
listener.beforeIndexFoldersDeleted(index, indexSettings, indexPaths);
listener.beforeIndexFoldersDeleted(index, indexSettings, indexPaths, reason);
} catch (Exception e) {
assert false : new AssertionError(e);
throw e;
Expand All @@ -43,10 +44,10 @@ public void beforeIndexFoldersDeleted(Index index, IndexSettings indexSettings,
}

@Override
public void beforeShardFoldersDeleted(ShardId shardId, IndexSettings indexSettings, Path[] shardPaths) {
public void beforeShardFoldersDeleted(ShardId shardId, IndexSettings indexSettings, Path[] shardPaths, IndexRemovalReason reason) {
for (IndexStorePlugin.IndexFoldersDeletionListener listener : listeners) {
try {
listener.beforeShardFoldersDeleted(shardId, indexSettings, shardPaths);
listener.beforeShardFoldersDeleted(shardId, indexSettings, shardPaths, reason);
} catch (Exception e) {
assert false : new AssertionError(e);
throw e;
Expand Down
Loading
Loading