Retry shard movements during ESQL query #126653

Conversation
Hi @idegtiarenko, I've created a changelog YAML for you.
private void maybeScheduleRetry(ShardId shardId, Exception e) {
    if (targetShards.getShard(shardId).remainingNodes.isEmpty()
        && unwrapFailure(e) instanceof NoShardAvailableActionException) {
This relies on org.elasticsearch.xpack.esql.plugin.DataNodeRequestSender#unwrapFailure, which uses:

server/src/main/java/org/elasticsearch/action/support/TransportActions.java (lines 22 to 30 in a59c182)

public static boolean isShardNotAvailableException(final Throwable e) {
    final Throwable actual = ExceptionsHelper.unwrapCause(e);
    return (actual instanceof ShardNotFoundException
        || actual instanceof IndexNotFoundException
        || actual instanceof IllegalIndexShardStateException
        || actual instanceof NoShardAvailableActionException
        || actual instanceof UnavailableShardsException
        || actual instanceof AlreadyClosedException);
}
Probably this should only retry ShardNotFoundException/NoShardAvailableActionException/UnavailableShardsException. Not sure. Please let me know what you think.
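For illustration, a minimal sketch of that narrower check, assuming the three exception types above are the ones worth retrying; the helper name is hypothetical, not part of this PR:

private static boolean isRetryableMovementFailure(Exception e) {
    // Unwrap the failure as unwrapFailure does, then retry only exceptions
    // that indicate the shard moved or is temporarily unassigned.
    Throwable actual = ExceptionsHelper.unwrapCause(e);
    return actual instanceof ShardNotFoundException
        || actual instanceof NoShardAvailableActionException
        || actual instanceof UnavailableShardsException;
}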
# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/DataNodeRequestSender.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plugin/DataNodeRequestSenderTests.java
.index(index)
.shard(0)
.primaryShard()
.currentNodeId();
I need to check if this could be replaced by a proper API call instead.
Pinging @elastic/es-analytical-engine (Team:Analytics)
}

try (EsqlQueryResponse resp = run("FROM " + randomFrom("index-1,index-2", "index-*"))) {
    assertThat(getValuesList(resp), hasSize(2));
Do we send back the node id that we ran the driver on when we profile? Could we add that as a double-check?
I would like to avoid it. We do not really know the source of each row, nor where the shards are currently allocated.
We would also need to exclude the coordinating node, as it participates in the query and might or might not contain (or used to contain) shards participating in the query.
👍
I left two comments. Thanks @idegtiarenko
import static org.elasticsearch.xpack.esql.EsqlTestUtils.getValuesList;
import static org.hamcrest.Matchers.hasSize;

public class DataNodeRequestSenderIT extends AbstractEsqlIntegTestCase {
Can we make this test similar to SearchWhileRelocatingIT? We would keep running ES|QL on one thread while moving shards back and forth between two sets of nodes.
Added
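For reference, a minimal sketch of what such a test could look like, assuming the AbstractEsqlIntegTestCase helpers (run, getValuesList) used elsewhere in this PR; the index names, document counts, and iteration count are illustrative, and a real test would also collect failures from the query thread:

public void testEsqlWhileRelocating() throws Exception {
    List<String> nodes = internalCluster().startDataOnlyNodes(4);
    client().prepareIndex("index-1").setSource("v", 1).get();
    client().prepareIndex("index-2").setSource("v", 2).get();
    refresh("index-*");

    AtomicBoolean stopped = new AtomicBoolean();
    Thread queries = new Thread(() -> {
        // Keep querying while shards relocate; every run must still see both rows.
        while (stopped.get() == false) {
            try (EsqlQueryResponse resp = run("FROM index-*")) {
                assertThat(getValuesList(resp), hasSize(2));
            }
        }
    });
    queries.start();

    // Move shards back and forth between the two halves of the data nodes.
    for (int i = 0; i < 10; i++) {
        String excluded = String.join(",", i % 2 == 0 ? nodes.subList(0, 2) : nodes.subList(2, 4));
        updateClusterSettings(Settings.builder().put("cluster.routing.allocation.exclude._name", excluded));
        ensureGreen("index-1", "index-2");
    }

    stopped.set(true);
    queries.join();
}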
if (pendingRetries.isEmpty() == false && remainingTargetShardSearchAttempts.decrementAndGet() > 0) {
    ongoingTargetShardResolutionAttempts.incrementAndGet();
    var indices = pendingRetries.stream().map(ShardId::getIndexName).distinct().toArray(String[]::new);
    searchShards(indices, pendingRetries::contains, computeListener.acquireAvoid().delegateFailure((l, newSearchShards) -> {
I think we need to make changes to the search_shards API to allow bypassing can_match in this case. We only need the up-to-date routing table here.
Good point. I think we can have a separate action for it to make it simpler.
}

try (EsqlQueryResponse resp = run("FROM " + randomFrom("index-1,index-2", "index-*"))) {
    assertThat(getValuesList(resp), hasSize(2));
👍
.cluster()
.prepareUpdateSettings(TEST_REQUEST_TIMEOUT, TEST_REQUEST_TIMEOUT)
.setPersistentSettings(Settings.builder().put("cluster.routing.allocation.exclude._name", name))
.get();
Fun
I have some more comments. Thanks @idegtiarenko.
@@ -131,11 +131,8 @@ public void searchShards(Task task, SearchShardsRequest searchShardsRequest, Act
    listener.delegateFailureAndWrap((delegate, searchRequest) -> {
        Index[] concreteIndices = resolvedIndices.getConcreteLocalIndices();
        final Set<ResolvedExpression> indicesAndAliases = indexNameExpressionResolver.resolveExpressions(
Leftover?
Yeap, looks like a leftover after merging 2adb36e#diff-e9bf1f63e5fb1069f6fd3e4a7fb3b1fa44ff67a60c10dd9bb5f74caa40f2f3e3
shardId,
project.routingTable()
    .shardRoutingTable(shardId)
    .allShards()
I think we should return only shards with the search role here. Do we need an action for this? Maybe just a helper method in DataNodeRequestSender?
I guess it is possible, although it would require a bit more dependency management to bring ClusterService and ProjectResolver into DataNodeRequestSender.
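For illustration, one hypothetical shape of such a helper: read node assignments straight from the coordinator's cluster state and keep only search-capable copies. It assumes a ClusterService handle is wired in; none of the names are final:

Map<ShardId, List<DiscoveryNode>> doResolveShards(Set<ShardId> shardIds) {
    ClusterState state = clusterService.state();
    Map<ShardId, List<DiscoveryNode>> resolutions = new HashMap<>();
    for (ShardId shardId : shardIds) {
        List<DiscoveryNode> nodes = new ArrayList<>();
        for (ShardRouting routing : state.routingTable().shardRoutingTable(shardId).activeShards()) {
            // Keep only copies that can serve searches, per the comment above.
            if (routing.role().isSearchable()) {
                nodes.add(state.nodes().get(routing.currentNodeId()));
            }
        }
        resolutions.put(shardId, nodes);
    }
    return resolutions;
}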
    trySendingRequestsForPendingShards(targetShards, computeListener);
    l.onResponse(null);
}));
}
Should the other branch be in an else?
Not necessary. We can make progress without waiting for the moved shard(s) to be resolved when there are other shards in the queue.
ongoingTargetShardResolutionAttempts.incrementAndGet();
resolveShards(pendingRetries, computeListener.acquireAvoid().delegateFailure((l, resolutions) -> {
    for (var entry : resolutions.entrySet()) {
        targetShards.shards.get(entry.getKey()).remainingNodes.addAll(entry.getValue());
Couldn't we end up querying the same target node up to 10 times here, if we keep failing on the same node with an unavailable-shard exception?
I do not follow. If the shard is unavailable, it should no longer be in the routing table and should not be passed here.
Here we are using the cluster state on the coordinator, which might not be up to date.
This is generally okay, as we are going to retry more than once.
We could make it a transport action again and run it on the elected master by extending TransportMasterNodeAction,
but even then there is always a possibility of the state changing while we are waiting for the response.
@@ -433,7 +461,7 @@ void searchShards(
            skippedShards++;
            continue;
        }
-       List<DiscoveryNode> allocatedNodes = new ArrayList<>(group.allocatedNodes().size());
+       List<DiscoveryNode> allocatedNodes = Collections.synchronizedList(new ArrayList<>(group.allocatedNodes().size()));
I wonder if we should make the logic that resolves the new target nodes a helper method and call it under sendingLock to avoid handling concurrency.
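For illustration, a rough sketch of that suggestion, reusing names from the diffs above; how sendingLock is acquired and how pendingShardIds is repopulated are assumptions, not the final code:

private void addResolutionsUnderLock(Map<ShardId, List<DiscoveryNode>> resolutions) {
    // Mutate remainingNodes only while holding sendingLock, so it never races
    // with the sending loop and the per-shard node lists need no extra synchronization.
    sendingLock.lock();
    try {
        for (var entry : resolutions.entrySet()) {
            targetShards.getShard(entry.getKey()).remainingNodes.addAll(entry.getValue());
            pendingShardIds.add(entry.getKey());
        }
    } finally {
        sendingLock.unlock();
    }
}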
# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/DataNodeComputeHandler.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/DataNodeRequestSender.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plugin/DataNodeRequestSenderTests.java
Force-pushed from 39ada69 to b2819b6
Thanks @idegtiarenko.
@@ -272,6 +301,7 @@ public void onFailure(Exception e, boolean receivedData) {
        for (ShardId shardId : request.shardIds) {
            trackShardLevelFailure(shardId, receivedData, e);
            pendingShardIds.add(shardId);
+           maybeScheduleRetry(shardId, e);
I think we can retry only if receivedData is false.
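A minimal sketch of that guard, assuming the onFailure loop from the diff above; the rationale in the comment is one plausible reading of the suggestion:

for (ShardId shardId : request.shardIds) {
    trackShardLevelFailure(shardId, receivedData, e);
    pendingShardIds.add(shardId);
    // Only retry shards that produced no data yet; pages already received
    // from the failed node could otherwise be computed twice.
    if (receivedData == false) {
        maybeScheduleRetry(shardId, e);
    }
}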
@@ -466,4 +509,25 @@ void searchShards(
            new ActionListenerResponseHandler<>(searchShardsListener, SearchShardsResponse::new, esqlExecutor)
        );
    }

+   void resolveShards(Set<ShardId> shardIds, ActionListener<Map<ShardId, List<DiscoveryNode>>> listener) {
+       ActionListener.completeWith(listener, () -> doResolveShards(shardIds));
Can we just return this without a listener?
See the comment below.
if (pendingRetries.isEmpty() == false && remainingTargetShardSearchAttempts.getAndDecrement() > 0) {
    ongoingTargetShardResolutionAttempts.incrementAndGet();
    resolveShards(pendingRetries, computeListener.acquireAvoid().delegateFailure((l, resolutions) -> {
Can we move this retry to trySendingRequestsForPendingShards under sendingLock, to avoid handling concurrency?
This is possible in principle, but it would block sending more requests while the failure is being resolved, and we do not necessarily need to do that.
It also depends on whether this has to be async (for example, if we want to move the resolution to the elected master).
ongoingTargetShardResolutionAttempts.incrementAndGet();
resolveShards(pendingRetries, computeListener.acquireAvoid().delegateFailure((l, resolutions) -> {
    for (var entry : resolutions.entrySet()) {
        targetShards.shards.get(entry.getKey()).remainingNodes.addAll(entry.getValue());
Here we are using the cluster state on the coordinator, which might not be up to date.
Talked to @idegtiarenko offline. He will make changes to address the race condition between retry and sendingRequests. With the upcoming changes, the PR should be good to go. Thank you for all the iterations!
Force-pushed from b53d64c to ef0ffef
Today we fail the entire query if a shard is unavailable (because it has moved to another node) and there are no other shard copies to retry.
This change schedules another round of shard location resolution in such cases, so the query does not fail.
Closes: #125947
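Putting the pieces from the diffs above together, a condensed, non-authoritative sketch of the retry flow; control flow is simplified and the pendingRetries bookkeeping is abbreviated:

private void maybeScheduleRetry(ShardId shardId, Exception e) {
    // If every known copy of the shard has been tried and the failure says the
    // shard is simply not where we thought it was, queue it for re-resolution.
    if (targetShards.getShard(shardId).remainingNodes.isEmpty()
        && unwrapFailure(e) instanceof NoShardAvailableActionException) {
        pendingRetries.add(shardId);
    }
}

// Later, while attempts remain, re-resolve the queued shards from the current
// routing table and feed the new nodes back into the sending loop.
if (pendingRetries.isEmpty() == false && remainingTargetShardSearchAttempts.getAndDecrement() > 0) {
    resolveShards(pendingRetries, computeListener.acquireAvoid().delegateFailure((l, resolutions) -> {
        for (var entry : resolutions.entrySet()) {
            targetShards.shards.get(entry.getKey()).remainingNodes.addAll(entry.getValue());
        }
        trySendingRequestsForPendingShards(targetShards, computeListener);
        l.onResponse(null);
    }));
}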