[Test] More reliable wait for index to appear #126437

ywangd · 2025-04-08T02:14:45Z

Relates: #125652
Resolves: #126204


elasticsearchmachine · 2025-04-08T02:15:09Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

DaveCTurner

Couple of nits otherwise LGTM.

There are a few other places where we call indexExists() inside an assertBusy(), and they probably want the same treatment.
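For illustration, the difference between polling a condition in a retry loop (the `assertBusy(indexExists)` shape) and blocking once on a future that completes when the condition holds can be sketched in plain Java. Everything below is hypothetical: `IndexRegistrySketch`, `pollUntilExists`, and `whenExists` are invented stand-ins, not Elasticsearch classes or test helpers, and whether the health request behaves like the second style internally is an assumption rather than something this thread states.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for "the set of indices in the cluster"; not an Elasticsearch class.
class IndexRegistrySketch {
    private final Set<String> indices = ConcurrentHashMap.newKeySet();
    private final CopyOnWriteArrayList<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // Record the index first, then notify every registered listener.
    void createIndex(String name) {
        indices.add(name);
        for (Consumer<String> listener : listeners) {
            listener.accept(name);
        }
    }

    boolean indexExists(String name) {
        return indices.contains(name);
    }

    // Polling style: re-check the condition every 10 ms until the deadline passes.
    boolean pollUntilExists(String name, long timeoutMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (deadline - System.nanoTime() > 0) {
            if (indexExists(name)) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                return false;
            }
        }
        return indexExists(name); // one final check at the deadline
    }

    // Listener style: a single future that completes once the index has been created.
    CompletableFuture<Void> whenExists(String name) {
        final CompletableFuture<Void> future = new CompletableFuture<>();
        final Consumer<String> listener = created -> {
            if (created.equals(name)) {
                future.complete(null);
            }
        };
        listeners.add(listener);
        // Re-check after registering so a creation that raced with registration is not missed.
        if (indexExists(name)) {
            future.complete(null);
        }
        // Deregister the listener once the future is done, however it completed.
        future.whenComplete((result, error) -> listeners.remove(listener));
        return future;
    }
}
```

In this sketch the caller of `whenExists` blocks on one future with one timeout, whereas the caller of `pollUntilExists` depends on both the polling interval and the overall deadline.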

DaveCTurner · 2025-04-08T06:08:19Z

...src/internalClusterTest/java/org/elasticsearch/snapshots/SharedClusterSnapshotRestoreIT.java

@@ -748,7 +748,7 @@ public void testDeletionOfFailingToRecoverIndexShouldStopRestore() throws Except

        logger.info("--> wait for the index to appear");
        // that would mean that recovery process started and failing
-        waitForIndex("test-idx", TimeValue.timeValueSeconds(10));


This is the only caller of waitForIndex so I suggest removing the method itself too.

DaveCTurner · 2025-04-08T06:11:35Z

...src/internalClusterTest/java/org/elasticsearch/snapshots/SharedClusterSnapshotRestoreIT.java

@@ -748,7 +748,7 @@ public void testDeletionOfFailingToRecoverIndexShouldStopRestore() throws Except

        logger.info("--> wait for the index to appear");
        // that would mean that recovery process started and failing
-        waitForIndex("test-idx", TimeValue.timeValueSeconds(10));
+        safeGet(clusterAdmin().prepareHealth(TEST_REQUEST_TIMEOUT, "test-idx").execute());


No need to set a 30s timeout on the request if we're only waiting for 10s:

Suggested change

-        safeGet(clusterAdmin().prepareHealth(TEST_REQUEST_TIMEOUT, "test-idx").execute());
+        safeGet(clusterAdmin().prepareHealth(SAFE_AWAIT_TIMEOUT, "test-idx").execute());
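To illustrate the point about mismatched timeouts: when the caller stops waiting before the request's own timeout can fire, the longer request timeout never takes effect. The sketch below is plain Java with hypothetical names (`AwaitBoundSketch`, `boundedGet`, `neverSatisfiedRequest`); it only mimics the shape of a `safeGet`-style helper and is not the Elasticsearch implementation.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; none of these names come from Elasticsearch.
class AwaitBoundSketch {

    // Wait for the future for at most awaitTimeout, failing the test on any problem.
    static <T> T boundedGet(Future<T> future, Duration awaitTimeout) {
        try {
            return future.get(awaitTimeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new AssertionError("gave up waiting after " + awaitTimeout, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new AssertionError("request failed", e.getCause());
        }
    }

    // A request whose condition is never met: it only answers "timed_out"
    // once its own request timeout has elapsed (requires Java 9+).
    static CompletableFuture<String> neverSatisfiedRequest(Duration requestTimeout) {
        return new CompletableFuture<String>()
            .completeOnTimeout("timed_out", requestTimeout.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

With a request timeout longer than the await bound, `boundedGet` throws before the request can report its own timeout; with a request timeout shorter than the bound, the caller receives the `"timed_out"` response instead.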


nielsbauman · 2025-04-09T07:19:29Z

As far as I am concerned, the cluster health API is also a candidate for being run on the local node by default - it's also listed in #101805. I still think that's valuable and worth doing, but that API will probably need the option to run on the master node, as we'll want to be able to wait for all tasks to complete on the master node - at least in tests, not sure how valuable that is in production.

This is not something we need to discuss on this PR - I only brought it up here because I noticed we're using the cluster health API here to check the index health - but we might need to discuss an approach sometime later.

DaveCTurner · 2025-04-09T07:31:45Z

++ yeah the cluster health API is kind of a weird one, it's so heavily used in tests to wait for the master node state that I think making it node-local will be fraught with difficulty. We could for instance wait for the node-local state and then reach out to the master to wait for the application to complete? But also it's not particularly heavyweight, it might not be worth the effort to migrate it.


[Test] More reliable wait for index to appear

bf44b33

Relates: elastic#125652 Resolves: elastic#126204

ywangd added the labels >test (Issues or PRs that are addressing/adding tests), :Distributed Coordination/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs), and v9.1.0 Apr 8, 2025

ywangd requested a review from DaveCTurner April 8, 2025 02:14

elasticsearchmachine added the label Team:Distributed Coordination (Meta label for Distributed Coordination team) Apr 8, 2025

ywangd mentioned this pull request Apr 8, 2025

[CI] SharedClusterSnapshotRestoreIT testDeletionOfFailingToRecoverIndexShouldStopRestore failing #126204

Closed

DaveCTurner approved these changes Apr 8, 2025


ywangd added 2 commits April 9, 2025 13:20

review comments

119896b

Merge remote-tracking branch 'origin/main' into es-126204-fix

eba508b

ywangd added the label auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!) Apr 9, 2025

fix merge

39d2270

elasticsearchmachine merged commit 4f9bfb0 into elastic:main Apr 9, 2025
17 checks passed

ywangd deleted the es-126204-fix branch April 9, 2025 04:31

ywangd added a commit to ywangd/elasticsearch that referenced this pull request Apr 9, 2025

Replace assertBusy of indexExists

123e9c9

Relates: elastic#126437

ywangd mentioned this pull request Apr 9, 2025

Replace assertBusy of indexExists #126501

Merged

elasticsearchmachine pushed a commit that referenced this pull request Apr 10, 2025

Replace assertBusy of indexExists (#126501)

62636f9

Relates: #126437 (review)
