[CI] DiskThresholdDeciderIT testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleRestores failing #127711

Closed
elasticsearchmachine opened this issue May 5, 2025 · 1 comment
Labels
  • :Distributed Coordination/Allocation: All issues relating to the decision making around placing a shard (both master logic & on the nodes)
  • medium-risk: An open issue or test failure that is a medium risk to future releases
  • Team:Distributed Coordination: Meta label for Distributed Coordination team
  • >test-failure: Triaged test failures from CI

Comments

Collaborator

elasticsearchmachine commented May 5, 2025

Build Scans:

Reproduction Line:

./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleRestores" -Dtests.seed=29F197D13958671B -Dtests.locale=en-KY -Dtests.timezone=Antarctica/Palmer -Druntime.java=24

Applicable branches:
8.19

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.util.NoSuchElementException: null

Issue Reasons:

  • [8.19] 4 failures in test testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleRestores (5.1% fail rate in 78 executions)
  • [8.19] 2 failures in pipeline elasticsearch-periodic-platform-support (100.0% fail rate in 2 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

elasticsearchmachine added the >test-failure, :Distributed Coordination/Allocation, needs:risk (requires assignment of a risk label: low, medium, blocker), and Team:Distributed Coordination labels May 5, 2025
Collaborator Author

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

JeremyDahlgren added a commit that referenced this issue May 5, 2025
The test launches two concurrent restores and wants to verify that
the node with limited disk space is only assigned a single shard from
one of the indices.  The test was asserting that it had one shard from
the first index, but it is possible for it to get one shard from the
index copy instead.  This change allows the shard to be from either
index, but still asserts there is only one assignment to the tiny node.

Closes #127711

(cherry picked from commit 6263f44)
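
The relaxed assertion described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the code from the referenced commit: the class name, the method names, the node IDs, and the index names `index` and `index-copy` are all hypothetical, and the shard layout is modeled as a plain map from node ID to the index names of the shards it holds rather than as real cluster state.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TinyNodeAssertionSketch {

    // Relaxed check: the node holds exactly one shard, and that shard
    // belongs to any one of the allowed indices.
    static boolean holdsSingleShardFromAnyOf(
            Map<String, List<String>> shardsByNode, String nodeId, Set<String> allowedIndices) {
        List<String> indices = shardsByNode.getOrDefault(nodeId, List.of());
        return indices.size() == 1 && allowedIndices.contains(indices.get(0));
    }

    // Strict check: the single shard must come from one specific index.
    // This is the kind of expectation that breaks when the other restore wins.
    static boolean holdsSingleShardFrom(
            Map<String, List<String>> shardsByNode, String nodeId, String indexName) {
        return holdsSingleShardFromAnyOf(shardsByNode, nodeId, Set.of(indexName));
    }

    public static void main(String[] args) {
        // Hypothetical outcome of two concurrent restores: the tiny node
        // received its one shard from the copy rather than the first index.
        Map<String, List<String>> shardsByNode = Map.of(
            "tiny-node", List.of("index-copy"),
            "large-node", List.of("index", "index", "index-copy"));

        // Strict expectation fails for this (valid) outcome.
        System.out.println(holdsSingleShardFrom(shardsByNode, "tiny-node", "index"));

        // Relaxed expectation accepts a shard from either index,
        // while still requiring exactly one assignment.
        System.out.println(holdsSingleShardFromAnyOf(
            shardsByNode, "tiny-node", Set.of("index", "index-copy")));
    }
}
```

With the layout above, the strict check prints `false` and the relaxed check prints `true`. The invariant the test cares about (only one shard on the space-constrained node) is preserved either way.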
JeremyDahlgren added the medium-risk label and removed the needs:risk label May 5, 2025