[CI] RepositoryAnalysisFailureIT testFailsOnWriteException failing #126747

elasticsearchmachine · 2025-04-12T21:45:28Z

Build Scans:

Reproduction Line:

./gradlew ":x-pack:plugin:snapshot-repo-test-kit:internalClusterTest" --tests "org.elasticsearch.repositories.blobstore.testkit.analyze.RepositoryAnalysisFailureIT.testFailsOnWriteException" -Dtests.seed=5E27405E3C54F01 -Dtests.locale=es-GQ -Dtests.timezone=America/Chicago -Druntime.java=24

Applicable branches:
main

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: safeGet: listener was not completed within the timeout

Issue Reasons:

[main] 4 failures in test testFailsOnWriteException (0.5% fail rate in 795 executions)
[main] 2 failures in step part-2 (0.8% fail rate in 249 executions)
[main] 2 failures in pipeline elasticsearch-pull-request (0.8% fail rate in 248 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.



elasticsearchmachine · 2025-04-12T21:45:32Z

This has been muted on branch main

Mute Reasons:

[main] 3 failures in test testFailsOnWriteException (0.4% fail rate in 682 executions)
[main] 2 failures in step part-2 (0.8% fail rate in 243 executions)
[main] 2 failures in pipeline elasticsearch-pull-request (0.8% fail rate in 242 executions)

Build Scans:

elasticsearchmachine · 2025-04-12T21:46:02Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

With the addition of copy coverage in the repository analyzer, blob count is no longer 1:1 with blob analyzer request count: requests that create a copy count as two blobs. This can cause testFailsOnWriteException to fail intermittently, because the test randomly injects a failure somewhere between the first and the blobCount-th request, and that request may never happen if enough of the requests create copies. The simple fix is to inject the failure within the first blobCount/2 requests, which we will always reach even if every request generates a copy. An alternative would be to add a knob to the request to disallow copies and use that during this test. Closes elastic#126747

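The arithmetic behind that fix can be sketched in a few lines of Java. This is only an illustration of the bound, not the code from the test: the class and method names are hypothetical, and the `Math.max(1, ...)` guard for very small blob counts is an assumption added so the sketch stays valid for every positive `blobCount`.

```java
import java.util.Random;

// Hypothetical sketch, not the actual RepositoryAnalysisFailureIT code.
class FailureInjectionSketch {

    // Fewest analyzer requests that can occur: every request also creates
    // a copy, so each request accounts for two blobs (rounded up).
    static int minRequestCount(int blobCount) {
        return (blobCount + 1) / 2;
    }

    // Old choice: fail on a request index in [1, blobCount]. That index may
    // be larger than the number of requests actually issued.
    static int oldFailureIndex(Random random, int blobCount) {
        return 1 + random.nextInt(blobCount);
    }

    // Fixed choice: fail on a request index in [1, blobCount / 2], which never
    // exceeds minRequestCount. The max(1, ...) guard is an assumption here.
    static int fixedFailureIndex(Random random, int blobCount) {
        return 1 + random.nextInt(Math.max(1, blobCount / 2));
    }

    public static void main(String[] args) {
        Random random = new Random();
        int blobCount = 10;
        int worstCase = minRequestCount(blobCount);
        int oldMisses = 0;
        for (int i = 0; i < 1000; i++) {
            if (oldFailureIndex(random, blobCount) > worstCase) {
                oldMisses++; // failure would never be injected in the worst case
            }
            if (fixedFailureIndex(random, blobCount) > worstCase) {
                throw new AssertionError("fixed bound should always be reachable");
            }
        }
        System.out.println("old bound unreachable in " + oldMisses + " of 1000 draws");
    }
}
```

When the chosen index is never reached, no failure is injected, which is consistent with the listener timeout reported in the failure message.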

The fix to elastic#126747 was only for one test. This applies that change to all the tests in this suite that need it.


elasticsearchmachine added the :Distributed Coordination/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs) and >test-failure (Triaged test failures from CI) labels Apr 12, 2025

elasticsearchmachine added a commit that referenced this issue Apr 12, 2025

Mute org.elasticsearch.repositories.blobstore.testkit.analyze.Reposit…

4ed1a00

…oryAnalysisFailureIT testFailsOnWriteException #126747

elasticsearchmachine added the needs:risk (Requires assignment of a risk label (low, medium, blocker)) and Team:Distributed Coordination (Meta label for Distributed Coordination team) labels Apr 12, 2025

bcully mentioned this issue Apr 13, 2025

RepositoryAnalysisFailureIT: fix testFailsOnWriteException #126750

Merged

bcully closed this as completed in #126750 Apr 14, 2025

bcully added the rca:random-controlled (test failed due to randomization, and is reproducible given the seed) label Apr 14, 2025

bcully mentioned this issue Apr 17, 2025

[CI] RepositoryAnalysisFailureIT testFailsOnReadError failing #127029

Closed

bcully added a commit to bcully/elasticsearch that referenced this issue Apr 17, 2025

RepositoryAnalysisFailureIT: disrupt earlier

88b3efc

The fix to elastic#126747 was only for one test. This applies that change to all the tests in this suite that need it.

bcully added a commit to bcully/elasticsearch that referenced this issue Apr 17, 2025

RepositoryAnalysisFailureIT: disrupt earlier

d2339da

Fixes elastic#127029 The fix to elastic#126747 was only for one test. This applies that change to all the tests in this suite that need it.

bcully mentioned this issue Apr 17, 2025

RepositoryAnalysisFailureIT: disrupt earlier #127032

Merged

bcully added a commit that referenced this issue Apr 18, 2025

RepositoryAnalysisFailureIT: disrupt earlier (#127032)

26254e3

The fix to #126747 was only for one test. This applies that change to all the tests in this suite that need it. Fixes #127029
