[CI] SimpleBlocksIT testConcurrentAddBlock failing #122324

Open
elasticsearchmachine opened this issue Feb 11, 2025 · 8 comments · May be fixed by #126918
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. medium-risk An open issue or test failure that is a medium risk to future releases Team:Distributed Indexing Meta label for Distributed Indexing team >test-failure Triaged test failures from CI

Comments

@elasticsearchmachine
Collaborator

Build Scans:

Reproduction Line:

./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.blocks.SimpleBlocksIT.testConcurrentAddBlock" -Dtests.seed=29531456CD95A41C -Dtests.locale=ff-Latn-BF -Dtests.timezone=CET -Druntime.java=23

Applicable branches:
main

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: java.util.concurrent.ExecutionException: java.lang.AssertionError: 
Expected: is <true>
     but: was <false>

Issue Reasons:

  • [main] 2 failures in test testConcurrentAddBlock (1.1% fail rate in 178 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

@elasticsearchmachine elasticsearchmachine added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI labels Feb 11, 2025
@elasticsearchmachine
Collaborator Author

This has been muted on branch main

Mute Reasons:

  • [main] 2 failures in test testConcurrentAddBlock (1.1% fail rate in 178 executions)

Build Scans:

@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Feb 11, 2025
@ldematte ldematte added :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. and removed :Core/Infra/Core Core issues without another label labels Feb 12, 2025
@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine elasticsearchmachine added Team:Distributed Coordination Meta label for Distributed Coordination team and removed Team:Core/Infra Meta label for core/infra team labels Feb 12, 2025
@ldematte
Contributor

I think that blocks could be an area covered by distributed, but please send it back if I'm wrong.

@nicktindall
Contributor

The fingerprints on that test point to us being owners. We'll take it.

@nicktindall nicktindall added medium-risk An open issue or test failure that is a medium risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Feb 13, 2025
@nicktindall
Contributor

We apparently are able to add a block and then the cluster state indicates we don't have it. Seems potentially dangerous.

@pxsalehi pxsalehi self-assigned this Feb 13, 2025
@pxsalehi
Member

I have gotten one more failure out of this: https://gradle-enterprise.elastic.co/s/kmqodxa5xlous. To me this seems related to the recent work on N-2 support, especially the changes made in #119743. As I'm not up to date on that work, I'll relabel this for the indexing team.

@pxsalehi pxsalehi added :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Coordination Meta label for Distributed Coordination team labels Feb 18, 2025
@pxsalehi pxsalehi removed their assignment Feb 18, 2025
@pxsalehi pxsalehi added the Team:Distributed Indexing Meta label for Distributed Indexing team label Feb 18, 2025
@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@arteam arteam self-assigned this Feb 19, 2025
arteam added a commit to arteam/elasticsearch that referenced this issue Feb 20, 2025
@arteam arteam removed their assignment Apr 15, 2025
arteam added a commit to arteam/elasticsearch that referenced this issue Apr 16, 2025
Only one of the concurrent `prepareAddBlock` calls actually wins the race to
add the block, so we need to check `assertIndexHasBlock` only if
the add index block request has been acknowledged

Resolve elastic#122324
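
The pattern the commit message describes can be sketched without any Elasticsearch code. The following is only a toy model under stated assumptions: `ToyIndex`, `addBlock`, `hasBlock`, and `run` are hypothetical names invented for this illustration, and the compare-and-set race is a stand-in for whatever actually decides the winner among concurrent `prepareAddBlock` calls. It shows several threads racing to add a block, with the "has block" assertion applied only by the caller whose request was acknowledged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

class ConcurrentAddBlockSketch {

    /** Toy stand-in for an index (hypothetical; NOT Elasticsearch code). */
    static final class ToyIndex {
        private final AtomicBoolean claimed = new AtomicBoolean(false);
        private volatile boolean blocked = false;

        /** Returns true ("acknowledged") only for the single caller that wins the race. */
        boolean addBlock() {
            if (!claimed.compareAndSet(false, true)) {
                // Lost the race: the winner may not have published the block yet.
                return false;
            }
            blocked = true;
            return true;
        }

        boolean hasBlock() {
            return blocked;
        }
    }

    /** Races the given number of threads and returns how many calls were acknowledged. */
    static int run(int threads) {
        ToyIndex index = new ToyIndex();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(() -> {
                    start.await(); // release all threads at once to provoke the race
                    boolean acknowledged = index.addBlock();
                    // Assert the block is present only when this call was acknowledged.
                    if (acknowledged && !index.hasBlock()) {
                        throw new AssertionError("acknowledged but block missing");
                    }
                    return acknowledged;
                }));
            }
            start.countDown();
            int acknowledgedCount = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    acknowledgedCount++;
                }
            }
            return acknowledgedCount;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this toy model, calling `ConcurrentAddBlockSketch.run(8)` returns 1, because exactly one compare-and-set can succeed. An unconditional `hasBlock()` assertion in every thread could fail intermittently for a losing thread that checks before the winner has set the flag, which is the kind of low-rate flakiness an acknowledgement guard avoids. Whether the real test failure has this exact cause is not established in this thread.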
@arteam arteam linked a pull request Apr 16, 2025 that will close this issue
5 participants