Skip to content

Fix AsyncSearchActionIT tests #127010

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Apr 17, 2025

Missed a spot here when moving this to delayed deserialization, we can leak pending batch results here on exceptions.
Just making this poll from a queue instead of nulling out in a list to keep it simple.

I'll look into a follow-up to clean up the horrible patterns used in QueryPhaseResultConsumer that keep bringing this kind of failure up over and over. Mutable collections like that are just painful to get right, this can be solved cleaner with our futures framework these days.

closes #126994
closes #126995
closes #126975
closes #126999
closes #127001
closes #126974
closes #127008

Missed a spot here when moving this to delayed deserialization, we can leak pending batch results here on exceptions.

closes elastic#126994
closes elastic#126995
closes elastic#126975
closes elastic#126999
closes elastic#127001
closes elastic#126974
closes elastic#127008
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Apr 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@smalyshev
Copy link
Contributor

I'm wondering about something here - with this bug, the tests fail in 1% of cases. Can we make a test that would always fail for this bug? So that we could rely on CI to always catch it in time, not post-factum.

@original-brownbear
Copy link
Member Author

@smalyshev it's hard to make it fail deterministically since it's such a small number of tests in this run and we aren't guaranteed to have a GC run before the test finishes.
One possible option to make this fail more reliably would be to reduce the heap size for the suite, since we don't actually have enouhg GC for the leak tracker to run that should be possible. I think I might have a globally applicable idea here :)

@original-brownbear original-brownbear enabled auto-merge (squash) April 17, 2025 16:01
@original-brownbear
Copy link
Member Author

Thanks Stas!

@original-brownbear original-brownbear merged commit 5a3c9e7 into elastic:main Apr 17, 2025
16 of 17 checks passed
@original-brownbear original-brownbear deleted the fix-async-search-its branch April 17, 2025 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment