Skip to content

streamer, tests: use the non-overlapping socket bind api #6895

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

puhtaytow
Copy link

@puhtaytow puhtaytow commented Jul 9, 2025

Problem

A bunch of tests still relies on either hardcoded ports or bind_to_localhost and bind_to_unspecified. This creates flaky tests in cases where unique port ranges are needed.

Related to #6886

Summary of Changes

Swapped to the new localhost_port_range_for_tests.

@0xbrw
Copy link

0xbrw commented Jul 9, 2025

@alexpyattaev or @apfitzge when you get a chance, can I have you take a look?

@apfitzge
Copy link

apfitzge commented Jul 9, 2025

It looks correct to me on skim, but it's a bit verbose to do this everywhere. Can we not have a bind_to_localhost_for_tests that just accesses some static behind a mutex that is lazily initialized with this iterator? That'd make a lot of these simple function changes instead of having like 4-5 extra lines per change.

@puhtaytow
Copy link
Author

puhtaytow commented Jul 10, 2025

It looks correct to me on skim, but it's a bit verbose to do this everywhere. Can we not have a bind_to_localhost_for_tests that just accesses some static behind a mutex that is lazily initialized with this iterator? That'd make a lot of these simple function changes instead of having like 4-5 extra lines per change.

I looked at the https://github.com/anza-xyz/agave/blob/master/net-utils/src/sockets.rs#L27-L45 and to me, the function does utilize the nextest env variable to coordinate the tests globally, there is also the atomic SLICE.. it looks like we could skip mutex in this case.

Obviously, correct me if im wrong 🙏

pub fn localhost_port_range_for_tests() -> (u16, u16) {
    static SLICE: AtomicU16 = AtomicU16::new(0);
    let offset = SLICE.fetch_add(20, Ordering::Relaxed);
    let start = offset
        + match std::env::var("NEXTEST_TEST_GLOBAL_SLOT") {
            Ok(slot) => {
                let slot: u16 = slot.parse().unwrap();
                assert!(
                    offset < SLICE_PER_PROCESS,
                    "Overrunning into the port range of another test! Consider using fewer ports per test."
                );
                BASE_PORT + slot * SLICE_PER_PROCESS
            }
            Err(_) => BASE_PORT,
        };
    assert!(start < u16::MAX - 20, "ran out of port numbers!");
    (start, start + 20)
}

To address the unnecessary verbosity, i introduced the helper function in net-utils as per your suggestion. https://github.com/anza-xyz/agave/compare/master...puhtaytow:agave:streamer-non-overlapping-socket-bind-api-1?expand=1

Please tell me if that is the correct place for it and if so then how would you like to merge the changes to net-utils, as separate PR?

Thank you for the suggestion 🙏 i do appreciate a lot any constructive input.

@alexpyattaev
Copy link

Please do not fix what is not broken =) localhost_port_range_for_tests uses both nextest env variables AND local atomics to coordinate port allocation because we do not know if it will run under cargo test or cargo nextest. cargo test uses 1 process for all tests => atomics work. cargo nextest uses process-per-test model, and thus we need to use env variables. One possible improvement for that function was suggested by t-nelson that would use shared memory instead of env variables for inter-process coordination, as that would allow the same mechanism to be used under both test runners.

Copy link

@alexpyattaev alexpyattaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us not create mutable iterators where direct access would achieve the same result=)

@puhtaytow
Copy link
Author

@alexpyattaev thank you for pointing these out / fixed 🙏

@puhtaytow puhtaytow requested a review from alexpyattaev July 10, 2025 12:27
@alexpyattaev alexpyattaev added the CI Pull Request is ready to enter CI label Jul 11, 2025
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Jul 11, 2025
Copy link

@alexpyattaev alexpyattaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thank you!

@puhtaytow
Copy link
Author

LGTM thank you!

Thank you! 🙏

@alexpyattaev alexpyattaev force-pushed the streamer-non-overlapping-socket-bind-api-0 branch from ecf24b2 to 08d3893 Compare July 14, 2025 07:00
@alexpyattaev alexpyattaev added the CI Pull Request is ready to enter CI label Jul 14, 2025
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Jul 14, 2025
@alexpyattaev
Copy link

This is currently blocked by #6957

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants