TestBGPAgentRUD / TestBGPAgentCRUD fails regularly #3380

Closed
stephenfin opened this issue May 14, 2025 · 2 comments

Comments

@stephenfin
Contributor

We often see this test fail in CI due to a timeout while polling for a speaker to be removed from the agent. For example, see the logs below from this run. This is almost certainly a bug in neutron. Reporting here so that we have somewhere to track this.


+ /home/runner/work/_temp/cd18f9bc-181b-4e1f-a45e-a2b00573cdca.sh:main:2 :   make acceptance-networking
go run gotest.tools/gotestsum@latest --format testname -- -timeout "60m" -tags "fixtures acceptance" ./internal/acceptance/openstack/networking/...

... {snipped} ...

=== Failed
=== FAIL: internal/acceptance/openstack/networking/v2/extensions/agents TestBGPAgentRUD (906.38s)
    agents_test.go:116: Retrieved BGP agents
    tools.go:84: [
          {
            "id": "d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3",
            "admin_state_up": true,
            "agent_type": "BGP dynamic routing agent",
            "alive": true,
            "resources_synced": false,
            "availability_zone": "",
            "binary": "neutron-bgp-dragent",
            "configurations": {
              "advertise_routes": 0,
              "bgp_peers": 0,
              "bgp_speakers": 0
            },
            "description": "",
            "host": "pkrvmwou917u6c3",
            "topic": "bgp_dragent"
          }
        ]
    speakers.go:24: Attempting to create BGP Speaker: TESTACC-BGPSPEAKER-Jh3LAWZa
    speakers.go:36: Successfully created BGP Speaker
    tools.go:84: {
          "id": "0413078d-9a86-439c-a71c-029ebafbf92c",
          "name": "TESTACC-BGPSPEAKER-Jh3LAWZa",
          "tenant_id": "3a2620d2f7c74208abee541789270688",
          "project_id": "3a2620d2f7c74208abee541789270688",
          "advertise_floating_ip_host_routes": false,
          "advertise_tenant_networks": true,
          "ip_version": 4,
          "local_as": 3000,
          "networks": [],
          "peers": []
        }
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:133: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been scheduled to agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:148: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:158: Speakers: 0413078d-9a86-439c-a71c-029ebafbf92c
    agents_test.go:163: BGP Speaker 0413078d-9a86-439c-a71c-029ebafbf92c has been removed from agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    ... {snipping 889 duplicated lines} ...
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:170: Agent d8a42974-c0df-4b7b-a3e0-e40b14b2a9f3 has 1 speaker(s)
    agents_test.go:173: Failure in agents_test.go, line 173: unexpected error "context deadline exceeded"

DONE 97 tests, 11 skipped, 1 failure in 906.554s
make: *** [Makefile:101: acceptance-networking] Error 1
@stephenfin
Contributor Author

stephenfin commented May 14, 2025

I discussed this on IRC. From the sounds of it, the test is wrong: we should either (a) use the static scheduler and create the association manually or (b) use the chance (dynamic) scheduler and not delete the association. I'm guessing this only ever passed when we won the race between us checking that the association had been deleted and the scheduler recreating it. I have pushed a docs fix to neutron's api-ref to reflect this (https://review.opendev.org/c/openstack/neutron-lib/+/949744). I'll work on a fix here now.

stephenfin added a commit to shiftstack/gophercloud that referenced this issue May 14, 2025
Change our deployment so that we use the static scheduler, and rework
the test to handle this. Instead of waiting for the speaker to be
associated with an agent (which won't happen with the static scheduler)
we now jump straight to assigning it.

Signed-off-by: Stephen Finucane <[email protected]>
Closes: gophercloud#3380
@osfrickler

If there is only a single agent, I cannot think of a useful scenario where a speaker would not be scheduled to it, so what your test is doing is a bit weird in practical terms. A more realistic scenario would be having two agents, a speaker that is originally scheduled to agent1, and a test that adds the speaker to agent2 and then removes it from agent1. This should work independently of the scheduler being used, but you'd need to set up a multinode devstack deployment for it to work.
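
The two-agent sequence described above can be sketched with a small in-memory model. Everything here is hypothetical (`cluster`, `moveSpeaker`, `scheduleIfUnhosted`), and it rests on one assumption that is not confirmed in this thread: that an automatic scheduler only acts on speakers that no agent currently hosts. If that holds, adding to agent2 before removing from agent1 never leaves the speaker unhosted, so the scheduler has nothing to do.

```go
package main

import "fmt"

// cluster is a hypothetical model mapping an agent ID to the set of
// speaker IDs hosted on that agent.
type cluster map[string]map[string]bool

func (c cluster) add(agent, speaker string) {
	if c[agent] == nil {
		c[agent] = map[string]bool{}
	}
	c[agent][speaker] = true
}

func (c cluster) remove(agent, speaker string) { delete(c[agent], speaker) }

// hosted reports whether any agent currently hosts the speaker.
func (c cluster) hosted(speaker string) bool {
	for _, speakers := range c {
		if speakers[speaker] {
			return true
		}
	}
	return false
}

// moveSpeaker adds the speaker to the target agent first and only then
// removes it from the source agent, so it is hosted at every step.
func moveSpeaker(c cluster, speaker, from, to string) error {
	if !c[from][speaker] {
		return fmt.Errorf("speaker %s is not hosted on agent %s", speaker, from)
	}
	c.add(to, speaker)
	c.remove(from, speaker)
	return nil
}

// scheduleIfUnhosted is an assumed model of an automatic scheduler: it
// places the speaker on the fallback agent only if no agent hosts it.
// It returns true when it changed anything.
func scheduleIfUnhosted(c cluster, speaker, fallback string) bool {
	if c.hosted(speaker) {
		return false
	}
	c.add(fallback, speaker)
	return true
}

func main() {
	c := cluster{}
	c.add("agent1", "speaker-1")
	if err := moveSpeaker(c, "speaker-1", "agent1", "agent2"); err != nil {
		fmt.Println("move failed:", err)
		return
	}
	fmt.Println("on agent1:", c["agent1"]["speaker-1"])
	fmt.Println("on agent2:", c["agent2"]["speaker-1"])
	fmt.Println("scheduler acted:", scheduleIfUnhosted(c, "speaker-1", "agent1"))
}
```

Contrast this with the single-agent case, where removing the only association leaves the speaker unhosted and, in this model, immediately eligible for rescheduling.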

stephenfin added a commit to shiftstack/gophercloud that referenced this issue May 16, 2025
Change our deployment so that we use the static scheduler, and rework
the test to handle this. Instead of waiting for the speaker to be
associated with an agent (which won't happen with the static scheduler)
we now jump straight to assigning it.

Signed-off-by: Stephen Finucane <[email protected]>
Closes: gophercloud#3380
(cherry picked from commit 669a870)