Description
Nomad clients running on macOS do not resume normal task execution after the Nomad server is restarted, even though they reconnect and appear ready. In contrast, Ubuntu clients reconnect and continue accepting jobs as expected.
Nomad version
Nomad v1.6.2
BuildDate 2023-09-13T16:47:25Z
Revision 73e372a
Operating system and Environment details
ProductName: macOS
ProductVersion: 14.5
BuildVersion: 23F79
Issue
I'm running the Nomad server as a Kubernetes StatefulSet. The clients run on a pool of machines (macOS servers).
Reproduction steps
- Restart the nomad server
- Schedule a new job on the server
- The clients reconnect to the server, but the new allocations stay in the pending state.
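The steps above could be scripted roughly as follows. This is only a sketch under assumptions: the issue does not say how the server is restarted, so `kubectl rollout restart` on the StatefulSet is a guess, and the StatefulSet name, job file, and job ID passed in are hypothetical placeholders.

```python
import subprocess


def reproduce(statefulset, job_file, job_id, run=subprocess.run):
    """Run the reproduction steps in order and return the commands executed.

    All three arguments are placeholders; substitute the real names.
    `run` is injectable so the sequence can be checked without a cluster.
    """
    commands = [
        # Step 1: restart the Nomad server pods (assumed restart method).
        ["kubectl", "rollout", "restart", f"statefulset/{statefulset}"],
        # Wait until the restarted pods are rolled out again.
        ["kubectl", "rollout", "status", f"statefulset/{statefulset}"],
        # Step 2: schedule a new job without blocking on the deployment.
        ["nomad", "job", "run", "-detach", job_file],
        # Step 3: inspect the job; on macOS clients the allocation stays pending.
        ["nomad", "job", "status", job_id],
    ]
    for cmd in commands:
        run(cmd, check=True)
    return commands


# Example call (hypothetical names):
# reproduce("nomad-server", "job.nomad.hcl", "example-job")
```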
Expected Result
- When the Nomad server restarts, all connected Nomad clients (Linux/macOS) should reconnect.
- Jobs submitted after server restart should be accepted and placed on any eligible client.
- Client nodes in ready state should be able to run new tasks.
Actual Result
- After restarting the Nomad server, the macOS Nomad client:
  - Reconnects to the server
  - Shows up as ready and eligible
  - But jobs submitted afterward remain in pending state when placed on the macOS client
- The same job runs fine when targeted to a Linux (Ubuntu) client.
Restarting the Nomad client process on macOS immediately fixes the issue — jobs get placed and run correctly.
Logs:
nomad node status on macOS shows:
Status: ready
Eligibility: eligible
Allocated Resources: 0
Driver Status: raw_exec
Allocation remains pending (nomad job status destroy_job_tafbw3ybdhpwzfaq1aep):
ID = destroy_job_tafbw3ybdhpwzfaq1aep
Name = 2cZmM4V7xTTQcugQlT1c
Submit Date = 2025-06-19T23:08:51-07:00
Type = batch
Priority = 50
Datacenters = dc1
Namespace = default
Node Pool = <none>
Status = running
Periodic = false
Parameterized = false
Summary
Task Group                              Queued  Starting  Running  Failed  Complete  Lost  Unknown
delete_task_group_tafbw3ybdhpwzfaq1aep  0       1         0        0       0         0     0
Allocations
ID        Node ID   Task Group                              Version  Desired  Status   Created   Modified
cb09f4cb  5e777097  delete_task_group_tafbw3ybdhpwzfaq1aep  4        run      pending  1h2m ago  49m51s ago
nomad alloc status cb09f4cb
ID = cb09f4cb-f01d-c1ce-becc-711b5ace0b6d
Eval ID = df69462a
Name = destroy_job_tafbw3ybdhpwzfaq1aep.delete_task_group_tafbw3ybdhpwzfaq1aep[0]
Node ID = 5e777097
Node Name = 67604.local
Job ID = destroy_job_tafbw3ybdhpwzfaq1aep
Job Version = 4
Client Status = pending
Client Description = <none>
Desired Status = run
Desired Description = <none>
Created = 1h3m ago
Modified = 50m33s ago
Couldn't retrieve stats: Unexpected response code: 404 (rpc error: Unknown allocation "cb09f4cb-f01d-c1ce-becc-711b5ace0b6d")
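To spot affected allocations across several jobs, the Allocations table printed by nomad job status can be scanned for rows that are desired "run" but still "pending". The helper below is a minimal sketch, not part of Nomad: the function name is made up, and it assumes the short (non-verbose) column layout shown above and task group names without spaces.

```python
def pending_allocations(job_status_output):
    """Return allocations with Desired=run and Status=pending.

    Expects the text printed by `nomad job status <job>` in its short form,
    with columns: ID, Node ID, Task Group, Version, Desired, Status,
    Created, Modified.
    """
    pending = []
    in_allocations = False
    for line in job_status_output.splitlines():
        fields = line.split()
        if fields == ["Allocations"]:
            in_allocations = True  # rows after this heading are allocations
            continue
        # Skip everything before the table, blank lines, and the header row.
        if not in_allocations or len(fields) < 6 or fields[0] == "ID":
            continue
        alloc_id, node_id, _group, _version, desired, status = fields[:6]
        if desired == "run" and status == "pending":
            pending.append({
                "alloc": alloc_id,
                "node": node_id,
                "created": " ".join(fields[6:8]),  # e.g. "1h2m ago"
            })
    return pending


sample = """\
Allocations
ID        Node ID   Task Group                              Version  Desired  Status   Created   Modified
cb09f4cb  5e777097  delete_task_group_tafbw3ybdhpwzfaq1aep  4        run      pending  1h2m ago  49m51s ago
"""
print(pending_allocations(sample))
```

Any node ID this returns while nomad node status reports the node as ready and eligible matches the symptom described in this issue.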