want sled-agent timeout on completion of requests to stop a Propolis VM #4004

@gjcolombo

Description

Today, a request to stop a running instance necessarily involves the instance's active Propolis: sled agent sends the stop request to Propolis, and the instance is considered stopped only when Propolis says so, at which point sled agent cleans up the Propolis zone and all related objects. If an instance's Propolis is not responding, or there is no active Propolis, there is no obvious way to clean up the instance and recover.

A short-term workaround is to grant some form of API access to sled agent's "unregister instance" API, which forcibly executes the termination path (tearing down the Propolis zone and removing the instance from the sled's instance table) and can force an instance into a stopped state.
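To make the requested behavior concrete, here is a minimal sketch of a stop path that waits a bounded time for Propolis and then falls back to forced termination. It is not sled agent's actual code: `StopOutcome`, `stop_instance`, the channel standing in for Propolis's "stopped" notification, and the `force_terminate` callback standing in for the "unregister instance" path are all hypothetical names chosen for illustration, and it uses only the Rust standard library rather than whatever async runtime sled agent uses.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// How a stop request ended (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum StopOutcome {
    /// Propolis reported the instance stopped before the deadline.
    StoppedByPropolis,
    /// Propolis did not answer in time; the forced path was run.
    ForcedAfterTimeout,
    /// There was no Propolis to answer at all; the forced path was run.
    ForcedNoPropolis,
}

/// Wait up to `timeout` for Propolis to report "stopped" on `stopped_rx`.
/// If it does not, run `force_terminate`, which stands in for tearing down
/// the Propolis zone and removing the instance from the sled's instance table.
fn stop_instance(
    stopped_rx: &Receiver<()>,
    timeout: Duration,
    force_terminate: impl FnOnce(),
) -> StopOutcome {
    match stopped_rx.recv_timeout(timeout) {
        // Normal path: Propolis confirmed the stop.
        Ok(()) => StopOutcome::StoppedByPropolis,
        // Propolis is alive (the sender still exists) but unresponsive.
        Err(RecvTimeoutError::Timeout) => {
            force_terminate();
            StopOutcome::ForcedAfterTimeout
        }
        // The sending side is gone, modeling "no active Propolis".
        Err(RecvTimeoutError::Disconnected) => {
            force_terminate();
            StopOutcome::ForcedNoPropolis
        }
    }
}
```

In this sketch the caller chooses the timeout. What a reasonable value would be, and whether the fallback should run automatically or only on operator request, are left open by the issue.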

In the long run, instance lifecycle management needs to be made more robust to Propolis failure and/or non-responsiveness.

Metadata

Labels

Sled Agent: Related to the Per-Sled Configuration and Management
known issue: To include in customer documentation and training
nexus: Related to nexus
