-
Notifications
You must be signed in to change notification settings - Fork 58
Description
Today, requests to stop a running instance must by necessity involve the instance's active Propolis (sled agent sends the stop request to Propolis; the instance is stopped when Propolis says so, at which point sled agent cleans up the Propolis zone and all related objects). If an instance's Propolis is not responding, or there is no active Propolis, there is no obvious way to clean up the instance and recover.
A short-term workaround is to grant some form of API access to sled agent's "unregister instance" API, which forcibly executes the termination path (tearing down the Propolis zone and removing the instance from the sled's instance table) and can get force an instance into a stopped state.
In the long run instance lifecycle management needs to be made more robust to Propolis failure and/or non-responsiveness.