Skip to content

Conversation

smklein
Copy link
Collaborator

@smklein smklein commented Feb 19, 2025

Marking an instance deleted, within the "instance deletion saga", invokes datastore.project_delete_instance.

This API performs the following tasks:

  1. Marks the instance record "Destroyed", and detaches all disks
  2. Performs a variety of other cleanup tasks (deleting SSH keys, not-yet-but-soon cleaning up affinity groups, marking migrations as deleted).

These operations are not in a transaction, so it's possible for step (1) to succeed, but step (2) to not happen.
In this case, it's critical that a second invocation of this API be able to actually perform cleanup.

This PR adjusts the implementation of project_delete_instance to make this actually true.

@smklein smklein changed the title Safer instance deletion [nexus] Make Instance Deletion actually idempotent Feb 19, 2025
@smklein smklein changed the title [nexus] Make Instance Deletion actually idempotent [9/5] [nexus] Make Instance Deletion actually idempotent Feb 19, 2025
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch with this one! The new code feels substantially less sketchy to me.

I left a few nitpicky comments, but I have no major concerns about the change and would like to see this landed! I notice that this is currently stacked on top of your affinity work --- would it make sense to base this branch on main instead and land the fix before the affinity stuff?

@smklein smklein marked this pull request as ready for review February 19, 2025 18:38
@smklein smklein force-pushed the safer-instance-deletion branch from c428650 to 83516c8 Compare February 19, 2025 18:59
@smklein smklein changed the base branch from vmm-reduce-contention to main February 19, 2025 18:59
@smklein smklein changed the title [9/5] [nexus] Make Instance Deletion actually idempotent [nexus] Make Instance Deletion actually idempotent Feb 19, 2025
@smklein
Copy link
Collaborator Author

smklein commented Feb 19, 2025

I left a few nitpicky comments, but I have no major concerns about the change and would like to see this landed! I notice that this is currently stacked on top of your affinity work --- would it make sense to base this branch on main instead and land the fix before the affinity stuff?

Sure thing, pulled it out, will merge first

@smklein smklein enabled auto-merge (squash) February 19, 2025 19:04
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new commentary looks good to me!

@smklein smklein merged commit ee1b02e into main Feb 20, 2025
16 checks passed
@smklein smklein deleted the safer-instance-deletion branch February 20, 2025 01:42
hawkw pushed a commit that referenced this pull request Feb 21, 2025
Marking an instance deleted, within the "instance deletion saga",
invokes `datastore.project_delete_instance`.

This API performs the following tasks:
1. Marks the instance record "Destroyed", and detaches all disks
2. Performs a variety of other cleanup tasks (deleting SSH keys,
not-yet-but-soon cleaning up affinity groups, marking migrations as
deleted).

These operations are not in a transaction, so it's possible for step (1)
to succeed, but step (2) to not happen.
In this case, it's critical that a second invocation of this API be able
to actually perform cleanup.

This PR adjusts the implementation of `project_delete_instance` to make
this actually true.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants