Current Behavior
Usually crates.io refuses to change existing releases. However, if a crate is deleted (by admins), it's possible to reclaim the name and publish the same version of the same crate with new content.
This means that the index (apart from yanking) isn't truly immutable/append-only. A given (crate_name, version) may have different content at different times.
This edge case creates problems:
It complicates registry caching/mirroring. I can't assume that each crate version is published only once. Tarballs are downloaded by name and version, but must match the checksum in the index (BTW, static.crates.io sends cache-control: public,max-age=31536000,immutable, which isn't correct!). Properly supporting this edge case requires the ability to purge caches.
Similarly, it complicates data analysis. I can't assume that data derived from a release won't change, so I need the ability to update previously processed data. I had assumed that the registry data is append-only and that crate tarballs can't change, and ended up with inconsistent data on https://lib.rs for republished crates.
It prevents implementing extra TOFU-like security for clients. I wanted to add an extra layer of security to my crates.io mirror (and propose a similar policy in Cargo) enforcing that checksums of published crates must never change. This would ensure that existing releases couldn't be modified even if crates.io itself were hacked, and that any attempt to do so would be detected and raise an alarm (I know crates.io is working on a proper security model, but this would have been an improvement without protocol changes). Unfortunately, delete + republish can legitimately change the content of old crates, and that exception makes checksums effectively mutable.
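To make the TOFU idea concrete, here is a minimal sketch of what such a client-side check could look like. The `PinStore` type, its `check` method, and the crate name and checksum strings are hypothetical illustrations, not part of Cargo or crates.io. A real mirror would persist the pins rather than keep them in memory.

```rust
use std::collections::HashMap;

/// Outcome of comparing a checksum from the index with the pinned one.
#[derive(Debug, PartialEq)]
enum PinCheck {
    /// First time this (name, version) is seen; its checksum is now pinned.
    FirstSeen,
    /// The checksum matches the one pinned earlier.
    Unchanged,
    /// The checksum differs from the pinned one: this should raise an alarm.
    Changed { pinned: String },
}

/// Hypothetical trust-on-first-use store keyed by (crate name, version).
#[derive(Default)]
struct PinStore {
    pins: HashMap<(String, String), String>,
}

impl PinStore {
    /// Pins the checksum on first sight and reports any later change.
    /// A changed checksum never overwrites the original pin.
    fn check(&mut self, name: &str, version: &str, checksum: &str) -> PinCheck {
        let key = (name.to_owned(), version.to_owned());
        if let Some(pinned) = self.pins.get(&key) {
            if pinned == checksum {
                return PinCheck::Unchanged;
            }
            return PinCheck::Changed { pinned: pinned.clone() };
        }
        self.pins.insert(key, checksum.to_owned());
        PinCheck::FirstSeen
    }
}

fn main() {
    let mut store = PinStore::default();
    // Placeholder checksums; real ones would be SHA-256 hex strings from the index.
    assert_eq!(store.check("example", "0.1.0", "aaaa"), PinCheck::FirstSeen);
    assert_eq!(store.check("example", "0.1.0", "aaaa"), PinCheck::Unchanged);
    // A delete + republish with new content is indistinguishable from tampering.
    assert_eq!(
        store.check("example", "0.1.0", "bbbb"),
        PinCheck::Changed { pinned: "aaaa".to_owned() }
    );
}
```

With the current behavior, the last case fires for legitimate republishes too, which is why the check can't be enforced as-is.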
Expected Behavior
Would it be possible for crates.io to make checksums always immutable?
I realize this is a complication for anti-spam actions, and ends up "wasting" potentially nice versions on spam. Possible solutions:
Keep deleting crates if needed, but also keep an extra table server-side with (name, version) => checksum mapping that never gets removed. Refuse to publish the same (name, version) if the checksum differs. If a new owner hits this case, tell them to bump the version. I think that's the best, least disruptive solution.
Yank all existing versions, remove all owners, but don't delete the crates. Display these crates as non-existent/deleted on the web front-end, but keep versions and their checksums intact in the index.
I've previously suggested that if crates.io were to give abandoned or squatted crates to new owners, it could forbid the new owners from publishing versions that are semver-compatible with the old ones, so that they can't automatically affect any existing users. This would need a concept of a minimum version that can be published, set to semver-major+1 of the previous owner's versions. The same logic could be reused for deleted and reclaimed crates, although it's not ideal in the case of spam.
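As a rough illustration of the first option (a permanent (name, version) => checksum table), the sketch below models the publish and delete paths. The `Registry` type, its fields, and the error names are assumptions made up for this example and say nothing about how crates.io actually stores its data.

```rust
use std::collections::{HashMap, HashSet};

/// (crate name, version)
type Release = (String, String);

#[derive(Debug, PartialEq)]
enum PublishError {
    /// The release is currently live and can't be replaced.
    AlreadyPublished,
    /// The release existed before with different content; bump the version.
    ChecksumMismatch,
}

/// Hypothetical in-memory model of the registry.
#[derive(Default)]
struct Registry {
    /// Releases currently visible; removed when a crate is deleted.
    live: HashSet<Release>,
    /// Every checksum ever accepted; never removed, even on deletion.
    ledger: HashMap<Release, String>,
}

impl Registry {
    fn publish(&mut self, name: &str, version: &str, checksum: &str) -> Result<(), PublishError> {
        let key = (name.to_owned(), version.to_owned());
        if self.live.contains(&key) {
            return Err(PublishError::AlreadyPublished);
        }
        // The ledger outlives deletion, so an old checksum still binds here.
        if let Some(recorded) = self.ledger.get(&key) {
            if recorded != checksum {
                return Err(PublishError::ChecksumMismatch);
            }
        }
        self.ledger.insert(key.clone(), checksum.to_owned());
        self.live.insert(key);
        Ok(())
    }

    /// Admin deletion: removes the crate's releases but keeps the ledger.
    fn delete_crate(&mut self, name: &str) {
        self.live.retain(|(n, _)| n != name);
    }
}

fn main() {
    let mut registry = Registry::default();
    assert_eq!(registry.publish("example", "1.0.0", "aaaa"), Ok(()));
    registry.delete_crate("example");

    // A new owner can't reuse 1.0.0 with different content...
    assert_eq!(
        registry.publish("example", "1.0.0", "bbbb"),
        Err(PublishError::ChecksumMismatch)
    );
    // ...but can publish the same content under a bumped version.
    assert_eq!(registry.publish("example", "1.0.1", "bbbb"), Ok(()));
}
```

Republishing byte-identical content under the old version is allowed in this sketch, since the checksum seen by mirrors and clients stays the same.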
Steps To Reproduce
Environment
N/A
Anything else?
The checksum changes happened quite a few times: