Allow max shard size to be specified when saving pipeline #9440
What does this PR do?
When using `.save_pretrained` to save the different modeling components, we currently have no way to specify a maximum shard size. Being able to store smaller shards for all modeling components, without saving each component individually, would be a useful control to have. This is particularly relevant for CogVideoX, where keeping both the text encoder and the transformer in 10GB shards results in an OOM on a Colab CPU. Saving smaller shards makes it possible to load the components, create the pipeline, and then run inference with something like `enable_sequential_cpu_offload`.
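To make the intent concrete, here is a minimal sketch of how this could be used once the pipeline's `save_pretrained` accepts a maximum shard size. The `max_shard_size` keyword and the `"5GB"` value are illustrative assumptions; the exact name and default are defined by the PR itself.

```python
import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.bfloat16
)

# Save every modeling component (text encoder, transformer, VAE, ...) in
# shards no larger than 5GB so they can be reloaded on low-RAM machines.
# `max_shard_size` here is the argument this PR proposes to expose.
pipe.save_pretrained("cogvideox-small-shards", max_shard_size="5GB")

# Reload from the smaller shards and offload submodules sequentially to
# keep peak memory low during inference.
pipe = CogVideoXPipeline.from_pretrained(
    "cogvideox-small-shards", torch_dtype=torch.bfloat16
)
pipe.enable_sequential_cpu_offload()
```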
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu @sayakpaul