Fix export of all quantized transformer models #1654
Merged
Our current transformers export-to-onnx pipeline doesn't work for quantized models. To reproduce: try to export any quantized model with the latest versions of sparseml and neuralmagic/transformers.
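For context, a minimal reproduction sketch (not taken from this PR): the import path `sparseml.transformers.export.export_transformer_to_onnx`, the task name, and the checkpoint path below are assumptions about the SparseML 1.x export API, used only for illustration.

```python
# Hypothetical reproduction sketch -- the import path, task name, and checkpoint
# path are assumptions for illustration, not taken from this PR.
from sparseml.transformers.export import export_transformer_to_onnx

# Before this fix, exporting any quantized transformer checkpoint is expected
# to fail while the trainer reloads the model state.
onnx_path = export_transformer_to_onnx(
    task="question-answering",          # placeholder task
    model_path="./my-quantized-model",  # placeholder quantized checkpoint directory
)
print(onnx_path)
```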
There are two reasons:
1. `resolved_archive_file=[]` at sparseml/src/sparseml/transformers/sparsification/trainer.py (line 690 in b73a173) should be `None` instead of `[]`.
2. `self.model._load_pretrained_model` from transformers returns 6 items (see https://github.com/neuralmagic/transformers/blob/0798c9e3b743a7e5c552f943a1a7d52ff63bbffb/src/transformers/modeling_utils.py#L3323), while our interface at sparseml/src/sparseml/transformers/sparsification/trainer.py (line 686 in b73a173) only accepts 5.
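Taken together, the two points imply a change along these lines at the `_load_pretrained_model` call site. This is only a sketch: the wrapper function and the keyword arguments other than `resolved_archive_file` (`state_dict`, `loaded_keys`, `load_path`, `_fast_init`) are assumptions for illustration, not the actual diff.

```python
def _reload_model_state_sketch(model, state_dict, loaded_keys, load_path):
    """Hypothetical wrapper mirroring the trainer.py call site; not the PR diff."""
    (
        _model,
        missing_keys,
        unexpected_keys,
        mismatched_keys,
        _,  # extra return values: newer transformers returns 6 items, not 5
        _,
    ) = model._load_pretrained_model(
        model=model,
        state_dict=state_dict,
        loaded_keys=loaded_keys,
        # None rather than [], so transformers does not index into an empty list
        resolved_archive_file=None,
        pretrained_model_name_or_path=load_path,
        _fast_init=False,
    )
    return missing_keys, unexpected_keys, mismatched_keys
```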