Fix export of all quantized transformer models #1654
Merged
Our current transformers export-to-onnx pipeline doesn't work for quantized models. To reproduce: try to export any quantized model with the latest versions of sparseml and neuralmagic/transformers.
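For context, a minimal reproduction sketch (not taken from this PR): the import path `sparseml.transformers.export.export_transformer_to_onnx`, the task name, and the checkpoint path below are assumptions about the SparseML 1.x export API, used only for illustration.

```python
# Hypothetical reproduction sketch -- the import path, task name, and checkpoint
# path are assumptions for illustration, not taken from this PR.
from sparseml.transformers.export import export_transformer_to_onnx

# Before this fix, exporting any quantized transformer checkpoint is expected
# to fail while the trainer reloads the model state.
onnx_path = export_transformer_to_onnx(
    task="question-answering",          # placeholder task
    model_path="./my-quantized-model",  # placeholder quantized checkpoint directory
)
print(onnx_path)
```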
There are two reasons:
1. `resolved_archive_file=[]` at sparseml/src/sparseml/transformers/sparsification/trainer.py (line 690 in b73a173) should be `None` instead of `[]`.
2. `self.model._load_pretrained_model` from transformers returns 6 items (see https://github.com/neuralmagic/transformers/blob/0798c9e3b743a7e5c552f943a1a7d52ff63bbffb/src/transformers/modeling_utils.py#L3323), while our interface at sparseml/src/sparseml/transformers/sparsification/trainer.py (line 686 in b73a173) only accepts 5.
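Taken together, the two points imply a change along these lines at the `_load_pretrained_model` call site. This is only a sketch: the wrapper function and the keyword arguments other than `resolved_archive_file` (`state_dict`, `loaded_keys`, `load_path`, `_fast_init`) are assumptions for illustration, not the actual diff.

```python
def _reload_model_state_sketch(model, state_dict, loaded_keys, load_path):
    """Hypothetical wrapper mirroring the trainer.py call site; not the PR diff."""
    (
        _model,
        missing_keys,
        unexpected_keys,
        mismatched_keys,
        _,  # extra return values: newer transformers returns 6 items, not 5
        _,
    ) = model._load_pretrained_model(
        model=model,
        state_dict=state_dict,
        loaded_keys=loaded_keys,
        # None rather than [], so transformers does not index into an empty list
        resolved_archive_file=None,
        pretrained_model_name_or_path=load_path,
        _fast_init=False,
    )
    return missing_keys, unexpected_keys, mismatched_keys
```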