[new blog post] introducing auto-round #2826

wenhuach21 · 2025-04-23T02:01:25Z

Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

Add an entry to _blog.yml.
Add a thumbnail. There are no requirements here, but there is a template if it's helpful.
Check you use a short title and blog path.
Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
Ensure the publication date is correct.
Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish, this is just a way to do an early check.

Here is an example of a complete PR: #2382

Getting a Review

Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.

Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews
(e.g., check for proper metadata) rather than content reviews unless explicitly asked.

hshen14 · 2025-04-24T13:32:54Z

@IlyasMoutawwakil @SunMarc Please help review the PR. Thanks.

SunMarc · 2025-04-24T14:13:19Z

cc @MekkCyber

MekkCyber

Great blogpost 🔥 !

Small nit: for images other than the thumbnail, it's better to keep them here: https://huggingface.co/datasets/huggingface/documentation-images. You can open a PR, add them, and ping me or someone else to merge it.

Example of how to use the images after adding them : https://github.com/huggingface/blog/blob/main/1_58_llm_extreme_quantization.md

MekkCyber · 2025-04-24T14:14:37Z

_blog.yml

+  date: April 23, 2025
+  tags:
+   - llms
+   - inference
+   - quantization
+   - intel


reminder to change the date before publishing

assets/autoround/int2.png

autoround.md

IlyasMoutawwakil

Looks good !
There are many unnecessary line breaks (in the middle of sentences), probably from copy-pasting.

SunMarc

Thanks for this blogpost! One minor concern I have is that this is a bit too detailed at times and feels a bit too much like documentation. It could be interesting to make it a bit lighter (fewer snippets) and link to the transformers docs for more details. WDYT?

autoround.md

SunMarc · 2025-04-24T14:36:59Z

autoround.md

+# What is AutoRound?
+
+**AutoRound** is a weight-only post-training quantization (PTQ) method developed by Intel. It uses signed gradient
+descent to jointly optimize weight rounding and clipping ranges, enabling accurate low-bit quantization (e.g.,
+INT2 - INT8) with minimal accuracy loss in most scenarios. For example, at INT2, it outperforms popular baselines by up to **2.1x higher in relative accuracy**.
+
+Despite its strong performance, AutoRound is fast and lightweight — quantizing a 72B model takes just **37 minutes on an
+A100 GPU** under light mode. It also supports mixed-bit tuning, lm-head quantization, GPTQ/AWQ/GGUF format exporting, and flexible tuning recipes.


I feel like it could be a nice addition to give more details on how auto-round quantization works!

To explain the AutoRound algorithm in detail, some background and mathematical concepts need to be introduced.
To simplify this blog, I’ve included an image that provides an overview of the algorithm and recommend readers refer to our paper for more details.
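
For readers of this thread who want a rough feel for the idea before opening the paper, here is a minimal NumPy sketch of signed gradient descent applied to weight rounding. It is a simplified illustration under stated assumptions, not the auto-round library's implementation: it tunes only a per-weight rounding offset `v` in [-0.5, 0.5] for a single linear layer, uses a straight-through estimator for the rounding step, and leaves out the clipping-range tuning that the blog describes. The function name `sign_round_sketch` and its parameters (`bits`, `steps`, `lr`) are hypothetical.

```python
import numpy as np


def sign_round_sketch(w, x, bits=4, steps=200, lr=None):
    """Tune rounding offsets with signed gradient descent (illustrative only).

    w: (in_features, out_features) float weights of one linear layer.
    x: (n_samples, in_features) calibration inputs.
    Returns (best_v, scale, rtn_loss, tuned_loss).
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    # Per-output-channel scale from the max absolute weight (an assumption).
    scale = np.maximum(np.abs(w).max(axis=0, keepdims=True), 1e-12) / qmax
    lr = lr if lr is not None else 1.0 / steps  # hypothetical default
    target = x @ w  # full-precision layer output to reproduce

    v = np.zeros_like(w)  # v = 0 is plain round-to-nearest (RTN)
    best_v, best_loss, rtn_loss = v.copy(), np.inf, None
    for _ in range(steps):
        # Fake-quantize: round with the learned offset, clamp, rescale.
        w_q = scale * np.clip(np.round(w / scale + v), qmin, qmax)
        err = x @ w_q - target
        loss = float(np.mean(err ** 2))
        if rtn_loss is None:
            rtn_loss = loss  # first pass uses v = 0
        if loss < best_loss:
            best_loss, best_v = loss, v.copy()
        # Straight-through estimator: treat round() as identity, so
        # dL/dv is proportional to (x^T err) * scale. Only the sign is used.
        grad_v = (x.T @ err) * scale
        v = np.clip(v - lr * np.sign(grad_v), -0.5, 0.5)
    return best_v, scale, rtn_loss, best_loss


rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8))
x = rng.normal(size=(64, 16))
v, scale, rtn_loss, tuned_loss = sign_round_sketch(w, x)
print(f"RTN loss: {rtn_loss:.6f}, tuned loss: {tuned_loss:.6f}")
```

Because the loop keeps the best offsets seen so far and starts from `v = 0`, the tuned loss can never be worse than round-to-nearest on the calibration data; how much it improves depends on the data and the hyperparameters.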

Co-authored-by: Marc Sun <[email protected]>

wenhuach21 · 2025-04-25T07:09:06Z

Thanks for this blogpost! One minor concern I have is that this is a bit too detailed at times and feels a bit too much like documentation. It could be interesting to make it a bit lighter (fewer snippets) and link to the transformers docs for more details. WDYT?

I've removed some details. Please check whether any other parts need to be deleted or refined.

wenhuach21 · 2025-04-25T07:19:47Z

Great blogpost 🔥 !

Small nit: for images other than the thumbnail, it's better to keep them here: https://huggingface.co/datasets/huggingface/documentation-images. You can open a PR, add them, and ping me or someone else to merge it.

Example of how to use the images after adding them : https://github.com/huggingface/blog/blob/main/1_58_llm_extreme_quantization.md

Thanks, the PR is here. I will switch to these images after it is merged:
https://huggingface.co/datasets/huggingface/documentation-images/discussions/482

MekkCyber · 2025-04-25T07:31:16Z

Thanks @wenhuach21 ! Merged : https://huggingface.co/datasets/huggingface/documentation-images/tree/main/auto-round

wenhuach21 · 2025-04-25T09:32:12Z

Most issues have been addressed. Please review it again when you have a moment.

SunMarc

Thanks for iterating ! This is much better. Let us know when you want this merged !

autoround.md

wenhuach21 · 2025-04-29T01:35:31Z

Thanks for iterating ! This is much better. Let us know when you want this merged !

Everything is ready. Could you kindly merge it at your convenience? Thank you in advance!

hshen14 · 2025-04-29T08:25:24Z

Thanks for iterating ! This is much better. Let us know when you want this merged !

Everything is ready. Could you kindly merge it at your convenience? Thank you in advance!

@IlyasMoutawwakil @MekkCyber please review and approve. Thanks.

hshen14 · 2025-04-29T10:51:21Z

Thank you @wenhuach21 @SunMarc @IlyasMoutawwakil @MekkCyber !

julien-c · 2025-05-02T12:12:06Z

Don't forget to link authors to the Intel org like I've done in dc8c591.

This brings more visibility to the organization and displays the blogpost on the org page.

wenhuach21 and others added 10 commits April 18, 2025 15:32

update

f3ae34d

update

62a398d

tiny change

fa8474b

tiny change

26d73f3

Fix typos and update wording

a1009f7

Update the #tasks

2e7ae20

Update the device section

5d3d2af

update authors

5881ae4

add info in yml

5c7a857

add thumbnail

3f0dad0

MekkCyber reviewed Apr 24, 2025

IlyasMoutawwakil reviewed Apr 24, 2025

assets/autoround/int2.png (outdated)

IlyasMoutawwakil reviewed Apr 24, 2025

autoround.md (outdated)

IlyasMoutawwakil reviewed Apr 24, 2025

SunMarc reviewed Apr 24, 2025

wenhuach21 and others added 8 commits April 25, 2025 13:10

Update autoround.md

a0127ec

Co-authored-by: Marc Sun <[email protected]>

fix some issues

7ea8ea1

fix some issues

8fc63b3

fix splitting line issue

042e801

fix some issues

1e5f2b1

update a little

44e5a4f

add overview

f780b36

update

1b005e2

wenhuach21 added 2 commits April 25, 2025 15:46

switch image source

528de3d

resize image

2f0c243

tiny change

f88bbf3

SunMarc approved these changes Apr 25, 2025

kding1 suggested changes Apr 28, 2025

autoround.md (outdated)

autoround.md (outdated)

wenhuach21 added 2 commits April 29, 2025 09:28

change to intel gpu

224263a

Merge branch 'main' into main

b86078b

wenhuach21 added 2 commits April 29, 2025 11:19

change opt-125m to qwen3

b4494e7

Merge branch 'main' of https://github.com/wenhuach21/blog

5c84b30

SunMarc merged commit 4ddb31a into huggingface:main Apr 29, 2025
