
[new blog post] introducing auto-round #2826


Merged: 25 commits into huggingface:main on Apr 29, 2025

Conversation

@wenhuach21 (Contributor) commented Apr 23, 2025

Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

  • Add an entry to _blog.yml.
  • Add a thumbnail. There are no requirements here, but there is a template if it's helpful.
  • Check that you use a short title and blog path.
  • Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
  • Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
  • Ensure the publication date is correct.
  • Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish; this is just a way to do an early check.

Here is an example of a complete PR: #2382

Getting a Review

Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.

Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews
(e.g., check for proper metadata) rather than content reviews unless explicitly asked.

@hshen14 (Contributor) commented Apr 24, 2025

@IlyasMoutawwakil @SunMarc Please help review the PR. Thanks.

@SunMarc (Member) commented Apr 24, 2025

cc @MekkCyber

@MekkCyber (Contributor) left a comment

Great blogpost 🔥!

Small nit: for images other than the thumbnail, it's better to keep them here: https://huggingface.co/datasets/huggingface/documentation-images. You can open a PR, add them, and ping me or someone else to merge it.

Example of how to use the images after adding them: https://github.com/huggingface/blog/blob/main/1_58_llm_extreme_quantization.md

_blog.yml Outdated
Comment on lines 5907 to 5912
date: April 23, 2025
tags:
- llms
- inference
- quantization
- intel
reminder to change the date before publishing

@IlyasMoutawwakil (Member) left a comment

Looks good!
There are many unnecessary line breaks (in the middle of sentences), probably from copy-pasting.

@SunMarc (Member) left a comment

Thanks for this blog post! One minor concern I have is that it's a bit too detailed in places and feels a bit too much like documentation. It could be interesting to make it a bit lighter (fewer snippets) and link to the transformers docs for more details. WDYT?

autoround.md Outdated
Comment on lines 22 to 29
# What is AutoRound?

**AutoRound** is a weight-only post-training quantization (PTQ) method developed by Intel. It uses signed gradient
descent to jointly optimize weight rounding and clipping ranges, enabling accurate low-bit quantization (e.g.,
INT2 to INT8) with minimal accuracy loss in most scenarios. For example, at INT2 it outperforms popular baselines by achieving up to **2.1x higher relative accuracy**.

Despite its strong performance, AutoRound is fast and lightweight: quantizing a 72B model takes just **37 minutes on an
A100 GPU** under light mode. It also supports mixed-bit tuning, lm-head quantization, GPTQ/AWQ/GGUF format exporting, and flexible tuning recipes.
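As a rough illustration of the workflow described in the section quoted above, a minimal quantization sketch might look like the following (the model name is only a placeholder, and exact argument names can vary between auto-round versions; consult the AutoRound documentation for the current API):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

# Placeholder model; any causal LM loadable with transformers should work similarly.
model_name = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Weight-only 4-bit quantization with a typical group size.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()

# Save the quantized model; the GPTQ/AWQ/GGUF exports mentioned above are
# selected through the format argument in the same way.
autoround.save_quantized("./opt-125m-autoround", format="auto_round")
```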
A reviewer (Member) commented:

I feel like it could be a nice addition to give more details on how auto-round quantization works!

@wenhuach21 (Contributor, Author) replied:

To explain the AutoRound algorithm in detail, some background and mathematical concepts need to be introduced.
To simplify this blog, I’ve included an image that provides an overview of the algorithm and recommend readers refer to our paper for more details.
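For readers who want a feel for the algorithm without reading the paper, here is a toy sketch (not the actual AutoRound implementation; learned clipping ranges and block-wise tuning are omitted) of the core idea: learn a bounded rounding perturbation with signed gradient descent so that the quantized layer reproduces the original layer's output on calibration data. All sizes and the learning rate are illustrative.

```python
import torch

torch.manual_seed(0)
W = torch.randn(64, 64)        # original FP weights of one linear layer
X = torch.randn(128, 64)       # calibration activations feeding that layer
bits = 4
qmax = 2 ** (bits - 1) - 1     # symmetric int4 range is [-8, 7]

scale = W.abs().max() / qmax   # naive per-tensor scale (AutoRound uses per-group scales and learned clipping)
V = torch.zeros_like(W, requires_grad=True)  # learnable rounding perturbation, kept in [-0.5, 0.5]

lr = 5e-3
for _ in range(200):
    soft = W / scale + V
    hard = torch.clamp(torch.round(soft), -qmax - 1, qmax)
    # Straight-through estimator: the forward pass uses the rounded weights,
    # gradients flow back through the continuous `soft` values.
    Wq = ((hard - soft).detach() + soft) * scale
    loss = ((X @ W.T - X @ Wq.T) ** 2).mean()  # match the layer output, not the raw weights
    loss.backward()
    with torch.no_grad():
        V -= lr * V.grad.sign()  # signed gradient descent: only the sign of the gradient is used
        V.clamp_(-0.5, 0.5)
        V.grad.zero_()
```

The paper linked in the post covers the full formulation.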

@wenhuach21 (Contributor, Author) commented:

> Thanks for this blog post! One minor concern I have is that it's a bit too detailed in places and feels a bit too much like documentation. It could be interesting to make it a bit lighter (fewer snippets) and link to the transformers docs for more details. WDYT?

I've removed some details. Please check whether there are any other parts that need to be deleted or refined.

@wenhuach21 (Contributor, Author) commented:

> Great blogpost 🔥!
>
> Small nit: for images other than the thumbnail, it's better to keep them here: https://huggingface.co/datasets/huggingface/documentation-images. You can open a PR, add them, and ping me or someone else to merge it.
>
> Example of how to use the images after adding them: https://github.com/huggingface/blog/blob/main/1_58_llm_extreme_quantization.md

Thanks, the PR is here: https://huggingface.co/datasets/huggingface/documentation-images/discussions/482. I will switch to these images after it's merged.

@MekkCyber (Contributor) commented:

Thanks @wenhuach21! Merged: https://huggingface.co/datasets/huggingface/documentation-images/tree/main/auto-round

@wenhuach21 (Contributor, Author) commented:
Most issues have been addressed. Please review it again when you have a moment.

@SunMarc (Member) left a comment

Thanks for iterating! This is much better. Let us know when you want this merged!

@wenhuach21 (Contributor, Author) commented:

> Thanks for iterating! This is much better. Let us know when you want this merged!

Everything is ready. Could you kindly merge it at your convenience? Thank you in advance!

@hshen14 (Contributor) commented Apr 29, 2025

> Thanks for iterating! This is much better. Let us know when you want this merged!
>
> Everything is ready. Could you kindly merge it at your convenience? Thank you in advance!

@IlyasMoutawwakil @MekkCyber please review and approve. Thanks.

@SunMarc merged commit 4ddb31a into huggingface:main on Apr 29, 2025
@hshen14 (Contributor) commented Apr 29, 2025

Thank you @wenhuach21 @SunMarc @IlyasMoutawwakil @MekkCyber !

@julien-c (Member) commented May 2, 2025

Don't forget to link authors to the Intel org like I've done in dc8c591.

This brings more visibility to the organization, and displays the blogpost on the org page:

[screenshot: the blog post shown on the Intel org page]
