Wt/camb/pr #33

Closed
wants to merge 12 commits into from

JackWeiw

Motivation

This PR adds support for the camb dlinfer backend in both graph mode and eager mode.

JackWeiw and others added 12 commits December 30, 2024 11:13
* support w8a8 smooth_quant and loading

* optimize int8

* fix fp8 kernels

* update docs for w8a8

* resolve comments

* resolve comments

* fix ut

* disable not quant last norm

* disable quant last norm for cogvlm and minicpmv26 models

---------

Co-authored-by: grimoire <[email protected]>
* first

* better tuning

* restore tuning value
* remove threadsafe

* optimize performance

* 22.4

* 22.5

* delete jsonl

* add docs

* fix link

* rst

* remove sleep req step

* remove scheduler sleep

* fix ut

* recovery async engine
* Update ascend get_started.md

* Update ascend get_started.md

* fix Dockerfile_aarch64_ascend

JackWeiw closed this Jan 14, 2025
JackWeiw deleted the wt/camb/pr branch January 14, 2025 09:08
5 participants