Support internlm 2.5 model. #14

pdx1989 · 2024-08-22T06:10:52Z

Support internlm 2.5 model.

* support ascend using infer_ext
* fix(ascend): make infer_ext using TND format q,k,v in paged_token_attention
* support ascend using infer_ext
* feat: support ascend moe_gating_topk_softmax
* feat: change infer_ext ops function param order (#2)
* ascend: align attention mask to 32bytes (#7)
* fix attn args (#9)
* fix: expand shape of attn_mask (#10)
* feat: udpate infer_ext ops interface (#13)
* rename infer_ext to dlinfer
* format code
* Support internlm 2.5 (#14)
* refactor ascend pagedattention
* fix ascend apply_rotary_pos_emb
* fix import dlinfer (#16)
* fix: fix rms_norm params (#18)
* fix sync on ascend

---------

Co-authored-by: chenchiyu <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Wei Tao <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: pdx1989 <[email protected]>

Support internlm 2.5

4aaaa0c

pdx1989 requested review from jinminxi104 and Reinerzhou August 22, 2024 06:14

jinminxi104 merged commit 51ec61c into infer_ext Aug 22, 2024

yao-fengchen pushed a commit that referenced this pull request Aug 22, 2024

Support internlm 2.5 (#14)

ef4d9c8

jinminxi104 pushed a commit that referenced this pull request Aug 22, 2024

Support internlm 2.5 (#14)

f02faaa

yao-fengchen pushed a commit that referenced this pull request Aug 23, 2024

Support internlm 2.5 (#14)

41e1398
