This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Description
model.cpp: loading model from runtime_outs/ne_qwen2_q_autoround.bin
The number of ne_parameters is wrong.
init: n_vocab = 151936
init: n_embd = 1536
init: n_mult = 8960
init: n_head = 12
init: n_head_kv = 0
init: n_layer = 28
init: n_rot = 128
init: ftype = 0
init: max_seq_len= 32768
init: n_ff = 8960
init: n_parts = 1
MODEL_ASSERT: /root/w0/workspace/neuralspeed-wheel-build/nlp_repo/neural_speed/./models/qwen/qwen.h:48: false
/tmp/tmp9b4073w1: line 3: 55575 Aborted python /home/lmf/llm/Qwen2-finetuning/awq_intel_extension.py
ERROR conda.cli.main_run:execute(124): conda run python /home/lmf/llm/Qwen2-finetuning/awq_intel_extension.py failed. (See above for error)
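
For anyone triaging a similar abort, the `init:` lines above can be checked mechanically before digging into `qwen.h`. The following is only a minimal sketch, not part of Neural Speed: the helper names `parse_init_lines` and `zero_valued` are hypothetical, and the log text is copied from this report. It collects the printed hyperparameters, lists the ones that are zero, and compares `n_embd / n_head` with `n_rot`.

```python
import re

# The `init:` lines exactly as printed in the log above.
LOG = """\
init: n_vocab = 151936
init: n_embd = 1536
init: n_mult = 8960
init: n_head = 12
init: n_head_kv = 0
init: n_layer = 28
init: n_rot = 128
init: ftype = 0
init: max_seq_len= 32768
init: n_ff = 8960
init: n_parts = 1
"""


def parse_init_lines(text):
    """Collect `init: key = value` pairs into a dict of ints (hypothetical helper)."""
    pattern = re.compile(r"^init:[ \t]*(\w+)[ \t]*=[ \t]*(-?\d+)[ \t]*$", re.MULTILINE)
    return {m.group(1): int(m.group(2)) for m in pattern.finditer(text)}


def zero_valued(params):
    """Return the keys whose value is 0, in log order (hypothetical helper).

    A zero is not necessarily an error; the list is only a prompt for inspection.
    """
    return [key for key, value in params.items() if value == 0]


params = parse_init_lines(LOG)
head_dim = params["n_embd"] // params["n_head"]

print("zero-valued fields:", zero_valued(params))
print("n_embd / n_head =", head_dim, "| n_rot =", params["n_rot"])
```

On this log the per-head dimension (1536 / 12 = 128) agrees with `n_rot`, while `n_head_kv` is reported as 0. Whether that zero is related to the assertion at `qwen.h:48` cannot be determined from the log alone; it is simply the field worth comparing against the original model's configuration first.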