
After fine-tuning Intervl2-8B, webui inference produces no output and reports no error #3944


Open
MaTengSYSU opened this issue Apr 21, 2025 · 1 comment

@MaTengSYSU

Describe the bug
After fine-tuning Intervl2-8B, inference through the webui produces no output, and no error is reported.
After entering text and an image and clicking Send, the page keeps loading indefinitely and never returns a response.

As shown in the screenshots:

[Three screenshots attached in the original issue]

The log output is as follows:

run sh: `/home/mateng/anaconda3/envs/ms-swift/bin/python /mnt/sda1/mateng/ms-swift/swift/cli/deploy.py --model_type internvl2 --template internvl2 --max_new_tokens 512 --temperature 0.3 --top_k 20 --top_p 0.7 --repetition_penalty 1.05 --system 你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。 --ckpt_dir /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged --train_type lora --port 8000 --log_file /mnt/sda1/mateng/ms-swift/output/internvl2-2025420224143/run_deploy.log --ignore_args_error true`
[INFO:swift] Successfully registered `/mnt/sda1/mateng/ms-swift/swift/llm/dataset/data/dataset_info.json`.
[WARNING:swift] The `--ckpt_dir` parameter will be removed in `ms-swift>=3.4`. Please use `--model`, `--adapters`.
[INFO:swift] Successfully loaded /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/args.json.
[INFO:swift] rank: -1, local_rank: -1, world_size: 1, local_world_size: 1
[INFO:swift] Loading the model using model_dir: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged
[INFO:swift] args.result_path: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/deploy_result/20250420-224150.jsonl
[WARNING:swift] remaining_argv: ['--log_file', '/mnt/sda1/mateng/ms-swift/output/internvl2-2025420224143/run_deploy.log']
[INFO:swift] Global seed set to 42
[INFO:swift] args: DeployArguments(model='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged', model_type='internvl2', model_revision=None, task_type='causal_lm', torch_dtype=torch.bfloat16, attn_impl='flash_attn', num_labels=None, problem_type=None, rope_scaling=None, device_map=None, max_memory={}, local_repo_path=None, template='internvl2', system='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。', max_length=None, truncation_strategy='delete', max_pixels=None, tools_prompt='react_en', norm_bbox=None, response_prefix=None, padding_side='right', loss_scale='default', sequence_parallel_size=1, use_chat_template=True, template_backend='swift', dataset=[], val_dataset=[], split_dataset_ratio=0.01, data_seed=42, dataset_num_proc=1, dataset_shuffle=True, val_dataset_shuffle=False, streaming=False, interleave_prob=None, stopping_strategy='first_exhausted', shuffle_buffer_size=1000, enable_cache=False, download_mode='reuse_dataset_if_exists', columns={}, strict=False, remove_unused_columns=True, model_name=[None, None], model_author=[None, None], custom_dataset_info=[], quant_method=None, quant_bits=None, hqq_axis=None, bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=512, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.05, num_beams=1, stream=False, stop_words=[], logprobs=False, top_logprobs=None, ckpt_dir='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged', load_dataset_config=None, lora_modules=[], tuner_backend='peft', train_type='lora', adapters=[], external_plugins=[], seed=42, model_kwargs={}, load_args=True, load_data_args=False, use_hf=False, hub_token=None, custom_register_path=[], ignore_args_error=True, use_swift_lora=False, tp=1, session_len=None, cache_max_entry_count=0.8, quant_policy=0, vision_batch_size=1, gpu_memory_utilization=0.9, tensor_parallel_size=1, pipeline_parallel_size=1, 
max_num_seqs=256, max_model_len=None, disable_custom_all_reduce=False, enforce_eager=False, limit_mm_per_prompt={}, vllm_max_lora_rank=16, enable_prefix_caching=False, merge_lora=False, safe_serialization=True, max_shard_size='5GB', infer_backend='pt', result_path='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/deploy_result/20250420-224150.jsonl', metric=None, max_batch_size=1, ddp_backend=None, val_dataset_sample=None, host='0.0.0.0', port=8000, api_key=None, ssl_keyfile=None, ssl_certfile=None, owned_by='swift', served_model_name=None, verbose=True, log_interval=20, max_logprobs=20)
[INFO:swift] Loading the model using model_dir: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged
[INFO:swift] model_kwargs: {'device_map': 'cuda:0'}
/home/mateng/anaconda3/envs/ms-swift/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards:  25%|██▌       | 1/4 [00:00<00:02,  1.02it/s]
Loading checkpoint shards:  50%|█████     | 2/4 [00:01<00:01,  1.26it/s]
Loading checkpoint shards:  75%|███████▌  | 3/4 [00:02<00:00,  1.37it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00,  1.94it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00,  1.61it/s]
[INFO:swift] default_system: 你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。
[INFO:swift] model: InternVLChatModel(
  (vision_model): InternVisionModel(
    (embeddings): InternVisionEmbeddings(
      (patch_embedding): Conv2d(3, 1024, kernel_size=(14, 14), stride=(14, 14))
    )
    (encoder): InternVisionEncoder(
      (layers): ModuleList(
        (0-23): 24 x InternVisionEncoderLayer(
          (attn): InternAttention(
            (qkv): Linear(in_features=1024, out_features=3072, bias=True)
            (attn_drop): Dropout(p=0.0, inplace=False)
            (proj_drop): Dropout(p=0.0, inplace=False)
            (inner_attn): FlashAttention()
            (proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (mlp): InternMLP(
            (act): GELUActivation()
            (fc1): Linear(in_features=1024, out_features=4096, bias=True)
            (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          )
          (norm1): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
          (norm2): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
          (drop_path1): Identity()
          (drop_path2): Identity()
        )
      )
    )
  )
  (language_model): InternLM2ForCausalLM(
    (model): InternLM2Model(
      (tok_embeddings): Embedding(92553, 4096, padding_idx=2)
      (layers): ModuleList(
        (0-31): 32 x InternLM2DecoderLayer(
          (attention): InternLM2FlashAttention2(
            (wqkv): Linear(in_features=4096, out_features=6144, bias=False)
            (wo): Linear(in_features=4096, out_features=4096, bias=False)
            (rotary_emb): InternLM2DynamicNTKScalingRotaryEmbedding()
          )
          (feed_forward): InternLM2MLP(
            (w1): Linear(in_features=4096, out_features=14336, bias=False)
            (w3): Linear(in_features=4096, out_features=14336, bias=False)
            (w2): Linear(in_features=14336, out_features=4096, bias=False)
            (act_fn): SiLU()
          )
          (attention_norm): InternLM2RMSNorm()
          (ffn_norm): InternLM2RMSNorm()
        )
      )
      (norm): InternLM2RMSNorm()
    )
    (output): Linear(in_features=4096, out_features=92553, bias=False)
  )
  (mlp1): Sequential(
    (0): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
    (1): Linear(in_features=4096, out_features=4096, bias=True)
    (2): GELU(approximate='none')
    (3): Linear(in_features=4096, out_features=4096, bias=True)
  )
)
[INFO:swift] Start time of running main: 2025-04-20 22:41:54.192437
[INFO:swift] model_list: ['checkpoint-500-merged']
INFO:     Started server process [4038634]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.05399824, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02023249, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024324, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022246, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.00432876, 'samples/s': 0.0, 'tokens/s': 0.0}
INFO:     127.0.0.1:35920 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:35920 - "GET /favicon.ico HTTP/1.1" 404 Not Found
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022796, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02025343, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022664, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024512, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022405, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024439, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022376, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.00279834, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.020243, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022321, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024113, 'samples/s': 0.0, 'tokens/s': 0.0}
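The log shows the server starting cleanly but `num_samples` staying at 0, which suggests no request ever reaches the model. One way to narrow this down is to call the deployed endpoint directly, bypassing the webui. The sketch below assumes the server exposes the standard OpenAI-compatible `/v1/chat/completions` route on port 8000 (as implied by `served_model_name`/`port` in the log); the helper `build_chat_request` and the image URL are hypothetical placeholders.

```python
import json

def build_chat_request(model: str, text: str, image_url: str) -> dict:
    """Build an OpenAI-style multimodal chat request body.

    The message shape (list of typed content parts) follows the OpenAI
    chat-completions convention; check it against the ms-swift deployment
    docs before relying on it.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": text},
                ],
            }
        ],
        "max_tokens": 512,
        "temperature": 0.3,
        "stream": False,
    }

payload = build_chat_request(
    "checkpoint-500-merged",       # served model name reported in the log above
    "Describe this image.",
    "http://example.com/cat.jpg",  # placeholder image URL
)

# To actually send it against the server from the log (not run here):
# import requests
# resp = requests.post(
#     "http://127.0.0.1:8000/v1/chat/completions",
#     data=json.dumps(payload),
#     headers={"Content-Type": "application/json"},
#     timeout=120,
# )
# print(resp.status_code, resp.json())
```

If this direct call also hangs, the problem is in the server or model; if it returns normally, the webui-to-server wiring is the suspect.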

@Jintao-Huang
Collaborator

Try using `swift app`.
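A minimal sketch of that suggestion, reusing the merged checkpoint from the log above. The exact flag set is an assumption; verify against `swift app --help` for your ms-swift version.

```shell
# Hypothetical invocation: launch the built-in app UI on the merged checkpoint.
swift app \
  --model /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged \
  --model_type internvl2 \
  --template internvl2 \
  --max_new_tokens 512 \
  --temperature 0.3
```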
