
After fine-tuning Intervl2-8B, webui inference produces no output and reports no error #3944


Open
MaTengSYSU opened this issue Apr 21, 2025 · 1 comment

@MaTengSYSU

Describe the bug
After fine-tuning Intervl2-8B, inference through the webui produces no output, and no error is reported.
After entering text and an image and clicking Send, the page keeps loading indefinitely and never returns a response.

As shown in the screenshots:

[Three screenshots attached in the original issue]

The log output is as follows:

run sh: `/home/mateng/anaconda3/envs/ms-swift/bin/python /mnt/sda1/mateng/ms-swift/swift/cli/deploy.py --model_type internvl2 --template internvl2 --max_new_tokens 512 --temperature 0.3 --top_k 20 --top_p 0.7 --repetition_penalty 1.05 --system 你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。 --ckpt_dir /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged --train_type lora --port 8000 --log_file /mnt/sda1/mateng/ms-swift/output/internvl2-2025420224143/run_deploy.log --ignore_args_error true`
[INFO:swift] Successfully registered `/mnt/sda1/mateng/ms-swift/swift/llm/dataset/data/dataset_info.json`.
[WARNING:swift] The `--ckpt_dir` parameter will be removed in `ms-swift>=3.4`. Please use `--model`, `--adapters`.
[INFO:swift] Successfully loaded /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/args.json.
[INFO:swift] rank: -1, local_rank: -1, world_size: 1, local_world_size: 1
[INFO:swift] Loading the model using model_dir: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged
[INFO:swift] args.result_path: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/deploy_result/20250420-224150.jsonl
[WARNING:swift] remaining_argv: ['--log_file', '/mnt/sda1/mateng/ms-swift/output/internvl2-2025420224143/run_deploy.log']
[INFO:swift] Global seed set to 42
[INFO:swift] args: DeployArguments(model='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged', model_type='internvl2', model_revision=None, task_type='causal_lm', torch_dtype=torch.bfloat16, attn_impl='flash_attn', num_labels=None, problem_type=None, rope_scaling=None, device_map=None, max_memory={}, local_repo_path=None, template='internvl2', system='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。', max_length=None, truncation_strategy='delete', max_pixels=None, tools_prompt='react_en', norm_bbox=None, response_prefix=None, padding_side='right', loss_scale='default', sequence_parallel_size=1, use_chat_template=True, template_backend='swift', dataset=[], val_dataset=[], split_dataset_ratio=0.01, data_seed=42, dataset_num_proc=1, dataset_shuffle=True, val_dataset_shuffle=False, streaming=False, interleave_prob=None, stopping_strategy='first_exhausted', shuffle_buffer_size=1000, enable_cache=False, download_mode='reuse_dataset_if_exists', columns={}, strict=False, remove_unused_columns=True, model_name=[None, None], model_author=[None, None], custom_dataset_info=[], quant_method=None, quant_bits=None, hqq_axis=None, bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=512, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.05, num_beams=1, stream=False, stop_words=[], logprobs=False, top_logprobs=None, ckpt_dir='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged', load_dataset_config=None, lora_modules=[], tuner_backend='peft', train_type='lora', adapters=[], external_plugins=[], seed=42, model_kwargs={}, load_args=True, load_data_args=False, use_hf=False, hub_token=None, custom_register_path=[], ignore_args_error=True, use_swift_lora=False, tp=1, session_len=None, cache_max_entry_count=0.8, quant_policy=0, vision_batch_size=1, gpu_memory_utilization=0.9, tensor_parallel_size=1, pipeline_parallel_size=1, 
max_num_seqs=256, max_model_len=None, disable_custom_all_reduce=False, enforce_eager=False, limit_mm_per_prompt={}, vllm_max_lora_rank=16, enable_prefix_caching=False, merge_lora=False, safe_serialization=True, max_shard_size='5GB', infer_backend='pt', result_path='/mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged/deploy_result/20250420-224150.jsonl', metric=None, max_batch_size=1, ddp_backend=None, val_dataset_sample=None, host='0.0.0.0', port=8000, api_key=None, ssl_keyfile=None, ssl_certfile=None, owned_by='swift', served_model_name=None, verbose=True, log_interval=20, max_logprobs=20)
[INFO:swift] Loading the model using model_dir: /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged
[INFO:swift] model_kwargs: {'device_map': 'cuda:0'}
/home/mateng/anaconda3/envs/ms-swift/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards:  25%|██▌       | 1/4 [00:00<00:02,  1.02it/s]
Loading checkpoint shards:  50%|█████     | 2/4 [00:01<00:01,  1.26it/s]
Loading checkpoint shards:  75%|███████▌  | 3/4 [00:02<00:00,  1.37it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00,  1.94it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00,  1.61it/s]
[INFO:swift] default_system: 你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。
[INFO:swift] model: InternVLChatModel(
  (vision_model): InternVisionModel(
    (embeddings): InternVisionEmbeddings(
      (patch_embedding): Conv2d(3, 1024, kernel_size=(14, 14), stride=(14, 14))
    )
    (encoder): InternVisionEncoder(
      (layers): ModuleList(
        (0-23): 24 x InternVisionEncoderLayer(
          (attn): InternAttention(
            (qkv): Linear(in_features=1024, out_features=3072, bias=True)
            (attn_drop): Dropout(p=0.0, inplace=False)
            (proj_drop): Dropout(p=0.0, inplace=False)
            (inner_attn): FlashAttention()
            (proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (mlp): InternMLP(
            (act): GELUActivation()
            (fc1): Linear(in_features=1024, out_features=4096, bias=True)
            (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          )
          (norm1): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
          (norm2): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
          (drop_path1): Identity()
          (drop_path2): Identity()
        )
      )
    )
  )
  (language_model): InternLM2ForCausalLM(
    (model): InternLM2Model(
      (tok_embeddings): Embedding(92553, 4096, padding_idx=2)
      (layers): ModuleList(
        (0-31): 32 x InternLM2DecoderLayer(
          (attention): InternLM2FlashAttention2(
            (wqkv): Linear(in_features=4096, out_features=6144, bias=False)
            (wo): Linear(in_features=4096, out_features=4096, bias=False)
            (rotary_emb): InternLM2DynamicNTKScalingRotaryEmbedding()
          )
          (feed_forward): InternLM2MLP(
            (w1): Linear(in_features=4096, out_features=14336, bias=False)
            (w3): Linear(in_features=4096, out_features=14336, bias=False)
            (w2): Linear(in_features=14336, out_features=4096, bias=False)
            (act_fn): SiLU()
          )
          (attention_norm): InternLM2RMSNorm()
          (ffn_norm): InternLM2RMSNorm()
        )
      )
      (norm): InternLM2RMSNorm()
    )
    (output): Linear(in_features=4096, out_features=92553, bias=False)
  )
  (mlp1): Sequential(
    (0): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
    (1): Linear(in_features=4096, out_features=4096, bias=True)
    (2): GELU(approximate='none')
    (3): Linear(in_features=4096, out_features=4096, bias=True)
  )
)
[INFO:swift] Start time of running main: 2025-04-20 22:41:54.192437
[INFO:swift] model_list: ['checkpoint-500-merged']
INFO:     Started server process [4038634]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.05399824, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02023249, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024324, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022246, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.00432876, 'samples/s': 0.0, 'tokens/s': 0.0}
INFO:     127.0.0.1:35920 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:35920 - "GET /favicon.ico HTTP/1.1" 404 Not Found
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022796, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02025343, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022664, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024512, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022405, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024439, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022376, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.00279834, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.020243, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02022321, 'samples/s': 0.0, 'tokens/s': 0.0}
[INFO:swift] {'num_prompt_tokens': 0, 'num_generated_tokens': 0, 'num_samples': 0, 'runtime': 20.02024113, 'samples/s': 0.0, 'tokens/s': 0.0}
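The log shows the server starting cleanly but `num_samples` staying at 0, which suggests no request ever reaches the model. One way to narrow this down is to call the deployed endpoint directly, bypassing the webui. The sketch below assumes the server exposes the standard OpenAI-compatible `/v1/chat/completions` route on port 8000 (as implied by `served_model_name`/`port` in the log); the helper `build_chat_request` and the image URL are hypothetical placeholders.

```python
import json

def build_chat_request(model: str, text: str, image_url: str) -> dict:
    """Build an OpenAI-style multimodal chat request body.

    The message shape (list of typed content parts) follows the OpenAI
    chat-completions convention; check it against the ms-swift deployment
    docs before relying on it.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": text},
                ],
            }
        ],
        "max_tokens": 512,
        "temperature": 0.3,
        "stream": False,
    }

payload = build_chat_request(
    "checkpoint-500-merged",       # served model name reported in the log above
    "Describe this image.",
    "http://example.com/cat.jpg",  # placeholder image URL
)

# To actually send it against the server from the log (not run here):
# import requests
# resp = requests.post(
#     "http://127.0.0.1:8000/v1/chat/completions",
#     data=json.dumps(payload),
#     headers={"Content-Type": "application/json"},
#     timeout=120,
# )
# print(resp.status_code, resp.json())
```

If this direct call also hangs, the problem is in the server or model; if it returns normally, the webui-to-server wiring is the suspect.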

@Jintao-Huang
Collaborator

Try using `swift app`.
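A minimal sketch of that suggestion, reusing the merged checkpoint from the log above. The exact flag set is an assumption; verify against `swift app --help` for your ms-swift version.

```shell
# Hypothetical invocation: launch the built-in app UI on the merged checkpoint.
swift app \
  --model /mnt/sda1/mateng/ms-swift/output/v0-20250420-074407/checkpoint-500-merged \
  --model_type internvl2 \
  --template internvl2 \
  --max_new_tokens 512 \
  --temperature 0.3
```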
