Describe the bug
I'm finetuning Qwen2.5-3B-Instruct but encountering a very slow finetuning process.

Steps to reproduce
```shell
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift/
pip install -v -e .
pip install polars polars-lts-cpu deepspeed wandb datasets
```
Run:

```shell
python gen_data.py
```
```python
import polars as pl
from tqdm import tqdm
from datasets import load_dataset

data = load_dataset(
    "BlossomsAI/reduced_vietnamese_instruction_dataset",
    split="train",
    cache_dir="cache_data",
)

results = []
for d in tqdm(data, total=len(data)):
    r = {
        "instruction": d["instruction"],
        "input": d["input"],
        "output": d["output"],
    }
    results.append(r)

df = pl.DataFrame(results)
df.write_ndjson("data.jsonl")
```
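Since `write_ndjson` emits one JSON object per line, each line of `data.jsonl` should parse on its own. A minimal stdlib sanity check (no polars required; the sample record below is hypothetical, just mirroring the instruction/input/output schema) could be:

```python
import json

# Hypothetical record mirroring the schema gen_data.py writes:
# one JSON object per line (NDJSON) with instruction/input/output keys.
sample = {"instruction": "Translate to English.", "input": "Xin chào", "output": "Hello"}

with open("data.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Read it back line by line; every line must be a complete JSON object.
with open("data.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]

print(rows[0]["output"])  # Hello
```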
Run:

```shell
bash sft_qwen2_5_3b.sh
```
```shell
#!/bin/bash
NPROC_PER_NODE=4 \
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
    --model Qwen/Qwen2.5-3B-Instruct \
    --train_type lora \
    --dataset 'data.jsonl' \
    --torch_dtype bfloat16 \
    --report_to wandb \
    --num_train_epochs 1 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --deepspeed zero3 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --gradient_accumulation_steps 16 \
    --eval_steps 500 \
    --save_steps 500 \
    --save_total_limit 2 \
    --logging_steps 50 \
    --max_length 4096 \
    --output_dir output \
    --warmup_ratio 0.05 \
    --dataset_num_proc 1 \
    --dataloader_num_workers 4 \
    --use_hf true
```
Your hardware and system info
Additional context
total_batch_size = 4 * 2 * 16 = 128
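That figure comes straight from the training flags: number of GPUs times per-device batch size times gradient accumulation steps. As a quick sketch:

```python
# Effective global batch size, using the values from sft_qwen2_5_3b.sh.
num_gpus = 4           # CUDA_VISIBLE_DEVICES=0,1,2,3
per_device_batch = 2   # --per_device_train_batch_size
grad_accum_steps = 16  # --gradient_accumulation_steps

total_batch_size = num_gpus * per_device_batch * grad_accum_steps
print(total_batch_size)  # 128
```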
ZeRO-3 is relatively slow, which I think is normal.
Hi @Jintao-Huang, thanks a lot! I have changed to `zero2` and it's much faster.