Software Environment
- paddlepaddle: 2.5.0
- paddlepaddle-xpu: 2.5.0
- paddlenlp: 2.6.0rc0.post0
Error Description
Calling fleet.meta_parallel.get_rng_state_tracker().get_states_tracker() to collect the RNG states into a dict and then saving that dict with paddle.save fails with: TypeError: cannot pickle 'paddle.fluid.libpaddle.GeneratorState' object. The cause is that GeneratorState is a C++ struct and cannot be serialized by pickle.
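For reference, a minimal sketch of the failing pattern described above, assuming a hybrid-parallel fleet environment has already been initialized; the output filename is illustrative only:

```python
import paddle
from paddle.distributed import fleet

# Assumes fleet has been initialized with hybrid parallelism, so the RNG
# state tracker already holds per-rank generator states.
tracker = fleet.meta_parallel.get_rng_state_tracker()
states = tracker.get_states_tracker()  # dict of GeneratorState objects

# On paddlepaddle-xpu 2.5.0 this raises:
# TypeError: cannot pickle 'paddle.fluid.libpaddle.GeneratorState' object
paddle.save({"rng_tracker_states": states}, "rng_state.pdparams")
```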
Stable Reproduction Steps & Code
export PYTHONPATH="../../PaddleNLP/"
log_dir="log"
rm -rf $log_dir
python3 -m paddle.distributed.launch --xpus "0" --log_dir ${log_dir} finetune_generation.py \
    --model_type "gpt" \
    --model_name_or_path gpt2-medium-en \
    --output_dir "output/$task_name" \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --tensor_parallel_degree 1 \
    --pipeline_parallel_degree 1 \
    --scale_loss 1024 \
    --learning_rate 0.00001 \
    --max_steps 10000 \
    --save_steps 5000 \
    --weight_decay 0.01 \
    --warmup_ratio 0.01 \
    --max_grad_norm 1.0 \
    --logging_steps 1 \
    --dataloader_num_workers 1 \
    --sharding "stage2" \
    --max_evaluate_steps 1000 \
    --eval_steps 1000 \
    --report_to "visualdl" \
    --disable_tqdm true \
    --recompute 1 \
    --gradient_accumulation_steps 2 \
    --do_train \
    --do_eval \
    --device "xpu"
Hello, please try upgrading to the latest version first. Also, we have no plans to support XPU yet; it needs internal discussion. Contributions from developers are welcome.