[Question]: How were utc-base and utc-large trained? #6541

Closed
Jeffery-lord opened this issue Jul 28, 2023 · 2 comments
Labels
question (Further information is requested), triage

Comments

@Jeffery-lord
Please describe your question

I have two questions:
1. Why does utc-large perform worse than utc-base on Chinese datasets? I tested three datasets, two public ones and one of my own, and reached essentially the same conclusion on all of them. utc-large and utc-base should differ only in their training corpora, yet utc-large, despite having more parameters, actually performs worse.
2. Is the base model of utc-base and utc-large ERNIE 3.0? According to the USM paper, the pre-training format and the fine-tuning format are the same, so can we assume that both pre-training and fine-tuning use run_train.py, just with different base models? If so, which Chinese classification datasets were used to train from ERNIE 3.0 to UTC?

@Jeffery-lord Jeffery-lord added the question (Further information is requested) label Jul 28, 2023
@JoshonSmith
Same question here.

@w5688414
Contributor

w5688414 commented May 7, 2024

The poor results may be related to the learning rate and similar hyperparameters. Larger models overfit more easily and therefore need more data.

Also, there are currently no plans to release the training details of the utc-base Chinese model.
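The advice above can be acted on when fine-tuning: if utc-large is overfitting, lower the learning rate and/or increase the training data relative to the utc-base run. As a hedged sketch, a fine-tuning invocation through PaddleNLP's run_train.py might look like the following; the exact flag names and values are assumptions based on common PaddleNLP training-script conventions, and the specific numbers (1e-5, batch size 4, 10 epochs) are illustrative, not recommended settings from the maintainers.

```shell
# Hypothetical sketch: fine-tune the larger model with a reduced learning
# rate to counter its greater tendency to overfit. Flag names follow the
# usual PaddleNLP Trainer conventions but should be checked against the
# script's --help output for your PaddleNLP version.
python run_train.py \
    --model_name_or_path utc-large \
    --dataset_path ./data \
    --output_dir ./checkpoint/utc_large \
    --learning_rate 1e-5 \
    --per_device_train_batch_size 4 \
    --num_train_epochs 10 \
    --do_train \
    --do_eval
```

For a fair comparison against utc-base, it is worth sweeping the learning rate (e.g. 5e-6, 1e-5, 2e-5) separately for each model size rather than reusing one value, since the optimum typically shifts downward as the parameter count grows.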

@paddle-bot paddle-bot bot closed this as completed May 13, 2025

4 participants