Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue #2707
When using MTP with `tie_word_embeddings=true`, the `eh_proj` weight may be transposed unexpectedly. The root cause is that the `eh_proj` layer in MTP reuses the `ParallelLMHead` class: when MTP is enabled, `ParallelLMHead`'s internal `tie_word_embeddings` check triggers a weight transposition that is wrong for `eh_proj`. This PR therefore extracts MTP's `eh_proj` into its own layer, which also simplifies subsequent development.
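
For illustration, below is a minimal PyTorch-style sketch of what a dedicated `eh_proj` layer can look like. The class name `MTPEHProj`, the `load_weight` hook, and the `2 * hidden -> hidden` shape (as in DeepSeek-V3-style MTP heads) are assumptions for this example, not this project's actual API. The point is that a standalone layer has no `tie_word_embeddings` branch in its weight loader, so the checkpoint weight is copied as-is and never transposed.

```python
import torch
from torch import nn


class MTPEHProj(nn.Module):
    """Hypothetical standalone eh_proj layer for MTP.

    Unlike ParallelLMHead, this class has no tie_word_embeddings
    branch in its weight loader, so checkpoint weights are copied
    verbatim and never transposed.
    """

    def __init__(self, hidden_size: int) -> None:
        super().__init__()
        # Assumed DeepSeek-V3-style MTP shape: project the concatenation
        # of the token embedding and the previous hidden state
        # (2 * hidden) back down to hidden, with no bias.
        self.weight = nn.Parameter(torch.empty(hidden_size, 2 * hidden_size))

    def load_weight(self, loaded_weight: torch.Tensor) -> None:
        # Direct copy: no tie_word_embeddings check, no transpose.
        assert loaded_weight.shape == self.weight.shape
        self.weight.data.copy_(loaded_weight)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: [..., 2 * hidden] -> [..., hidden]
        return nn.functional.linear(hidden_states, self.weight)
```

Because the loader is a plain copy, the layer is insensitive to whatever `tie_word_embeddings` logic the LM head applies to its own (shared) embedding weight, which is the behavior this PR is after.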