Main LSTM Weight Initialization #3

@Ar-Kareem

Hi,

I was looking at the weight initialization, and it looks like you use xavier_uniform_ for the main LSTM's input-to-hidden weights:

init.xavier_uniform_(self.linear_ih.weight.data)

However, the paper states that both the input-to-hidden and the hidden-to-hidden weights should use orthogonal initialization (page 20, Section A.2.3):

Orthogonal initialization is applied to the Wh and Wx

I am not sure whether the TensorFlow implementation follows the paper on this point. Could you elaborate on why you decided to use Xavier uniform? Was it carried over from the TensorFlow implementation of the model, or is it possibly an error in the code?
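
For reference, here is a minimal sketch of what following the paper might look like in PyTorch. Only the linear_ih name and the xavier_uniform_ call come from the repository; the class name LSTMCellSketch, the linear_hh attribute, the 4 * hidden_size gate stacking, and the sizes are assumptions made for illustration.

```python
import torch
from torch import nn
from torch.nn import init


class LSTMCellSketch(nn.Module):
    """Hypothetical stand-in for the main LSTM cell; only the init matters here."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # linear_ih is the name used in the repo; linear_hh is an assumed name.
        # Both layers are assumed to stack the four gates along the output dim.
        self.linear_ih = nn.Linear(input_size, 4 * hidden_size)
        self.linear_hh = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Orthogonal init for both Wx (input-to-hidden) and Wh (hidden-to-hidden),
        # replacing xavier_uniform_ on Wx.
        init.orthogonal_(self.linear_ih.weight)
        init.orthogonal_(self.linear_hh.weight)


cell = LSTMCellSketch(input_size=8, hidden_size=4)
w = cell.linear_ih.weight.detach()
# The weight is 16 x 8 (more rows than columns), so W^T W should be the identity.
print(torch.allclose(w.T @ w, torch.eye(8), atol=1e-4))
```

Note that this orthogonalizes each stacked gate matrix as a whole. Initializing every gate block separately would be a different design choice, and I have not checked which one the paper intends.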
