
Commit c083f28

Fix broken link in NovoGrad docstring (#2096)
* Fix broken link in NovoGrad docstring
* Fix typo in NovoGrad paper title
* Add missing parenthesis
* Remove unneeded wordy phrase
1 parent f0a8c74 commit c083f28

File tree

1 file changed (+10, −10 lines)


tensorflow_addons/optimizers/novograd.py

Lines changed: 10 additions & 10 deletions
@@ -23,16 +23,16 @@
 
 @tf.keras.utils.register_keras_serializable(package="Addons")
 class NovoGrad(tf.keras.optimizers.Optimizer):
-    """The NovoGrad Optimizer was first proposed in [Stochastic Gradient
-    Methods with Layerwise Adaptvie Moments for training of Deep
-    Networks](https://arxiv.org/pdf/1905.11286.pdf)
-
-    NovoGrad is a first-order SGD-based algorithm, which computes second
-    moments per layer instead of per weight as in Adam. Compared to Adam,
-    NovoGrad takes less memory, and has been found to be more numerically
-    stable. More specifically we compute (for more information on the
-    computation please refer to this
-    [link](https://nvidia.github.io/OpenSeq2Seq/html/optimizers.html):
+    """Optimizer that implements NovoGrad.
+
+    The NovoGrad Optimizer was first proposed in [Stochastic Gradient
+    Methods with Layerwise Adaptive Moments for training of Deep
+    Networks](https://arxiv.org/pdf/1905.11286.pdf) NovoGrad is a
+    first-order SGD-based algorithm, which computes second moments per
+    layer instead of per weight as in Adam. Compared to Adam, NovoGrad
+    takes less memory, and has been found to be more numerically stable.
+    (For more information on the computation please refer to this
+    [link](https://nvidia.github.io/OpenSeq2Seq/html/optimizers.html))
 
     Second order moment = exponential moving average of Layer-wise square
     of grads:
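
For context, here is a minimal usage sketch of the optimizer whose docstring this commit edits. The model architecture and hyperparameter values are illustrative placeholders, not taken from the commit or the library defaults:

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Toy model; the layer sizes and input shape are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

model.compile(
    optimizer=tfa.optimizers.NovoGrad(
        learning_rate=0.01,
        beta_1=0.9,
        beta_2=0.25,        # EMA coefficient for the per-layer second moment
        weight_decay=0.001,
        grad_averaging=False,
    ),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Conceptually, the "second order moment" mentioned in the docstring is kept
# per layer: roughly v_l <- beta_2 * v_l + (1 - beta_2) * ||g_l||^2 for each
# layer's gradient g_l, i.e. one scalar per layer rather than one value per
# weight as in Adam.
```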
