like $Wu + b$, the bias parameter $b$ gets cancelled due to normalization.
So you can and should omit the bias parameter in linear transforms right before
batch normalization.</p>
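<p>A minimal sketch of that recommendation (the layer sizes and module choices here are illustrative assumptions, not from the original text): the linear layer feeding a batch normalization layer can drop its bias, since the normalization subtracts it away and the batch norm layer has its own learnable shift.</p>
<pre><code>import torch.nn as nn

# The per-feature mean subtraction in batch normalization cancels any constant
# bias added by the preceding linear layer, so the bias can be omitted.
block = nn.Sequential(
    nn.Linear(256, 128, bias=False),  # bias is redundant right before BatchNorm
    nn.BatchNorm1d(128),              # provides its own learnable shift (beta)
    nn.ReLU(),
)
</code></pre>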
<p>Batch normalization also makes the back-propagation invariant to the scale of the weights
and empirically it improves generalization, so it has regularization effects too.</p>
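<p>As a rough sketch of why the scale invariance holds (an illustration inferred from the normalization step, not text from the original): if the weights are scaled by a positive factor $a$, the pre-activations and their batch statistics scale by the same factor, so $\text{BN}\big((aW)u\big) = \frac{aWu - a\mathbb{E}[Wu]}{\sqrt{a^2 Var[Wu]}} = \text{BN}(Wu)$; the gradients flowing back to the inputs are therefore unchanged, while the gradients with respect to the scaled weights shrink by a factor of $1/a$.</p>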
<h2>Inference</h2>
<p>We need to know $\mathbb{E}[x^{(k)}]$ and $Var[x^{(k)}]$ in order to
perform the normalization.
So during inference, you either need to go through the whole dataset (or a part of it)
and find the mean and variance, or you can use an estimate calculated during training.
The usual practice is to calculate an exponential moving average of the
mean and variance during the training phase and use that for inference.</p>
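<p>A minimal sketch of that practice (a standalone illustration with assumed names and a made-up momentum value, not the library's implementation):</p>
<pre><code>import torch

features = 64    # assumed feature dimension
momentum = 0.1   # assumed smoothing factor for the moving averages
running_mean = torch.zeros(features)
running_var = torch.ones(features)

def batch_norm(x: torch.Tensor, training: bool, eps: float = 1e-5) -> torch.Tensor:
    global running_mean, running_var
    if training:
        mean = x.mean(dim=0)
        var = x.var(dim=0, unbiased=False)
        # Keep exponential moving averages of the batch statistics during training
        running_mean = (1 - momentum) * running_mean + momentum * mean
        running_var = (1 - momentum) * running_var + momentum * var
    else:
        # At inference, use the accumulated estimates instead of batch statistics
        mean, var = running_mean, running_var
    return (x - mean) / torch.sqrt(var + eps)
</code></pre>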
<p>Here’s <a href="mnist.html">the training code</a> and a notebook for training
a CNN classifier that uses batch normalization for the MNIST dataset.</p>
<p><a href="https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/normalization/batch_norm/mnist.ipynb"><img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg" /></a>