
Commit 93eeb83

Merge pull request tensorflow#2140 from synandi:patch-14
PiperOrigin-RevId: 484066984
2 parents b37ef65 + 2e3d17e


2 files changed: +6, -16 lines


site/en/guide/mixed_precision.ipynb

Lines changed: 3 additions & 3 deletions
@@ -411,7 +411,7 @@
 "id": "0Sm8FJHegVRN"
 },
 "source": [
-"This example cast the input data from int8 to float32. You don't cast to float16 since the division by 255 is on the CPU, which runs float16 operations slower than float32 operations. In this case, the performance difference in negligible, but in general you should run input processing math in float32 if it runs on the CPU. The first layer of the model will cast the inputs to float16, as each layer casts floating-point inputs to its compute dtype.\n",
+"This example casts the input data from int8 to float32. You don't cast to float16 since the division by 255 is on the CPU, which runs float16 operations slower than float32 operations. In this case, the performance difference is negligible, but in general you should run input processing math in float32 if it runs on the CPU. The first layer of the model will cast the inputs to float16, as each layer casts floating-point inputs to its compute dtype.\n",
 "\n",
 "The initial weights of the model are retrieved. This will allow training from scratch again by loading the weights."
 ]
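
For context, the casting pattern this hunk describes looks roughly like the following, assuming the guide's MNIST setup and a `mixed_float16` global policy (a sketch, not code from this commit):

    import tensorflow as tf

    tf.keras.mixed_precision.set_global_policy('mixed_float16')

    # Input-processing math stays in float32: dividing by 255 in float16 on
    # the CPU would be slower, not faster.
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train.astype('float32') / 255  # cast integer pixels to float32

    inputs = tf.keras.Input(shape=(28, 28))
    x = tf.keras.layers.Flatten()(inputs)
    x = tf.keras.layers.Dense(64, activation='relu')(x)      # casts inputs to float16
    outputs = tf.keras.layers.Dense(10, dtype='float32')(x)  # keep outputs float32
    model = tf.keras.Model(inputs, outputs)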
@@ -465,7 +465,7 @@
 " \n",
 "If you are running this guide in Colab, you can compare the performance of mixed precision with float32. To do so, change the policy from `mixed_float16` to `float32` in the \"Setting the dtype policy\" section, then rerun all the cells up to this point. On GPUs with compute capability 7.X, you should see the time per step significantly increase, indicating mixed precision sped up the model. Make sure to change the policy back to `mixed_float16` and rerun the cells before continuing with the guide.\n",
 "\n",
-"On GPUs with compute capability of at least 8.0 (Ampere GPUs and above), you likely will see no performance improvement in the toy model in this guide when using mixed precision compared to float32. This is due to the use of [TensorFloat-32](https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_tensor_float_32_execution), which automatically uses lower precision math in certain float32 ops such as `tf.linalg.matmul`. TensorFloat-32 gives some of the performance advantages of mixed precision when using float32. However, in real-world models, you will still typically see significantly performance improvements from mixed precision due to memory bandwidth savings and ops which TensorFloat-32 does not support.\n",
+"On GPUs with compute capability of at least 8.0 (Ampere GPUs and above), you likely will see no performance improvement in the toy model in this guide when using mixed precision compared to float32. This is due to the use of [TensorFloat-32](https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_tensor_float_32_execution), which automatically uses lower precision math in certain float32 ops such as `tf.linalg.matmul`. TensorFloat-32 gives some of the performance advantages of mixed precision when using float32. However, in real-world models, you will still typically experience significant performance improvements from mixed precision due to memory bandwidth savings and ops which TensorFloat-32 does not support.\n",
 "\n",
 "If running mixed precision on a TPU, you will not see as much of a performance gain compared to running mixed precision on GPUs, especially pre-Ampere GPUs. This is because TPUs do certain ops in bfloat16 under the hood even with the default dtype policy of float32. This is similar to how Ampere GPUs use TensorFloat-32 by default. Compared to Ampere GPUs, TPUs typically see less performance gains with mixed precision on real-world models.\n",
 "\n",
@@ -612,7 +612,7 @@
 "id": "FVy5gnBqTE9z"
 },
 "source": [
-"If you want, it is possible choose an explicit loss scale or otherwise customize the loss scaling behavior, but it is highly recommended to keep the default loss scaling behavior, as it has been found to work well on all known models. See the `tf.keras.mixed_precision.LossScaleOptimizer` documention if you want to customize the loss scaling behavior."
+"If you want, it is possible choose an explicit loss scale or otherwise customize the loss scaling behavior, but it is highly recommended to keep the default loss scaling behavior, as it has been found to work well on all known models. See the `tf.keras.mixed_precision.LossScaleOptimizer` documentation if you want to customize the loss scaling behavior."
 ]
 },
 {
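
For reference, the default versus customized loss scaling contrasted in the last hunk looks roughly like this (the explicit scale value is illustrative):

    import tensorflow as tf

    # Default: dynamic loss scaling, which the guide recommends keeping.
    opt = tf.keras.mixed_precision.LossScaleOptimizer(tf.keras.optimizers.SGD())

    # Customized: a fixed, explicit loss scale instead of the dynamic default.
    fixed_opt = tf.keras.mixed_precision.LossScaleOptimizer(
        tf.keras.optimizers.SGD(), dynamic=False, initial_scale=1024)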

site/en/tutorials/text/image_captioning.ipynb

Lines changed: 3 additions & 13 deletions
@@ -485,9 +485,7 @@
 "source": [
 "### Image feature extractor\n",
 "\n",
-"You will use an image model (pretrained on imagenet) to extract the features from each image. The model was trained as an image classifier, but setting `include_top=False` returns the model without the final classification layer, so you can use the last layer of feature-maps: \n",
-"\n",
-"\n"
+"You will use an image model (pretrained on imagenet) to extract the features from each image. The model was trained as an image classifier, but setting `include_top=False` returns the model without the final classification layer, so you can use the last layer of feature-maps: \n"
 ]
 },
 {
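
A minimal sketch of the `include_top=False` pattern this hunk describes; the MobileNetV3Small backbone and input shape are assumptions here, not taken from the tutorial:

    import tensorflow as tf

    IMAGE_SHAPE = (224, 224, 3)
    feature_extractor = tf.keras.applications.MobileNetV3Small(
        input_shape=IMAGE_SHAPE,
        include_top=False)  # drop the classifier head, keep the feature maps

    images = tf.zeros([1, *IMAGE_SHAPE])
    features = feature_extractor(images)
    print(features.shape)  # a spatial grid of feature vectors, e.g. (1, 7, 7, 576)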
@@ -1052,8 +1050,6 @@
 "id": "qiRXWwIKNybB"
 },
 "source": [
-"\n",
-"\n",
 "The model will be implemented in three main parts: \n",
 "\n",
 "1. Input - The token embedding and positional encoding (`SeqEmbedding`).\n",
@@ -1163,8 +1159,7 @@
 "    attn = self.mha(query=x, value=x,\n",
 "                    use_causal_mask=True)\n",
 "    x = self.add([x, attn])\n",
-"    return self.layernorm(x)\n",
-"\n"
+"    return self.layernorm(x)\n"
 ]
 },
 {
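
The hunk above is the tail of a causal self-attention block; a self-contained sketch of such a layer, with illustrative constructor arguments:

    import tensorflow as tf

    class CausalSelfAttention(tf.keras.layers.Layer):
      def __init__(self, **kwargs):
        super().__init__()
        self.mha = tf.keras.layers.MultiHeadAttention(**kwargs)
        self.add = tf.keras.layers.Add()
        self.layernorm = tf.keras.layers.LayerNormalization()

      def call(self, x):
        # use_causal_mask stops each position attending to later positions.
        attn = self.mha(query=x, value=x, use_causal_mask=True)
        x = self.add([x, attn])  # residual connection
        return self.layernorm(x)

    layer = CausalSelfAttention(num_heads=2, key_dim=256)  # example arguments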
@@ -1304,8 +1299,6 @@
 "id": "6WQD87efena5"
 },
 "source": [
-"\n",
-"\n",
 "But there are a few other features you can add to make this work a little better:\n",
 "\n",
 "1. **Handle bad tokens**: The model will be generating text. It should\n",
@@ -1483,8 +1476,7 @@
 "1. Flatten the extracted image features, so they can be input to the decoder layers.\n",
 "2. Look up the token embeddings.\n",
 "3. Run the stack of `DecoderLayer`s, on the image features and text embeddings.\n",
-"4. Run the output layer to predict the next token at each position.\n",
-"\n"
+"4. Run the output layer to predict the next token at each position.\n"
 ]
 },
 {
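
A sketch of how the four steps listed in this hunk could fit together in the model's `call` method; the class and attribute names are assumptions, not the tutorial's exact code:

    import tensorflow as tf

    class Captioner(tf.keras.Model):
      def __init__(self, seq_embedding, decoder_layers, output_layer):
        super().__init__()
        self.seq_embedding = seq_embedding
        self.decoder_layers = decoder_layers
        self.output_layer = output_layer

      def call(self, inputs):
        image, txt = inputs
        # 1. Flatten each image's (h, w, channels) feature grid to a sequence.
        batch = tf.shape(image)[0]
        image = tf.reshape(image, [batch, -1, image.shape[-1]])
        # 2. Look up the token embeddings.
        txt = self.seq_embedding(txt)
        # 3. Run the decoder stack on the image features and text embeddings.
        for decoder_layer in self.decoder_layers:
          txt = decoder_layer(inputs=(image, txt))
        # 4. Run the output layer to predict the next token at each position.
        return self.output_layer(txt)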
@@ -2143,8 +2135,6 @@
 "colab": {
 "collapsed_sections": [],
 "name": "image_captioning.ipynb",
-"private_outputs": true,
-"provenance": [],
 "toc_visible": true
 },
 "kernelspec": {
