Commit f577e5a

hyeygit authored and tensorflower-gardener committed
Add detailed explanation, examples and caveats of convert_image_dtype function.
`convert_image_dtype` is one of the most frequently used functions in the `tf.image` API, but its API doc lacks explanation and examples, which regularly confuses users. This CL addresses that gap.

PiperOrigin-RevId: 342436750
Change-Id: Ia42d69a280e6bf8194f66cbc0a8e5976bac93187
1 parent 1d1180a commit f577e5a

1 file changed, +95 -13 lines changed

tensorflow/python/ops/image_ops_impl.py

Lines changed: 95 additions & 13 deletions
@@ -2200,6 +2200,10 @@ def adjust_gamma(image, gamma=1, gain=1):
 def convert_image_dtype(image, dtype, saturate=False, name=None):
   """Convert `image` to `dtype`, scaling its values if needed.
 
+  The operation supports data types (for `image` and `dtype`) of
+  `uint8`, `uint16`, `uint32`, `uint64`, `int8`, `int16`, `int32`, `int64`,
+  `float16`, `float32`, `float64`, `bfloat16`.
+
   Images that are represented using floating point values are expected to have
   values in the range [0,1). Image data stored in integer data types are
   expected to have values in the range `[0,MAX]`, where `MAX` is the largest
@@ -2208,6 +2212,97 @@ def convert_image_dtype(image, dtype, saturate=False, name=None):
   This op converts between data types, scaling the values appropriately before
   casting.
 
+  Usage Example:
+
+  >>> x = [[[1, 2, 3], [4, 5, 6]],
+  ...      [[7, 8, 9], [10, 11, 12]]]
+  >>> x_int8 = tf.convert_to_tensor(x, dtype=tf.int8)
+  >>> tf.image.convert_image_dtype(x_int8, dtype=tf.float16, saturate=False)
+  <tf.Tensor: shape=(2, 2, 3), dtype=float16, numpy=
+  array([[[0.00787, 0.01575, 0.02362],
+          [0.0315 , 0.03937, 0.04724]],
+         [[0.0551 , 0.063  , 0.07086],
+          [0.07874, 0.0866 , 0.0945 ]]], dtype=float16)>
+
+  Converting integer types to floating point types returns normalized floating
+  point values in the range [0, 1); the values are normalized by the `MAX` value
+  of the input dtype. Consider the following two examples:
+
+  >>> a = [[[1], [2]], [[3], [4]]]
+  >>> a_int8 = tf.convert_to_tensor(a, dtype=tf.int8)
+  >>> tf.image.convert_image_dtype(a_int8, dtype=tf.float32)
+  <tf.Tensor: shape=(2, 2, 1), dtype=float32, numpy=
+  array([[[0.00787402],
+          [0.01574803]],
+         [[0.02362205],
+          [0.03149606]]], dtype=float32)>
+
+  >>> a_int32 = tf.convert_to_tensor(a, dtype=tf.int32)
+  >>> tf.image.convert_image_dtype(a_int32, dtype=tf.float32)
+  <tf.Tensor: shape=(2, 2, 1), dtype=float32, numpy=
+  array([[[4.6566129e-10],
+          [9.3132257e-10]],
+         [[1.3969839e-09],
+          [1.8626451e-09]]], dtype=float32)>
+
+  Despite having identical values of `a` and output dtype of `float32`, the
+  outputs differ due to the different input dtypes (`int8` vs. `int32`). This
+  is, again, because the values are normalized by the `MAX` value of the input
+  dtype.
+
+  Note that converting floating point values to integer type may lose precision.
+  In the example below, an image tensor `b` of dtype `float32` is converted to
+  `int8` and back to `float32`. The final output, however, is different from
+  the original input `b` due to precision loss.
+
+  >>> b = [[[0.12], [0.34]], [[0.56], [0.78]]]
+  >>> b_float32 = tf.convert_to_tensor(b, dtype=tf.float32)
+  >>> b_int8 = tf.image.convert_image_dtype(b_float32, dtype=tf.int8)
+  >>> tf.image.convert_image_dtype(b_int8, dtype=tf.float32)
+  <tf.Tensor: shape=(2, 2, 1), dtype=float32, numpy=
+  array([[[0.11811024],
+          [0.33858266]],
+         [[0.5590551 ],
+          [0.77952754]]], dtype=float32)>
+
+  Scaling up from an integer type (input dtype) to another integer type (output
+  dtype) will not map input dtype's `MAX` to output dtype's `MAX` but converting
+  back and forth should result in no change. For example, as shown below, the
+  `MAX` value of int8 (=127) is not mapped to the `MAX` value of int16 (=32,767)
+  but, when scaled back, we get the same, original values of `c`.
+
+  >>> c = [[[1], [2]], [[127], [127]]]
+  >>> c_int8 = tf.convert_to_tensor(c, dtype=tf.int8)
+  >>> c_int16 = tf.image.convert_image_dtype(c_int8, dtype=tf.int16)
+  >>> print(c_int16)
+  tf.Tensor(
+  [[[  256]
+    [  512]]
+   [[32512]
+    [32512]]], shape=(2, 2, 1), dtype=int16)
+  >>> c_int8_back = tf.image.convert_image_dtype(c_int16, dtype=tf.int8)
+  >>> print(c_int8_back)
+  tf.Tensor(
+  [[[  1]
+    [  2]]
+   [[127]
+    [127]]], shape=(2, 2, 1), dtype=int8)
+
+  Scaling down from an integer type to another integer type can be a lossy
+  conversion. Notice in the example below that converting `int16` to `uint8` and
+  back to `int16` has lost precision.
+
+  >>> d = [[[1000], [2000]], [[3000], [4000]]]
+  >>> d_int16 = tf.convert_to_tensor(d, dtype=tf.int16)
+  >>> d_uint8 = tf.image.convert_image_dtype(d_int16, dtype=tf.uint8)
+  >>> d_int16_back = tf.image.convert_image_dtype(d_uint8, dtype=tf.int16)
+  >>> print(d_int16_back)
+  tf.Tensor(
+  [[[ 896]
+    [1920]]
+   [[2944]
+    [3968]]], shape=(2, 2, 1), dtype=int16)
+
   Note that converting from floating point inputs to integer types may lead to
   over/underflow problems. Set saturate to `True` to avoid such problem in
   problematic conversions. If enabled, saturation will clip the output into the
@@ -2216,19 +2311,6 @@ def convert_image_dtype(image, dtype, saturate=False, name=None):
   type, and when casting from a signed to an unsigned type; `saturate` has no
   effect on casts between floats, or on casts that increase the type's range).
 
-  Usage Example:
-
-  >>> x = [[[1.0, 2.0, 3.0],
-  ...       [4.0, 5.0, 6.0]],
-  ...      [[7.0, 8.0, 9.0],
-  ...       [10.0, 11.0, 12.0]]]
-  >>> tf.image.convert_image_dtype(x, dtype=tf.float16, saturate=False)
-  <tf.Tensor: shape=(2, 2, 3), dtype=float16, numpy=
-  array([[[ 1.,  2.,  3.],
-          [ 4.,  5.,  6.]],
-         [[ 7.,  8.,  9.],
-          [10., 11., 12.]]], dtype=float16)>
-
   Args:
     image: An image.
     dtype: A `DType` to convert `image` to.
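
The scaling rules described in the new docstring can be checked by hand. What follows is a minimal plain-Python sketch that reproduces the numbers in the docstring examples. It is a model of the documented behavior, not the TensorFlow implementation. The helper names (`int_to_float`, `int_to_int`, `float_to_int`), the `INT_RANGE` table, and the `MAX + 0.5` scale factor in `float_to_int` are assumptions made for illustration, and the model only covers non-negative inputs.

```python
# Plain-Python model of the scaling arithmetic described in the docstring.
# Hypothetical helpers for illustration only; NOT the TensorFlow implementation.

# (MIN, MAX) of a few integer dtypes, keyed by dtype name.
INT_RANGE = {
    "uint8": (0, 255),
    "int8": (-128, 127),
    "int16": (-32768, 32767),
    "int32": (-2147483648, 2147483647),
}


def int_to_float(value, src):
    """Integer to float: normalize by the MAX of the input dtype."""
    return value / INT_RANGE[src][1]


def int_to_int(value, src, dst):
    """Integer to integer: rescale by the whole-number ratio of (MAX + 1)."""
    src_span = INT_RANGE[src][1] + 1
    dst_span = INT_RANGE[dst][1] + 1
    if dst_span >= src_span:
        # Scaling up multiplies by a whole number, so it is reversible.
        return value * (dst_span // src_span)
    # Scaling down floor-divides, which discards the low-order part.
    return value // (src_span // dst_span)


def float_to_int(value, dst, saturate=False):
    """Float to integer: scale, truncate, and optionally clip to the dtype range."""
    lo, hi = INT_RANGE[dst]
    # Assumption: scale by MAX + 0.5 before truncating toward zero.
    scaled = int(value * (hi + 0.5))
    if saturate:
        # Clip into [MIN, MAX]; a real fixed-width cast would overflow instead.
        scaled = max(lo, min(hi, scaled))
    return scaled


# Same input value, different input dtype, different normalized result.
print(int_to_float(1, "int8"))   # about 0.00787 (1 / 127)
print(int_to_float(1, "int32"))  # about 4.66e-10 (1 / 2147483647)

# int8 MAX does not map to int16 MAX, but the round trip is exact.
up = int_to_int(127, "int8", "int16")
print(up)                                # 32512
print(int_to_int(up, "int16", "int8"))   # 127

# int16 -> uint8 -> int16 loses the low-order part of each value.
down = int_to_int(1000, "int16", "uint8")
print(int_to_int(down, "uint8", "int16"))  # 896

# float -> int8 -> float does not return the original 0.12.
q = float_to_int(0.12, "int8")
print(q, int_to_float(q, "int8"))  # 15, then about 0.1181

# An out-of-range float is clipped to MAX when saturate=True.
print(float_to_int(1.5, "int8", saturate=True))  # 127
```

In this model the int8 to int16 factor is 32768 / 128 = 256, which is why 127 becomes 32512 rather than 32767, and the int16 to uint8 divisor is 128, which is why 1000 comes back as 896.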
