Commit cf4e554

diegolascasas authored and committed
[Automated] Update package documentation.
PiperOrigin-RevId: 285355664
1 parent 4026291 commit cf4e554

File tree: 1 file changed (+10 −6 lines)

docs/trfl.md

Lines changed: 10 additions & 6 deletions
```diff
@@ -86,7 +86,7 @@ Extract as much static shape information from a tensor as possible.
 statically-known number of dimensions.
 
 
-### [`categorical_dist_double_qlearning(atoms_tm1, logits_q_tm1, a_tm1, r_t, pcont_t, atoms_t, logits_q_t, q_t_selector, name='CategoricalDistDoubleQLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=149)<!-- RULE: categorical_dist_double_qlearning .code-reference -->
+### [`categorical_dist_double_qlearning(atoms_tm1, logits_q_tm1, a_tm1, r_t, pcont_t, atoms_t, logits_q_t, q_t_selector, name='CategoricalDistDoubleQLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=148)<!-- RULE: categorical_dist_double_qlearning .code-reference -->
 
 Implements Distributional Double Q-learning as TensorFlow ops.
 
```

```diff
@@ -132,7 +132,7 @@ Hessel, Modayil, van Hasselt, Schaul et al.
 * `ValueError`: If the tensors do not have the correct rank or compatibility.
 
 
-### [`categorical_dist_qlearning(atoms_tm1, logits_q_tm1, a_tm1, r_t, pcont_t, atoms_t, logits_q_t, name='CategoricalDistQLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=74)<!-- RULE: categorical_dist_qlearning .code-reference -->
+### [`categorical_dist_qlearning(atoms_tm1, logits_q_tm1, a_tm1, r_t, pcont_t, atoms_t, logits_q_t, name='CategoricalDistQLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=73)<!-- RULE: categorical_dist_qlearning .code-reference -->
 
 Implements Distributional Q-learning as TensorFlow ops.
 
```
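These categorical distributional ops are built around a projection step (the C51 projection of https://arxiv.org/abs/1707.06887): the shifted-and-scaled target support `r + pcont * atoms` is projected back onto the fixed atom grid. The following is a hedged NumPy sketch of that idea only, not trfl's actual implementation; the function name `project_categorical` and its exact argument shapes are illustrative assumptions.

```python
import numpy as np

def project_categorical(atoms, probs, r, pcont):
    """Project the distribution `probs` on support `r + pcont * atoms`
    back onto the fixed, uniformly spaced support `atoms`.

    Illustrative sketch of the C51 projection; NOT trfl's code.
    atoms: [N] fixed support; probs: [N] target probabilities.
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]                     # assumes uniform spacing
    tz = np.clip(r + pcont * atoms, v_min, v_max)   # shifted/scaled support
    b = (tz - v_min) / delta                        # fractional atom index
    lo = np.floor(b).astype(int)
    hi = np.ceil(b).astype(int)
    out = np.zeros_like(probs)
    # Split each probability mass between the two nearest fixed atoms.
    np.add.at(out, lo, probs * (hi - b))
    np.add.at(out, hi, probs * (b - lo))
    # When b lands exactly on an atom, lo == hi and both weights above are
    # zero; restore the full mass to that atom.
    np.add.at(out, lo, probs * (lo == hi))
    return out
```

For example, projecting a point mass shifted by half an atom spacing splits it evenly between the two neighbouring atoms, and total probability mass is always preserved.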

```diff
@@ -172,7 +172,7 @@ Dabney and Munos. (https://arxiv.org/abs/1707.06887).
 * `ValueError`: If the tensors do not have the correct rank or compatibility.
 
 
-### [`categorical_dist_td_learning(atoms_tm1, logits_v_tm1, r_t, pcont_t, atoms_t, logits_v_t, name='CategoricalDistTDLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=231)<!-- RULE: categorical_dist_td_learning .code-reference -->
+### [`categorical_dist_td_learning(atoms_tm1, logits_v_tm1, r_t, pcont_t, atoms_t, logits_v_t, name='CategoricalDistTDLearning')`](https://github.com/deepmind/trfl/blob/master/trfl/dist_value_ops.py?l=230)<!-- RULE: categorical_dist_td_learning .code-reference -->
 
 Implements Distributional TD-learning as TensorFlow ops.
 
```

```diff
@@ -617,7 +617,7 @@ The update rule is:
 An op that periodically updates `target_variables` with `source_variables`.
 
 
-### [`periodically(body, period, name='periodically')`](https://github.com/deepmind/trfl/blob/master/trfl/periodic_ops.py?l=34)<!-- RULE: periodically .code-reference -->
+### [`periodically(body, period, counter=None, name='periodically')`](https://github.com/deepmind/trfl/blob/master/trfl/periodic_ops.py?l=34)<!-- RULE: periodically .code-reference -->
 
 Periodically performs a tensorflow op.
 
```

```diff
@@ -637,6 +637,10 @@ If `period` is 0 or `None`, it would not perform any op and would return a
   an internal counter is divisible by the period. The op must have no
   output (for example, a tf.group()).
 * `period`: inverse frequency with which to perform the op.
+* `counter`: an optional tensorflow variable to use as a counter relative to the
+  period. It will be incremented per call and reset to 1 in every update. In
+  order to ensure that `body` is run in the first count, initialize the
+  counter at a value bigger than `period`.
 * `name`: name of the variable_scope.
 
 ##### Raises:
```
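The counter semantics documented for `periodically` can be illustrated with a small pure-Python sketch. This is a hypothetical stand-in for the TensorFlow op, written only to show the documented behaviour: the counter is incremented per call, `body` runs when the counter exceeds `period`, and the counter is reset to 1 on every update.

```python
class Periodically:
    """Pure-Python sketch of the documented `periodically` semantics.

    Illustration only; the real trfl op builds TensorFlow graph ops.
    """

    def __init__(self, body, period, counter=None):
        self.body = body
        self.period = period
        # Default the counter to `period` so that, after the first
        # increment, it already exceeds `period` and `body` runs on the
        # first call (cf. "initialize the counter at a value bigger than
        # `period`" in the docstring).
        self.counter = counter if counter is not None else period

    def __call__(self):
        self.counter += 1          # incremented per call
        if self.counter > self.period:
            self.body()            # periodic update
            self.counter = 1       # reset to 1 in every update
```

With `period=3`, for example, `body` runs on the first call and then on every third call thereafter.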
```diff
@@ -685,7 +689,7 @@ by Bellemare, Ostrovski, Guez et al. (https://arxiv.org/abs/1512.04860).
 * `td_error`: batch of temporal difference errors, shape `[B]`.
 
 
-### [`pixel_control_loss(observations, actions, action_values, cell_size, discount_factor, scale, crop_height_dim=(None, None), crop_width_dim=(None, None))`](https://github.com/deepmind/trfl/blob/master/trfl/pixel_control_ops.py?l=92)<!-- RULE: pixel_control_loss .code-reference -->
+### [`pixel_control_loss(observations, actions, action_values, cell_size, discount_factor, scale, crop_height_dim=(None, None), crop_width_dim=(None, None))`](https://github.com/deepmind/trfl/blob/master/trfl/pixel_control_ops.py?l=95)<!-- RULE: pixel_control_loss .code-reference -->
 
 Calculate n-step Q-learning loss for pixel control auxiliary task.
 
```

```diff
@@ -735,7 +739,7 @@ Mnih, Czarnecki et al. (https://arxiv.org/abs/1611.05397).
 the pseudo-rewards derived from the observations.
 
 
-### [`pixel_control_rewards(observations, cell_size)`](https://github.com/deepmind/trfl/blob/master/trfl/pixel_control_ops.py?l=42)<!-- RULE: pixel_control_rewards .code-reference -->
+### [`pixel_control_rewards(observations, cell_size)`](https://github.com/deepmind/trfl/blob/master/trfl/pixel_control_ops.py?l=41)<!-- RULE: pixel_control_rewards .code-reference -->
 
 Calculates pixel control task rewards from observation sequence.
 
```
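The pixel-control pseudo-reward idea (from the UNREAL paper cited above) is the average absolute change between consecutive observations, pooled over non-overlapping `cell_size` regions. Below is a hedged NumPy sketch of that computation under assumed shapes, not trfl's actual TensorFlow op (which may differ in channel handling and cropping).

```python
import numpy as np

def pixel_control_rewards(observations, cell_size):
    """Per-cell mean absolute change between consecutive frames.

    Illustrative re-implementation of the documented idea; NOT trfl's op.
    observations: float array [T+1, H, W, C], with H and W divisible by
      `cell_size`. Returns rewards of shape [T, H/cell_size, W/cell_size].
    """
    # Absolute temporal difference between consecutive frames: [T, H, W, C].
    abs_diff = np.abs(np.diff(observations, axis=0))
    t, h, w, c = abs_diff.shape
    # Group pixels into cell_size x cell_size cells, then average over each
    # cell and over channels.
    pooled = abs_diff.reshape(
        t, h // cell_size, cell_size, w // cell_size, cell_size, c)
    return pooled.mean(axis=(2, 4, 5))
```

For instance, with a single pixel changing between frames, only the cell containing that pixel receives a non-zero pseudo-reward, scaled down by the cell area.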
