Skip to content

Conversation

alexander-soare
Copy link
Contributor

@alexander-soare alexander-soare commented May 5, 2024

What does this PR do?

As the title suggests. Also updates test artifacts for diffusion policy backwards compatibility check.

Side change:

  • Adds documentation for workflow required to update the test artifacts for the policy b/c checks.

How was it tested?

I have a pretrained diffusion policy that reaches SOTA eval metrics. I evaluated it for 500 episodes using the EMA vs non-EMA weights.

{
  "ema": {
    "avg_sum_reward": 104.15279900893682,
    "avg_max_reward": 0.9495501381835949,
    "pc_success": 64.4,
    "eval_s": 570.6887876987457,
    "eval_ep_s": 1.1413775758743285
  }
  "non_ema": {
    "avg_sum_reward": 108.0149595484424,
    "avg_max_reward": 0.9588220574225956,
    "pc_success": 63.800000000000004,
    "eval_s": 555.2821447849274,
    "eval_ep_s": 1.1105642900466919
  },
}

The mean "avg_max_reward" is higher for non-EMA (without considering error-bars). For success rate we can take a uniform prior and calculate the posterior beta distribution to get the mean, upper confidence bound (mean + 34.1%) and lower confidence bound (mean - 34.1%)

import scipy.stats

num_episodes = 500
success_rate = 0.644  # or 0.638 for non-ema

alpha = num_episodes * success_rate + 1
beta = num_episodes * (1 - success_rate) + 1

confidence_interval = 0.682
lower_percentile = (1 - confidence_interval) / 2
upper_percentile = 1 - lower_percentile

print("Mean:", scipy.stats.beta.mean(alpha, beta))
print("Lower:", scipy.stats.beta.ppf(lower_percentile, alpha, beta))
print("Upper:", scipy.stats.beta.ppf(upper
[ema.json](https://github.com/huggingface/lerobot/files/15212103/ema.json)
[no_ema.json](https://github.com/huggingface/lerobot/files/15212104/no_ema.json)
_percentile, alpha, beta))

EMA results:
Mean: 0.6434262948207171
Lower: 0.6220813179507085
Upper: 0.6647718949623826

Non-EMA results:
Mean: 0.6374501992031872
Lower: 0.6160270432185413
Upper: 0.6588739530055447

The means are not significantly different.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR. Try to avoid tagging more than 3 people.

@alexander-soare alexander-soare added the policies Items related to robot policies label May 5, 2024
@alexander-soare alexander-soare requested review from Cadene and aliberts May 5, 2024 07:51
Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genius! so happy we removed EMA!

Comment on lines 93 to 98
# (
# "pusht",
# "diffusion",
# ["policy.n_action_steps=8", "policy.num_inference_steps=10", "policy.down_dims=[128, 256, 512]"],
# ),
# ("aloha", "act", ["policy.n_action_steps=10"]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, and updated instructions.

["policy.n_action_steps=8", "policy.num_inference_steps=10", "policy.down_dims=[128, 256, 512]"],
),
("aloha", "act", ["policy.n_action_steps=10"]),
# ("aloha", "act", ["policy.n_action_steps=10"]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be uncommented actually. I reverted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be uncommented actually. I reverted.

Was about to say the same, nice!

Comment on lines -112 to -114
Note: this method uses the ema model weights if self.training == False, otherwise the non-ema model
weights.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a note about EMA, saying that we tested with and without, and got as good or better results without EMA, so we decided to remove it for sake of simplicity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay but I added them in the yaml config as this detail is more relevant to the outer scope. Ptal

Copy link
Collaborator

@aliberts aliberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

["policy.n_action_steps=8", "policy.num_inference_steps=10", "policy.down_dims=[128, 256, 512]"],
),
("aloha", "act", ["policy.n_action_steps=10"]),
# ("aloha", "act", ["policy.n_action_steps=10"]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be uncommented actually. I reverted.

Was about to say the same, nice!

Comment on lines 252 to 262
"""
NOTE: If this test does not pass, and you have intentionally changed something in the policy:
1. Inspect the differences in policy outputs and make sure you can account for them. Your PR should
include a report on what changed and how that affected the outputs.
2. Go to the `if __name__ == "__main__"` block of `test/scripts/save_policy_to_safetensors.py` and
comment in the policies you want to update the test artifacts for.
3. Run `python test/scripts/save_policy_to_safetensors.py`. The test artifact should be updated.
4. Check that this test now passes.
5. Remember to restore `test/scripts/save_policy_to_safetensors.py` to its original state.
6. Remember to stage and commit the resulting changes to `tests/data`.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have done that, that's really helpful, thanks!

@alexander-soare alexander-soare merged commit f3bba02 into huggingface:main May 5, 2024
@alexander-soare alexander-soare deleted the remove_ema_from_dp branch May 5, 2024 10:26
menhguin pushed a commit to menhguin/lerobot that referenced this pull request Feb 9, 2025
Kalcy-U referenced this pull request in Kalcy-U/lerobot May 13, 2025
ZoreAnuj pushed a commit to luckyrobots/lerobot that referenced this pull request Jul 29, 2025
Ricci084 pushed a commit to JeffWang987/lerobot that referenced this pull request Sep 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
policies Items related to robot policies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants