@TomeHirata TomeHirata commented Mar 27, 2025

Some users reported that their programs got stuck on rate limit errors. This PR configures exponential backoff for retries of the LiteLLM completion call, because LiteLLM uses constant backoff even for rate limit errors (ref), which is ineffective.

One tradeoff is that, after this change, exponential backoff will also apply to other types of exceptions (e.g. internal server errors). For async completion, LiteLLM switches to exponential backoff only on RateLimitError (ref), but this logic does not exist for sync completion. An alternative is therefore to file a PR against LiteLLM so that sync completion also uses exponential backoff only for RateLimitError.
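
For illustration, here is a rough sketch of the selective policy described above: exponential backoff on rate limit errors, constant backoff on everything else. This is not LiteLLM's or DSPy's actual code. The `RateLimitError` class, `backoff_delays`, `call_with_retry`, and all parameter names and defaults are hypothetical stand-ins chosen for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def backoff_delays(num_retries, base=1.0, factor=2.0, max_delay=60.0):
    """Wait (in seconds) before each retry: base * factor**attempt, capped at max_delay."""
    return [min(max_delay, base * factor**attempt) for attempt in range(num_retries)]


def call_with_retry(fn, num_retries=3, base=1.0, constant_delay=1.0,
                    jitter=0.0, sleep=time.sleep):
    """Call fn() and retry on failure.

    Rate limit errors wait exponentially longer on each attempt;
    any other exception waits a constant delay (assumed policy).
    """
    delays = backoff_delays(num_retries, base=base)
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == num_retries:
                raise  # retries exhausted, surface the last error
            if isinstance(exc, RateLimitError):
                wait = delays[attempt]
            else:
                wait = constant_delay
            # Optional random jitter spreads out retries from concurrent callers.
            sleep(wait + random.uniform(0, jitter))
```

With the defaults above, a call that keeps hitting the rate limit waits 1, 2, then 4 seconds between attempts, while a constant policy would wait the same interval each time and keep hitting the limit at the same pace.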

@TomeHirata TomeHirata force-pushed the feat/exponential_backoff_retry branch from 04624d5 to 7858751 Compare March 31, 2025 07:50
@TomeHirata TomeHirata force-pushed the feat/exponential_backoff_retry branch from 7858751 to 70ac8ae Compare March 31, 2025 07:56
@okhat okhat merged commit 7a877d1 into stanfordnlp:main Mar 31, 2025
4 checks passed
@TomeHirata TomeHirata mentioned this pull request Apr 9, 2025