
Conversation

lramos15 (Member)

@Copilot Copilot AI review requested due to automatic review settings October 17, 2025 17:47
@lramos15 lramos15 enabled auto-merge October 17, 2025 17:47
@lramos15 lramos15 self-assigned this Oct 17, 2025
@Copilot Copilot AI (Contributor) left a comment

Pull Request Overview

Implements a multi-layer (active/standby/reserve) automode token cache to reduce latency and improve token reuse across conversations.

  • Introduces ConversationCacheEntry with active/standby promotion logic.
  • Adds a global reserve token and background refresh of standby tokens.
  • Replaces the previous single-token cache and request stickiness logic.
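
To make the layering concrete, here is a minimal TypeScript sketch of how an active/standby/reserve token cache along these lines could be structured. Everything below is illustrative: ConversationCacheEntry is the only name taken from the overview above, and types such as ConversationToken, TokenFetcher, and AutomodeTokenCache, along with the fetchToken callback, are assumptions rather than the PR's actual code.

interface ConversationToken {
	value: string;
	expiresAt: number;
}

type TokenFetcher = () => Promise<ConversationToken>;

// Per-conversation entry: one active token plus a standby kept warm in the background.
class ConversationCacheEntry {
	private standby: Promise<ConversationToken>;

	constructor(
		private readonly fetchToken: TokenFetcher,
		private active?: ConversationToken,
	) {
		// Start warming a standby as soon as the conversation exists.
		this.standby = this.fetchToken();
	}

	// Return the active token; when it is missing or expired, promote the
	// standby and immediately start fetching a replacement standby.
	async getToken(): Promise<ConversationToken> {
		if (this.active && this.active.expiresAt > Date.now()) {
			return this.active;
		}
		this.active = await this.standby;
		this.standby = this.fetchToken();
		if (this.active.expiresAt <= Date.now()) {
			// The standby itself went stale (e.g. a long-idle conversation): wait for the fresh fetch.
			this.active = await this.standby;
			this.standby = this.fetchToken();
		}
		return this.active;
	}
}

// Cache keyed by conversation, with one shared reserve token as a fallback.
class AutomodeTokenCache {
	private readonly entries = new Map<string, ConversationCacheEntry>();
	private reserve: Promise<ConversationToken>;

	constructor(private readonly fetchToken: TokenFetcher) {
		// Warm one global reserve token so a brand-new conversation can start
		// without waiting on a round trip.
		this.reserve = this.fetchToken();
	}

	async getToken(conversationId: string): Promise<ConversationToken> {
		let entry = this.entries.get(conversationId);
		if (!entry) {
			// First request of a new conversation: seed it with the reserve token
			// (if still valid) and replenish the reserve in the background.
			const seed = await this.reserve;
			this.reserve = this.fetchToken();
			entry = new ConversationCacheEntry(
				this.fetchToken,
				seed.expiresAt > Date.now() ? seed : undefined,
			);
			this.entries.set(conversationId, entry);
		}
		return entry.getToken();
	}
}

In this reading, the per-conversation standby hides refresh latency within an ongoing conversation, while the shared reserve hides it for the very first request of a new conversation.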

Comment on lines 161 to 162
// Add 3s delay to test slow latency
await new Promise(resolve => setTimeout(resolve, 3000));

Copilot AI Oct 17, 2025

This hard-coded 3s artificial delay will add unnecessary latency to every token fetch; remove it or guard it behind a debug/experiment flag so production calls are not penalized.

Suggested change
-// Add 3s delay to test slow latency
-await new Promise(resolve => setTimeout(resolve, 3000));
+// Add 3s delay to test slow latency (only if debug flag is set)
+if (process.env.AUTOMODE_TOKEN_DELAY === 'true') {
+	await new Promise(resolve => setTimeout(resolve, 3000));
+}
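
Note that AUTOMODE_TOKEN_DELAY is only the reviewer's suggested flag name, not an existing setting; any debug/experiment gate, or simply deleting the delay if it was a temporary test aid, achieves the same goal of keeping the 3s penalty out of production token fetches.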

@vs-code-engineering vs-code-engineering bot added this to the October 2025 milestone Oct 17, 2025

Development

Successfully merging this pull request may close these issues.

Auto model fetching is slowing down first request by ~150ms (in Europe)

3 participants