Fix telemetry performance issues #31856

Status: Open, 2 commits to be merged into base: main

@spbolton (Contributor) commented Apr 7, 2025

Telemetry Performance Improvements

This PR addresses several critical issues in the telemetry implementation that were impacting system performance and stability. The changes focus on thread pool management, memory usage optimization, and database operation efficiency.

Thread Pool Configuration Improvements

  • Replaced unbounded queue with a bounded queue (1000 capacity) in ApiMetricFactorySubmitter
  • Set reasonable thread pool sizes (default: 4, max: 10)
  • Implemented thread-safe initialization using AtomicReference
  • Added proper error handling for rejected executions

Memory Usage Optimization

  • Limited request buffering to 1MB in ApiMetricWebInterceptor
  • Added proper resource cleanup in the RereadInputStreamRequest
  • Prevented unnecessary buffering of large request bodies
  • Improved stream handling with proper cleanup
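One way to cap request buffering at 1MB is sketched below. It uses plain `java.io` streams rather than the servlet API, and `BoundedBodyBuffer` and `bufferIfSmall` are hypothetical names, not taken from `ApiMetricWebInterceptor` or `RereadInputStreamRequest`. A real interceptor would more likely decide up front from the declared content length, since consuming the stream makes it unavailable downstream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Optional;

// Hypothetical helper; illustrates the size cap only.
final class BoundedBodyBuffer {
    static final int MAX_BUFFERED_BYTES = 1024 * 1024; // 1MB limit from the description

    // Returns the body bytes if they fit within the limit, or empty if the body is larger.
    // The stream is always closed, even when buffering is abandoned.
    static Optional<byte[]> bufferIfSmall(InputStream body, int limit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int total = 0;
        try (InputStream in = body) {
            int read;
            // Read at most one byte past the limit so oversize bodies are detected cheaply.
            while ((read = in.read(chunk, 0, Math.min(chunk.length, limit + 1 - total))) != -1) {
                total += read;
                if (total > limit) {
                    return Optional.empty();
                }
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Optional.of(out.toByteArray());
    }
}
```

Callers would pass `BoundedBodyBuffer.MAX_BUFFERED_BYTES` as the limit and skip body capture when the result is empty.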

Database Operation Efficiency

  • Implemented batch processing in ApiMetricAPIImpl to reduce database overhead
  • Added queue-based collection of metrics with scheduled batch processing
  • Improved transaction handling with proper exception management
  • Added index to the metrics_temp table to improve query performance
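The queue-plus-scheduled-flush pattern described above might look like the following sketch. `MetricBatcher`, its constructor parameters, and the `sink` callback (standing in for a batched database write) are assumptions made for illustration; the actual `ApiMetricAPIImpl` code is not shown in this PR description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical batcher; the sink represents one batched insert per flush.
final class MetricBatcher<T> {
    private final BlockingQueue<T> pending;
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    MetricBatcher(int capacity, int batchSize, Consumer<List<T>> sink) {
        this.pending = new LinkedBlockingQueue<>(capacity);
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Non-blocking: returns false (metric dropped) when the queue is full.
    boolean offer(T metric) {
        return pending.offer(metric);
    }

    // Drains up to batchSize metrics and hands them to the sink in one call.
    int flush() {
        List<T> batch = new ArrayList<>(batchSize);
        pending.drainTo(batch, batchSize);
        if (batch.isEmpty()) {
            return 0;
        }
        try {
            sink.accept(batch);
        } catch (RuntimeException e) {
            // A failed batch is dropped so telemetry never blocks the caller.
        }
        return batch.size();
    }

    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::flush, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdown();
    }
}
```

Request threads only call `offer`, so the cost of a database write is paid once per batch on the scheduler thread instead of once per request.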

Error Handling and Resilience

  • Ensured telemetry issues don't affect core application functionality
  • Added debug-level logging for telemetry errors instead of propagating exceptions
  • Implemented graceful shutdown of resources
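As a rough illustration of keeping telemetry failures away from the core request path, a wrapper along these lines could be used. `SafeTelemetry` and `runQuietly` are hypothetical names, and `java.util.logging` with `Level.FINE` is used here only as a stand-in for whatever debug-level logger the project actually uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper; swallows telemetry failures and logs them at debug level.
final class SafeTelemetry {
    private static final Logger LOG = Logger.getLogger(SafeTelemetry.class.getName());

    // Returns true if the task completed, false if it failed and was suppressed.
    static boolean runQuietly(Runnable telemetryTask) {
        try {
            telemetryTask.run();
            return true;
        } catch (RuntimeException e) {
            // FINE is the java.util.logging analogue of debug-level logging.
            LOG.log(Level.FINE, "Telemetry task failed; continuing without it", e);
            return false;
        }
    }
}
```

The request handler continues normally whether or not the metric was recorded.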

These improvements should significantly reduce the performance impact of the telemetry system on the server, preventing memory leaks, thread pool exhaustion, and database bottlenecks.

Please use a Conventional Commit title format for this PR. For more information, see https://www.conventionalcommits.org/en/v1.0.0/

final String requestURI = req.getRequestURI();

// Only process API requests
if (!requestURI.contains("/api/")) {

Review comment (Contributor):

The interceptor already does that in getFilters; you can limit the request scope right there.

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label May 30, 2025
2 participants