@atzoum atzoum commented Sep 12, 2025

Description

Routers with destination isolation enabled are now able to adapt their jobsdb pickup query batch size dynamically based on the destination's current throttling limit.

  • This feature can be enabled through the following hierarchical configuration options (default: false):
    • Router.<DEST_TYPE>.pickupQueryThrottlingEnabled
    • Router.pickupQueryThrottlingEnabled
  • If the destination uses a single throttler for all event types, the pickup query batch size will match the current throttling limit.
  • If the destination uses a different throttler per event type, the pickup query batch size will match the sum of the current throttling limits of all throttlers that have been recently used (within readSleepSeconds*2).
  • The pickup query batch size is capped by an upper limit, controlled through the following options (default: 10000):
    • Router.<DEST_TYPE>.maxJobQueryBatchSize
    • Router.maxJobQueryBatchSize

By matching the pickup query batch size to the destination's current throttling limits, routers no longer over-fetch jobs that cannot be processed immediately, so fewer jobs are queried from the database only to be discarded later due to throttling. The result is lower query overhead, more efficient use of system resources, and faster end-to-end job processing: only jobs that can realistically be delivered are retrieved, and with fewer discarded jobs the pickup loop no longer incurs a sleep penalty. (For high job discard ratios, above 60%, we impose a sleep penalty of 5 seconds per loop as a back-pressure mechanism.)

Linear Ticket

resolves PIPE-2375

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

codecov bot commented Sep 12, 2025

Codecov Report

❌ Patch coverage is 86.25954% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.60%. Comparing base (fbba498) to head (525388b).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| router/handle.go | 84.61% | 5 Missing and 1 partial ⚠️ |
| router/throttler/factory.go | 83.33% | 6 Missing ⚠️ |
| ...ter/throttler/internal/pickup/switcher/switcher.go | 33.33% | 4 Missing ⚠️ |
| ...er/throttler/internal/pickup/adaptive/throttler.go | 93.75% | 1 Missing ⚠️ |
| ...uter/throttler/internal/pickup/static/throttler.go | 93.75% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6338      +/-   ##
==========================================
+ Coverage   77.57%   77.60%   +0.02%     
==========================================
  Files         523      523              
  Lines       70266    70361      +95     
==========================================
+ Hits        54512    54602      +90     
- Misses      12913    12921       +8     
+ Partials     2841     2838       -3     

@atzoum atzoum force-pushed the feat.routerPickupMatchThrottler branch 3 times, most recently from e84a9c0 to 38785fa Compare September 15, 2025 12:28
@atzoum atzoum changed the title [WIP] feat: router jobsdb query batch size should match throttling limit [WIP] feat: router jobsdb query batch adapting to throttling limit Sep 15, 2025
@atzoum atzoum changed the title [WIP] feat: router jobsdb query batch adapting to throttling limit [WIP] feat: router jobsdb pickup query batch size adapting to throttling limit Sep 15, 2025
@atzoum atzoum changed the title [WIP] feat: router jobsdb pickup query batch size adapting to throttling limit [WIP] feat(router): jobsdb pickup query batch size adapting to throttling limit Sep 15, 2025
@atzoum atzoum changed the title [WIP] feat(router): jobsdb pickup query batch size adapting to throttling limit feat(router): jobsdb pickup query batch size adapting to throttling limit Sep 15, 2025
@atzoum atzoum force-pushed the feat.routerPickupMatchThrottler branch 2 times, most recently from 97fd797 to bfed78c Compare September 16, 2025 07:49
@atzoum atzoum requested a review from koladilip September 16, 2025 08:57
@atzoum atzoum marked this pull request as ready for review September 16, 2025 08:57
@atzoum atzoum force-pushed the feat.routerPickupMatchThrottler branch from bfed78c to 1446112 Compare September 16, 2025 08:57
@atzoum atzoum force-pushed the feat.routerPickupMatchThrottler branch 6 times, most recently from dde391e to 7f95ef0 Compare September 23, 2025 10:46
@atzoum atzoum force-pushed the feat.routerPickupMatchThrottler branch from 7f95ef0 to be3a509 Compare September 23, 2025 11:28
@atzoum atzoum requested a review from mihir20 September 24, 2025 11:42
mihir20 previously approved these changes Sep 24, 2025
@mihir20 mihir20 self-requested a review September 24, 2025 12:37
@ktgowtham ktgowtham left a comment

LGTM

@atzoum atzoum merged commit c38b971 into master Sep 25, 2025
121 of 125 checks passed
@atzoum atzoum deleted the feat.routerPickupMatchThrottler branch September 25, 2025 11:12
This was referenced Sep 29, 2025
mihir20 pushed a commit that referenced this pull request Sep 29, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.60.0-rc.1](v1.59.0...v1.60.0-rc.1)
(2025-09-29)


### Features

* **router:** jobsdb pickup query batch size adapting to throttling
limit ([#6338](#6338))
([c38b971](c38b971))


### Bug Fixes

* add outgoing metrics to proxy flow
([#6355](#6355))
([23a5c81](23a5c81))
* aws session config region
([#6354](#6354))
([23a5c81](23a5c81))
* dedup gauge
([#6359](#6359))
([a562df4](a562df4))
* keydb grpc config
([#6370](#6370))
([cf04743](cf04743))
* missing keydb client stats
([#6360](#6360))
([c42e5c8](c42e5c8))
* naming collision in redis throttling configuration
([#6365](#6365))
([4f87afb](4f87afb))
* set table type to external for glue
([#6386](#6386))
([9017582](9017582))
* ut mirroring tests
([#6341](#6341))
([51dd4f7](51dd4f7))
* **warehouse:** alter namespace col size
([#6379](#6379))
([fbba498](fbba498))
* **warehouse:** skip extract async job failing test
([#6378](#6378))
([89a4d1e](89a4d1e))


### Miscellaneous

* add authentication to reporting client
([#6384](#6384))
([502d2b0](502d2b0))
* add explicit permissions for workflows
([#6381](#6381))
([1d22623](1d22623))
* add readme for async destinaiton module
([#6356](#6356))
([cc80333](cc80333))
* remove deprecated throttling configuration keys
([#6377](#6377))
([8d667b7](8d667b7))
* upgrade build-scan-push-action to v1.8.0
([#6350](#6350))
([51dd4f7](51dd4f7))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
mihir20 pushed a commit that referenced this pull request Sep 30, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.60.0-rc.2](v1.59.0...v1.60.0-rc.2)
(2025-09-30)


### Features

* **router:** jobsdb pickup query batch size adapting to throttling
limit ([#6338](#6338))
([c38b971](c38b971))


### Bug Fixes

* add outgoing metrics to proxy flow
([#6355](#6355))
([23a5c81](23a5c81))
* aws session config region
([#6354](#6354))
([23a5c81](23a5c81))
* dedup gauge
([#6359](#6359))
([a562df4](a562df4))
* keydb grpc config
([#6370](#6370))
([cf04743](cf04743))
* missing keydb client stats
([#6360](#6360))
([c42e5c8](c42e5c8))
* naming collision in redis throttling configuration
([#6365](#6365))
([4f87afb](4f87afb))
* set table type to external for glue
([#6386](#6386))
([9017582](9017582))
* ut mirroring tests
([#6341](#6341))
([51dd4f7](51dd4f7))
* **warehouse:** alter namespace col size
([#6379](#6379))
([fbba498](fbba498))
* **warehouse:** skip extract async job failing test
([#6378](#6378))
([89a4d1e](89a4d1e))


### Miscellaneous

* add authentication to reporting client
([#6384](#6384))
([502d2b0](502d2b0))
* add explicit permissions for workflows
([#6381](#6381))
([1d22623](1d22623))
* add readme for async destinaiton module
([#6356](#6356))
([cc80333](cc80333))
* configurable event name trimming for reporting
([#6394](#6394))
([7628efa](7628efa))
* remove deprecated throttling configuration keys
([#6377](#6377))
([8d667b7](8d667b7))
* upgrade build-scan-push-action to v1.8.0
([#6350](#6350))
([51dd4f7](51dd4f7))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
mihir20 pushed a commit that referenced this pull request Sep 30, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.60.0-rc.3](v1.59.0...v1.60.0-rc.3)
(2025-09-30)


### Features

* **router:** jobsdb pickup query batch size adapting to throttling
limit ([#6338](#6338))
([c38b971](c38b971))


### Bug Fixes

* add outgoing metrics to proxy flow
([#6355](#6355))
([23a5c81](23a5c81))
* aws session config region
([#6354](#6354))
([23a5c81](23a5c81))
* dedup gauge
([#6359](#6359))
([a562df4](a562df4))
* keydb consistent hashing
([#6403](#6403))
([3ee79b2](3ee79b2))
* keydb grpc config
([#6370](#6370))
([cf04743](cf04743))
* missing keydb client stats
([#6360](#6360))
([c42e5c8](c42e5c8))
* naming collision in redis throttling configuration
([#6365](#6365))
([4f87afb](4f87afb))
* set table type to external for glue
([#6386](#6386))
([9017582](9017582))
* ut mirroring tests
([#6341](#6341))
([51dd4f7](51dd4f7))
* **warehouse:** alter namespace col size
([#6379](#6379))
([fbba498](fbba498))
* **warehouse:** skip extract async job failing test
([#6378](#6378))
([89a4d1e](89a4d1e))


### Miscellaneous

* add authentication to reporting client
([#6384](#6384))
([502d2b0](502d2b0))
* add explicit permissions for workflows
([#6381](#6381))
([1d22623](1d22623))
* add readme for async destinaiton module
([#6356](#6356))
([cc80333](cc80333))
* configurable event name trimming for reporting
([#6394](#6394))
([7628efa](7628efa))
* remove deprecated throttling configuration keys
([#6377](#6377))
([8d667b7](8d667b7))
* upgrade build-scan-push-action to v1.8.0
([#6350](#6350))
([51dd4f7](51dd4f7))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
mihir20 pushed a commit that referenced this pull request Oct 1, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.60.0](v1.59.0...v1.60.0)
(2025-09-30)


### Features

* **router:** jobsdb pickup query batch size adapting to throttling
limit ([#6338](#6338))
([c38b971](c38b971))


### Bug Fixes

* add outgoing metrics to proxy flow
([#6355](#6355))
([23a5c81](23a5c81))
* aws session config region
([#6354](#6354))
([23a5c81](23a5c81))
* dedup gauge
([#6359](#6359))
([a562df4](a562df4))
* keydb consistent hashing
([#6403](#6403))
([3ee79b2](3ee79b2))
* keydb grpc config
([#6370](#6370))
([cf04743](cf04743))
* missing keydb client stats
([#6360](#6360))
([c42e5c8](c42e5c8))
* naming collision in redis throttling configuration
([#6365](#6365))
([4f87afb](4f87afb))
* set table type to external for glue
([#6386](#6386))
([9017582](9017582))
* ut mirroring tests
([#6341](#6341))
([51dd4f7](51dd4f7))
* **warehouse:** alter namespace col size
([#6379](#6379))
([fbba498](fbba498))
* **warehouse:** skip extract async job failing test
([#6378](#6378))
([89a4d1e](89a4d1e))


### Miscellaneous

* add authentication to reporting client
([#6384](#6384))
([502d2b0](502d2b0))
* add explicit permissions for workflows
([#6381](#6381))
([1d22623](1d22623))
* add readme for async destinaiton module
([#6356](#6356))
([cc80333](cc80333))
* configurable event name trimming for reporting
([#6394](#6394))
([7628efa](7628efa))
* remove deprecated throttling configuration keys
([#6377](#6377))
([8d667b7](8d667b7))
* upgrade build-scan-push-action to v1.8.0
([#6350](#6350))
([51dd4f7](51dd4f7))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).