perf(retention): improve performance of exemplars removal #1018
Merged
Codecov Report

```
@@            Coverage Diff             @@
##             main    #1018      +/-   ##
==========================================
+ Coverage   70.11%   70.13%   +0.03%
==========================================
  Files          84       83       -1
  Lines        2940     2926      -14
  Branches      605      601       -4
==========================================
- Hits         2061     2052       -9
+ Misses        876      871       -5
  Partials        3        3
```
thisisobate pushed a commit that referenced this pull request on Apr 16, 2022:

* Revert "fix: allow cache eviction and write-back while purging storage (#962)". This reverts commit cad1afc.
* fix(exemplars): optimise exemplars removal
* perf(exemplars): batch insert
* perf(ingestion): adjust storage queue size
* perf(ingestion): make exemplars to handle batches queue during data removal
* fix tests
Labels:

* performance (If there's anything we have to be really good at it's this)
* storage (Low level storage matters)
We observe a significant slowdown when a retention policy is configured and a server with historical data (presumably) runs for the first time. This has a large impact on overall performance, manifesting in data loss and an inability to query data.
Related metrics:

* `pyroscope_storage_exemplars_read_bytes`
* `pyroscope_storage_exemplars_write_bytes`
* `pyroscope_storage_exemplars_removed_total`
* `pyroscope_storage_exemplars_retention_task_duration_seconds`
* `pyroscope_storage_discarded_total`
Load testing shows that the Pyroscope server is only able to handle thousands of exemplars per second, which might be insufficient. The bandwidth can be increased significantly if we tune the internal storage queue: we could either implement some form of throttling or allow users to configure the queue size explicitly (it is currently hardcoded to 100). As a side effect, the Pyroscope server will reject less data, creating more pressure on the storage subsystem (retention task + GC).
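For illustration only (this is not the actual Pyroscope code), here is a minimal Go sketch of the configurable-queue idea: a channel-backed queue whose capacity is a parameter rather than a hardcoded constant, counting dropped batches the way a metric such as `pyroscope_storage_discarded_total` would. All names (`queue`, `exemplarBatch`, `newQueue`) are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// exemplarBatch is a placeholder for whatever unit of work is queued.
type exemplarBatch struct {
	AppName   string
	ProfileID string
	Payload   []byte
}

// queue is a bounded, channel-backed queue whose size is configurable
// instead of being hardcoded (the description mentions a fixed size of 100).
type queue struct {
	ch        chan exemplarBatch
	discarded uint64 // would back a counter like pyroscope_storage_discarded_total
}

func newQueue(size int) *queue {
	return &queue{ch: make(chan exemplarBatch, size)}
}

// put never blocks the ingestion path: when the queue is full, the batch
// is dropped and counted. A larger queue rejects less data but puts more
// pressure on the retention task and GC, as discussed above.
func (q *queue) put(b exemplarBatch) bool {
	select {
	case q.ch <- b:
		return true
	default:
		atomic.AddUint64(&q.discarded, 1)
		return false
	}
}

func main() {
	q := newQueue(1000) // size would come from configuration in a real setup
	ok := q.put(exemplarBatch{AppName: "app", ProfileID: "id"})
	fmt.Println("accepted:", ok, "discarded so far:", atomic.LoadUint64(&q.discarded))
}
```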
The picture below shows the results of a synthetic test (~300K exemplars, pull mode), Diff View before and after:

Note: stacks are marked as added simply because their proportion of the total has increased, not their absolute values. I think it would be more useful if we either normalised the values or let users choose.
However, the cost of deleting obsolete data (outside the retention period) is still a concern, as is BadgerDB GC, which causes a noticeable spike in I/O and CPU consumption. Even after the changes in this PR, GC will consume resources comparable to those required to ingest the data: the underlying WiscKey LSM is optimised for writes (like an append-only log), and a deletion is treated as a write operation. Furthermore, because of how BadgerDB GC is implemented, in practice it may be even more costly: when a value log is rewritten, data locality plays an important role. Given that the `profile_id` (span ID, basically a random string) is part of the key, we have pretty poor data locality. As a result, at some point GC may need to rewrite the whole database (all of its value log files). For example, the profile below depicts the CPU time consumed by a GC job that "compacted" the exemplars DB from 1GB to ~500MB:

A subtle improvement (apart from the beneficial effect of write batches) was achieved by forcing BadgerDB compaction before starting to remove data; according to the tests, this decreased CPU consumption by 15%-20%.
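As a rough illustration of the removal flow described above (forced compaction, batched deletes, then value log GC), the sketch below uses the public BadgerDB API. It is not the code from this PR: the `isExpired` helper, the in-memory key collection, and the v2 import path are assumptions.

```go
package retention

import (
	"errors"

	badger "github.com/dgraph-io/badger/v2"
)

// removeExpiredExemplars forces an LSM compaction before the removal pass,
// deletes expired keys in a write batch, and then runs value log GC.
func removeExpiredExemplars(db *badger.DB, isExpired func(key []byte) bool) error {
	// Force compaction first; per the PR description this reduced the CPU
	// cost of the subsequent removal by roughly 15-20% in tests.
	if err := db.Flatten(2); err != nil {
		return err
	}

	// Collect expired keys with a key-only iterator. A real implementation
	// would process keys in chunks instead of holding them all in memory.
	var expired [][]byte
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			if k := it.Item().KeyCopy(nil); isExpired(k) {
				expired = append(expired, k)
			}
		}
		return nil
	})
	if err != nil {
		return err
	}

	// Delete in a single write batch instead of one transaction per key.
	wb := db.NewWriteBatch()
	defer wb.Cancel()
	for _, k := range expired {
		if err := wb.Delete(k); err != nil {
			return err
		}
	}
	if err := wb.Flush(); err != nil {
		return err
	}

	// Reclaim space in the value log; ErrNoRewrite just means there was
	// nothing worth rewriting.
	if err := db.RunValueLogGC(0.5); err != nil && !errors.Is(err, badger.ErrNoRewrite) {
		return err
	}
	return nil
}
```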
Future work: use ordered `profile_id` values (instead of span IDs). This should drastically improve data locality.
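A hedged sketch of what a locality-friendly layout could look like: if keys were prefixed by app name and an hour-aligned timestamp with an ordered ID suffix (a hypothetical scheme, not the current one), expired exemplars would sit in contiguous key ranges and could be removed via `DropPrefix` rather than scattered point deletes.

```go
package retention

import (
	"fmt"
	"time"

	badger "github.com/dgraph-io/badger/v2"
)

// exemplarKey groups entries by app and hour; the ordered suffix stands in
// for a generated, monotonically increasing profile ID. Purely illustrative.
func exemplarKey(app string, t time.Time, orderedID uint64) []byte {
	return []byte(fmt.Sprintf("x:%s:%d:%016x", app, t.Truncate(time.Hour).Unix(), orderedID))
}

// dropExpiredHours removes whole hours that fall outside the retention
// period with DropPrefix, avoiding per-key deletes entirely.
func dropExpiredHours(db *badger.DB, app string, before time.Time, hours int) error {
	for i := 1; i <= hours; i++ {
		h := before.Add(-time.Duration(i) * time.Hour).Truncate(time.Hour)
		prefix := []byte(fmt.Sprintf("x:%s:%d:", app, h.Unix()))
		if err := db.DropPrefix(prefix); err != nil {
			return err
		}
	}
	return nil
}
```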