feat: Add eviction based on rss memory #4991

Status: Open. Wants to merge 2 commits into base: main.


@BagritsevichStepan BagritsevichStepan commented Apr 24, 2025

This PR introduces eviction based on RSS memory usage. It uses the rss_oom_deny_ratio and eviction_memory_budget_threshold flags to decide when to evict items. Eviction starts when RSS memory usage exceeds max_memory_limit * (rss_oom_deny_ratio - eviction_memory_budget_threshold) (I'll refer to this as the RSS memory usage threshold).
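
As a rough illustration of the threshold formula above, here is a minimal Python sketch. The function names are hypothetical and are not part of this PR; only the flag names and the formula come from the description.

```python
def rss_eviction_threshold(
    max_memory_limit: int,
    rss_oom_deny_ratio: float,
    eviction_memory_budget_threshold: float,
) -> int:
    """RSS usage (in bytes) above which eviction starts, per the PR description."""
    return int(
        max_memory_limit * (rss_oom_deny_ratio - eviction_memory_budget_threshold)
    )


def should_start_rss_eviction(
    rss_bytes: int,
    max_memory_limit: int,
    rss_oom_deny_ratio: float,
    eviction_memory_budget_threshold: float,
) -> bool:
    """Hypothetical helper: True when current RSS is above the threshold."""
    threshold = rss_eviction_threshold(
        max_memory_limit, rss_oom_deny_ratio, eviction_memory_budget_threshold
    )
    return rss_bytes > threshold
```

For example, with illustrative values max_memory_limit = 1000 bytes, rss_oom_deny_ratio = 0.75 and eviction_memory_budget_threshold = 0.25, the threshold is 500 bytes, so eviction would start once RSS exceeds 500.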

The implementation began with writing tests that reproduce scenarios with high RSS memory usage and low basic memory usage.

The first test inserts a large number of items and sets rss_oom_deny_ratio low enough to trigger eviction based on RSS memory usage and not on basic memory usage. It continues running until eviction stops — this is determined by checking whether the number of evicted keys remains unchanged over the last three seconds, which implies we've dropped below the RSS memory threshold. The test then performs a series of checks to ensure everything behaves correctly.
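
The "eviction has stopped" check described above can be sketched roughly as follows. This is a synchronous simplification (the actual tests are async), and `wait_until_stable`, `read_counter` and the injectable `clock`/`sleep` parameters are assumptions made for illustration, not code from this PR.

```python
import time
from typing import Callable


def wait_until_stable(
    read_counter: Callable[[], int],
    stable_for: float = 3.0,
    poll_interval: float = 0.5,
    timeout: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll a counter (e.g. evicted keys) until it has not changed for
    `stable_for` seconds, then return its final value."""
    last_value = read_counter()
    last_change = clock()
    deadline = last_change + timeout
    while clock() - last_change < stable_for:
        if clock() > deadline:
            raise TimeoutError("counter did not stabilize in time")
        sleep(poll_interval)
        current = read_counter()
        if current != last_value:
            # The counter moved, so restart the stability window.
            last_value = current
            last_change = clock()
    return last_value
```

In a test, `read_counter` would be whatever reads the evicted-keys statistic from the server; the clock and sleep functions are injectable so the helper itself can be checked without real waiting.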

The second test is similar but includes two waves of data insertion. Initially, it follows the same steps as the first test. Then, once enough items have been evicted and the RSS memory usage is slightly below the threshold, it starts a second wave of insertions, pushing the usage above the threshold again. This second test was very useful in finding other bugs in the RSS eviction and memory defragmentation logic, which is why I believe it should be included.

During a short investigation in #4772, we discovered another issue: RSS memory usage was not decreasing even after evicting a sufficient number of items. The cause was in the memory defragmentation logic: the defragmentation thresholds and flags were not aligned with the eviction thresholds. For example, if the RSS eviction threshold is set to 80% of the max memory and defragmentation is only triggered when RSS usage exceeds 90%, then even after we have evicted enough keys to bring memory usage below the RSS threshold, we still won't see a drop in actual RSS memory usage. This happens because defragmentation is never triggered: RSS usage never crosses the 90% threshold required to start it. To fix this, I introduced more aggressive defragmentation thresholds when eviction is enabled. The system now also considers the eviction RSS threshold when deciding whether to defragment. The only change in defragmentation is the use of different thresholds when eviction is enabled; the core logic remains the same.
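
To make the 80% vs. 90% example concrete, here is a small Python sketch of one way the thresholds could be aligned. It assumes the defragmentation trigger is simply capped at the eviction threshold when eviction is enabled; the actual rule in this PR may differ, and all names below are hypothetical.

```python
def defrag_rss_threshold(
    max_memory_limit: int,
    default_defrag_ratio: float,
    rss_eviction_ratio: float,
    eviction_enabled: bool,
) -> int:
    """RSS level (bytes) above which defragmentation may start.

    Assumption for illustration: with eviction enabled, the defrag ratio
    is lowered to the eviction ratio if the latter is smaller.
    """
    ratio = default_defrag_ratio
    if eviction_enabled:
        ratio = min(default_defrag_ratio, rss_eviction_ratio)
    return int(max_memory_limit * ratio)


def should_defragment(
    rss_bytes: int,
    max_memory_limit: int,
    default_defrag_ratio: float,
    rss_eviction_ratio: float,
    eviction_enabled: bool,
) -> bool:
    """True when RSS is above the (possibly lowered) defrag threshold."""
    threshold = defrag_rss_threshold(
        max_memory_limit, default_defrag_ratio, rss_eviction_ratio, eviction_enabled
    )
    return rss_bytes > threshold
```

With max memory 1000, a default defrag ratio of 0.9 and an eviction ratio of 0.8, an RSS of 850 would not trigger defragmentation under the default threshold, but would once the threshold is aligned with eviction.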

Another important step was introducing eviction_state_ in EngineShard and ensuring it is taken into account during eviction decisions. This change is explained in detail in this comment.

// Calculate how much rss memory is used by all shards
const size_t global_used_rss_memory = rss_mem_current.load(memory_order_relaxed);

auto& global_rss_memory_at_prev_eviction = eviction_state_.global_rss_memory_at_prev_eviction;
Contributor Author

eviction_state_ is used to track how much memory has already been evicted but not yet freed, meaning it hasn’t reduced rss_memory_usage yet. Once we observe a decrease in rss_memory_usage, we also reduce the corresponding amount of evicted bytes in eviction_state_.

This is necessary because rss_memory_usage updates with a delay, which can cause the eviction process to remove more items than actually needed. As shown in the tests, this approach may over-evict by about 5%, and for larger datastores this percentage tends to be even lower. In comparison, the basic approach (see #5218) can over-evict by up to 18%.
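
The bookkeeping described in this comment can be sketched in Python roughly as follows. This is a simplified single-counter model under my own assumptions, not the C++ implementation in EngineShard; `EvictionState`, `pending_freed_bytes`, `bytes_to_evict` and `record_eviction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class EvictionState:
    # RSS observed at the previous eviction step.
    rss_at_prev_eviction: int = 0
    # Bytes already evicted whose effect is not yet visible in RSS.
    pending_freed_bytes: int = 0


def bytes_to_evict(state: EvictionState, rss_now: int, threshold: int) -> int:
    """How many more bytes to evict, discounting evictions still 'in flight'."""
    if rss_now < state.rss_at_prev_eviction:
        # RSS dropped: part of the earlier evictions has now been freed.
        drop = state.rss_at_prev_eviction - rss_now
        state.pending_freed_bytes = max(0, state.pending_freed_bytes - drop)
    state.rss_at_prev_eviction = rss_now
    over_threshold = max(0, rss_now - threshold)
    return max(0, over_threshold - state.pending_freed_bytes)


def record_eviction(state: EvictionState, evicted_bytes: int) -> None:
    """Remember bytes that were evicted but may not have reduced RSS yet."""
    state.pending_freed_bytes += evicted_bytes
```

The point of the sketch: if RSS is 100 bytes over the threshold and 100 bytes were just evicted, a second call with the same (stale) RSS reading asks for 0 additional bytes instead of evicting another 100.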

stats_info = await async_client.info("stats")

assert memory_info["used_memory"] > max_memory * (
rss_oom_deny_ratio - eviction_memory_budget_threshold - 0.05
Contributor Author

The 0.05 term here corresponds to the 5% of memory that was over-evicted.

@BagritsevichStepan BagritsevichStepan force-pushed the memory/rss-eviction branch 4 times, most recently from 378e597 to 925a09d Compare June 12, 2025 05:38
Signed-off-by: Stepan Bagritsevich <[email protected]>
@BagritsevichStepan BagritsevichStepan force-pushed the memory/rss-eviction branch 3 times, most recently from 4b31e2b to 17c35a6 Compare June 15, 2025 06:06
Signed-off-by: Stepan Bagritsevich <[email protected]>