Skip to content

Mark obsolete accounts on startup #6885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

roryharr
Copy link

@roryharr roryharr commented Jul 8, 2025

Problem

Summary of Changes

Fixes #

@roryharr roryharr force-pushed the mark_obsolete_accounts_on_startup branch from 28162ec to 49bae50 Compare July 9, 2025 21:54
@codecov-commenter
Copy link

codecov-commenter commented Jul 9, 2025

Codecov Report

Attention: Patch coverage is 97.10145% with 8 lines in your changes missing coverage. Please review.

Project coverage is 83.2%. Comparing base (ca5877f) to head (6f0d945).
Report is 14 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #6885     +/-   ##
=========================================
- Coverage    83.2%    83.2%   -0.1%     
=========================================
  Files         853      853             
  Lines      377591   377834    +243     
=========================================
+ Hits       314527   314723    +196     
- Misses      63064    63111     +47     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@roryharr
Copy link
Author

Used ledger with a panic after finishing verifying the accounts hash and timed loading the same snapshot 3 times with and without the change:

  | No Change | With Change | Percentage Change (+ is better)
Real Time | 339.0603333 | 351.1493333 | -3.565442139
User Time | 736.4596667 | 725.738 | 1.455838948
Sys Time | 1437.829333 | 1423.418667 | 1.002251542

It's a very very rough measure, but i don't think there is any real change.

@roryharr roryharr force-pushed the mark_obsolete_accounts_on_startup branch 2 times, most recently from a756b52 to 061903d Compare July 11, 2025 23:35
HandleReclaims::ProcessDeadSlots(&stats),
MarkAccountsObsolete::Yes(max_slot),
);
stats.report("reclaim_duplicates_index_generation", None);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purge stats exist, figured they should at least be reported. I think they also have all the information we would want about what happens when obsolete accounts are marked.

It might make sense in the future to create a new type here, but seems like a future refactor to me.

@roryharr roryharr force-pushed the mark_obsolete_accounts_on_startup branch from 061903d to 6f0d945 Compare July 11, 2025 23:38
@@ -682,6 +714,7 @@ mod serde_snapshot_tests {
current_slot += 1;
assert_eq!(3, accounts.ref_count_for_pubkey(&pubkey1));
accounts.store_for_tests(current_slot, &[(&pubkey1, &zero_lamport_account)]);
accounts.store_for_tests(current_slot, &[(&pubkey3, &account2)]);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need pubkey3 added to the slot, as otherwise the slot will get entirely cleaned much earlier with obsolete accounts marked. This necessitates a shrink down below, but this test is still largely testing the same thing.

accounts.assert_load_account(current_slot, purged_pubkey2, 0);
// With accounts marked obsolete, accounts will be marked obsolete when deserializing which
// will make them not loadable
if accounts.mark_obsolete_accounts && restore {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed the restore flag here, as the behaviour is different before and after snapshot restore. On the after snapshot restore flow, the accounts are marked obsolete

|slot, account_info| {
// Since the unref makes every account a single ref account, all
// zero lamport accounts should be tracked as zero_lamport_single_ref
if account_info.is_zero_lamport() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at doing this in visit_zero_lamport_pubkeys_during_startup but it is more efficient doing it here:
visit_zero_lamport_pubkeys_during_startup would need to walk the slot list to find the maximum slot and then determine whether it is zero lamports or not. clean_pubkey can just return the entry and it can be checked here.

@roryharr roryharr requested a review from Copilot July 11, 2025 23:43
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces an experimental obsolete-account tracking feature that marks and cleans out-of-date account entries on startup and during index generation.

  • Add a mark_obsolete_accounts flag to AccountsDbConfig and carry it through AccountsDb
  • Integrate conditional obsolete-account skipping in hashing, snapshot verification, and index generation
  • Extend AccountMapEntry with unref_by_count and add batch unref APIs in AccountsIndex
  • Update serde-snapshot tests and add new tests for cleaning/unref logic

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
runtime/src/serde_snapshot/tests.rs Update tests to expect different storage loads based on mark_obsolete_accounts
accounts-db/src/accounts_index/account_map_entry.rs Implement unref_by_count and update unref to support bulk decrement
accounts-db/src/accounts_index.rs Add clean_and_unref_slot_list and clean_and_unref_rooted_entries_by_bin with tests
accounts-db/src/accounts_db.rs Introduce mark_obsolete_accounts config flag and integrate it into hashing, snapshot, and index generation
Comments suppressed due to low confidence (1)

runtime/src/serde_snapshot/tests.rs:477

  • [nitpick] The boolean parameter restore is ambiguous in controlling post-restore assertions. Consider renaming it to something more descriptive like after_restore or simulate_restore to clarify its purpose.
    fn with_chained_zero_lamport_accounts<F>(f: F, restore: bool)

// to calculate the duplicates lt hash.
if should_calculate_duplicates_lt_hash && outer_duplicates_lt_hash.is_none() {
if populate_duplicates_lt_hash && outer_duplicates_lt_hash.is_none() {
Copy link
Preview

Copilot AI Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final guard uses populate_duplicates_lt_hash instead of should_calculate_duplicates_lt_hash, so when mark_obsolete_accounts is true, you still inject a default duplicates lt hash. Change the condition to use should_calculate_duplicates_lt_hash to honor the obsolete-account flag consistently.

Suggested change
if populate_duplicates_lt_hash && outer_duplicates_lt_hash.is_none() {
if should_calculate_duplicates_lt_hash && outer_duplicates_lt_hash.is_none() {

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants