-
Notifications
You must be signed in to change notification settings - Fork 589
Do not hash accounts when calculating capitalization #6867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not hash accounts when calculating capitalization #6867
Conversation
d6d9a5b
to
bde1b4d
Compare
accounts-db/src/accounts_db.rs
Outdated
pub fn calculate_capitalization_at_startup_from_index( | ||
&self, | ||
ancestors: &Ancestors, | ||
startup_slot: Slot, | ||
) -> u64 { | ||
self.accounts_index | ||
.account_maps | ||
.par_iter() | ||
.map(|accounts_index_bin| { | ||
accounts_index_bin | ||
.keys() | ||
.into_iter() | ||
.map(|pubkey| { | ||
self.accounts_index | ||
.get_with_and_then( | ||
&pubkey, | ||
Some(ancestors), | ||
Some(startup_slot), | ||
false, | ||
|(slot, account_info)| { | ||
(!account_info.is_zero_lamport()).then(|| { | ||
self.get_account_accessor( | ||
slot, | ||
&pubkey, | ||
&account_info.storage_location(), | ||
) | ||
.get_loaded_account(|loaded_account| { | ||
loaded_account.lamports() | ||
}) | ||
// SAFETY: The index said this pubkey exists, so | ||
// there must be an account to load. | ||
.unwrap() | ||
}) | ||
}, | ||
) | ||
.flatten() | ||
.unwrap_or(0) | ||
}) | ||
.sum::<u64>() | ||
}) | ||
.sum() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the main change. The impl mirrors calculate_accounts_lt_hash_at_startup_from_index()
, which is two functions above.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6867 +/- ##
=======================================
Coverage 83.3% 83.3%
=======================================
Files 853 853
Lines 378386 378440 +54
=======================================
+ Hits 315324 315377 +53
- Misses 63062 63063 +1 🚀 New features to boost your workflow:
|
/// account-by-account, summing each account's balance. | ||
/// | ||
/// Only intended to be called at startup by ledger-tool or tests. | ||
/// (cannot be made DCOU due to snapshot-minimizer) | ||
pub fn set_capitalization(&self) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to do in this PR, but please rename this function! Made this change more confusing then it needed to be.
maybe recalculate_capitalization?
recalculate_and_update_capitalization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100%. It took everything for me to not refactor it all immediately within this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #6883.
&pubkey, | ||
&account_info.storage_location(), | ||
) | ||
.get_loaded_account(|loaded_account| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't get_loaded_account get data as well?
What is the speedup if you create an accessor that skips loading data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this impl does currently load account data. There's not a way with LoadedAccountAccessor to not load the account data, so it'll be a bigger change.
I propose that we keep this PR as-is, and then have a follow-up for loading accounts without data and using it here. This will keep both PRs smaller/focused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
@@ -6322,9 +6324,10 @@ impl AccountsDb { | |||
.flatten() | |||
.unwrap_or(0) | |||
}) | |||
.sum::<u64>() | |||
.try_fold(0, u64::checked_add) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR, we no longer call update_accounts_hash_with_verify
in ledger-tool.
Is that expected that we don't need to check accounts hash in ledger-tool? or the accounts hash is verified in other place than calculating the capital?
Good eye! When creating a snapshot with ledger-tool, the accounts hash is also calculated as part of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
Problem
When creating a snapshot with ledger-tool, we (re)calculate the capitalization. Under the hood this uses the merkle-based accounts hash calculation. This has the following problems:
It's slow. We don't need to hash each account, nor load account data from disk, nor build the merkle tree, nor compute the merkle root.
Once all the lattice hash features are enabled, there will be no more merkle-based accounts hash calculations. At that time we'd like to remove the old code. We cannot remove the code yet, since calculating capitalization uses it.
Summary of Changes
Reimplement
calculate_capitalization()
to not use any of the merkle-based accounts hash calculation code.Results
I ran
ledger-tool capitalization
against a mnb snapshot to time how long it took to calculate the capitalization with this pr vs master: