Skip to content

Conversation

oscardssmith
Copy link
Member

@oscardssmith oscardssmith commented Oct 7, 2025

remove idxfloor and maxprobe, and shrink age to UInt32.

idxfloor is removed because the only thing it helps with is speeding up iterate calls (very slightly since it only speeds up the first call), and the speedup comes at a cost on insertions and deletions that happen to land near the front of the dict.

maxprobe is removed because if the hash function is good, we know that the Dict will spread values well, and if the hash function isn't good, rehashing won't fix anything, so we spend a lot of effort tracking maxprobe, but in most circumstances, it won't do anything, and in the circumstances where it does, it probably would be better off if it didn't.

Rebased and slightly more aggressive version of #48380 (rebasing was going to be more annoying since that PR was pre Memory).

@oscardssmith oscardssmith requested a review from petvana October 7, 2025 05:38
@oscardssmith oscardssmith added performance Must go faster collections Data structures holding multiple items, e.g. sets needs nanosoldier run This PR should have benchmarks run on it labels Oct 7, 2025
@oscardssmith oscardssmith changed the title remove idxfloor and maxprobe, shrink age Simplify Dict representation Oct 7, 2025
keys = h.keys

@assume_effects :terminates_locally :noub @inbounds while true
@assume_effects :terminates_locally :noub @inbounds for _ in 1:sz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these effects needed now that you have a fixed iteration instead of while true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, yes. the compiler doesn't know that fixed length iteration terminates

@KristofferC
Copy link
Member

Rebased and slightly more aggressive version of #48380 (rebasing was going to be more annoying since that PR was pre Memory).

FWIW, you can also just force push (instead of rebasing) to the existing PR to not lose the relevant discussion on it.

@oscardssmith
Copy link
Member Author

oscardssmith commented Oct 9, 2025

I talked to @vtjnash today, and he pointed out that the point of idxfloor is to prevent quadratic behavior when deleting all entries from a Dict via repeated pop! calls.

Given this, this PR as is isn't good as is. If we wanted to remove idxfloor, we would also have to sometimes shrink the dictionary on deletions if the count was too small (to e.g. ensure something like size(table)/8 < count < 4/5 *size(table)).

As such, this will some substantial benchmarking to determine whether the right approach is to remove idxfloor and resize on deletion, or whether idxfloor needs to stay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

collections Data structures holding multiple items, e.g. sets needs nanosoldier run This PR should have benchmarks run on it performance Must go faster

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants