Implement LFU sketch using arm64 intrinsics (redux) #648

Merged
bitfaster merged 13 commits into main from users/alexpeck/neon
Dec 1, 2024

bitfaster (Owner) commented Nov 26, 2024

In the results below, the *BlockAvx benchmarks run on the Arm64 intrinsics; FlatAvx and BlockAvxNotPinned are not implemented for Arm64. Supersedes #595.
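
For readers unfamiliar with the layouts being compared, here is a minimal Python sketch of a block-style count-min sketch with 4-bit counters. It is not this library's C# code: the class name, the `mix` hash, and the slot selection are all illustrative assumptions. It only shows the idea that the four counters for a key sit inside one 64-byte block, which is the kind of contiguous data a vectorized path can work on.

```python
MASK64 = (1 << 64) - 1
WORDS_PER_BLOCK = 8  # 8 x 64-bit words = one 64-byte block (assumed size)


def mix(h: int) -> int:
    """splitmix64-style finalizer; a stand-in for whatever hash a real sketch uses."""
    h &= MASK64
    h = ((h ^ (h >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    h = ((h ^ (h >> 27)) * 0x94D049BB133111EB) & MASK64
    return h ^ (h >> 31)


class BlockSketch:
    """Hypothetical block layout: four 4-bit counters per key, all in one block."""

    def __init__(self, blocks: int) -> None:
        if blocks <= 0 or blocks & (blocks - 1):
            raise ValueError("blocks must be a power of two")
        self.block_mask = blocks - 1
        self.table = [0] * (blocks * WORDS_PER_BLOCK)

    def slots(self, key_hash: int):
        """Return four (word index, bit shift) pairs, all inside one block."""
        h = mix(key_hash)
        base = (h & self.block_mask) * WORDS_PER_BLOCK
        out = []
        for i in range(4):
            bits = (h >> (32 + 8 * i)) & 0xFF
            word = base + 2 * i + (bits & 1)  # counter i lives in word pair (2i, 2i+1)
            shift = ((bits >> 1) & 0xF) * 4   # one of the 16 nibbles in that word
            out.append((word, shift))
        return out

    def increment(self, key_hash: int) -> None:
        for word, shift in self.slots(key_hash):
            if (self.table[word] >> shift) & 0xF < 15:  # counters saturate at 15
                self.table[word] += 1 << shift

    def estimate(self, key_hash: int) -> int:
        # The frequency estimate is the minimum of the key's four counters.
        return min((self.table[w] >> s) & 0xF for w, s in self.slots(key_hash))
```

A flat layout, by contrast, would spread the four counters across the whole table, so each lookup can touch up to four cache lines.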

3a5a72c

BenchmarkDotNet v0.14.0, macOS Sonoma 14.5 (23F79) [Darwin 23.5.0]
Apple M2, 1 CPU, 8 logical and 8 physical cores
.NET SDK 9.0.100
  [Host]   : .NET 6.0.30 (6.0.3024.21525), Arm64 RyuJIT AdvSIMD
  .NET 6.0 : .NET 6.0.30 (6.0.3024.21525), Arm64 RyuJIT AdvSIMD
  .NET 8.0 : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), Arm64 RyuJIT AdvSIMD

[Chart: Lfu SketchFrequency column chart]

[Chart: Lfu SketchIncrement column chart]

3a5a72c

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.2314)
Cobalt 100
.NET SDK 9.0.100
  [Host]   : .NET 6.0.36 (6.0.3624.51421), Arm64 RyuJIT AdvSIMD
  .NET 6.0 : .NET 6.0.36 (6.0.3624.51421), Arm64 RyuJIT AdvSIMD
  .NET 8.0 : .NET 8.0.11 (8.0.1124.51707), Arm64 RyuJIT AdvSIMD
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), Arm64 RyuJIT AdvSIMD

[Chart: Lfu SketchIncrement column chart]

[Chart: Lfu SketchFrequency column chart]

TODO:

  • Measure Mac M series CPU
  • Rename AVX benchmarks to Vectorized (assuming it is also worse on M2)
  • Delete 512 (Aggressive Optimization) code path/bench
  • Define constants in the .csproj file for x64 vs Arm64, and conditionally include benchmarks (e.g. don't run the FlatAvx bench on Arm64)
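
The last TODO item amounts to picking the benchmark set per architecture. In the project itself that would presumably be done with MSBuild `DefineConstants` and `#if` blocks; the Python below is only a sketch of the selection logic. The group lists and function names are hypothetical; the benchmark names mirror the Method column in this PR.

```python
import platform

# Hypothetical grouping, based on the PR description: the pinned block
# benchmark has an Arm64 path, FlatAvx and BlockAvxNotPinned do not.
PORTABLE = ["Flat", "Block"]
VECTORIZED = ["BlockAvxPinned"]
X64_ONLY = ["FlatAvx", "BlockAvxNotPinned"]


def normalize_arch(machine: str) -> str:
    """Map platform.machine() style strings onto 'x64', 'arm64' or 'other'."""
    m = machine.lower()
    if m in ("arm64", "aarch64"):
        return "arm64"
    if m in ("x86_64", "amd64", "x64"):
        return "x64"
    return "other"


def select_benchmarks(arch: str) -> list[str]:
    """Return the benchmark variants worth running on the given architecture."""
    selected = list(PORTABLE)
    if arch in ("x64", "arm64"):
        selected += VECTORIZED
    if arch == "x64":
        selected += X64_ONLY
    return selected


if __name__ == "__main__":
    print(select_benchmarks(normalize_arch(platform.machine())))
```

Keeping the x64-only variants out of the Arm64 run avoids rows like FrequencyFlatAvx, which only repeat the FrequencyFlat numbers there.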

bitfaster (Owner, Author) commented

3a5a72c

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.2314)
Unknown processor
.NET SDK 9.0.100
  [Host]   : .NET 6.0.36 (6.0.3624.51421), Arm64 RyuJIT AdvSIMD
  .NET 6.0 : .NET 6.0.36 (6.0.3624.51421), Arm64 RyuJIT AdvSIMD
  .NET 8.0 : .NET 8.0.11 (8.0.1124.51707), Arm64 RyuJIT AdvSIMD
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), Arm64 RyuJIT AdvSIMD
Frequency
Method Runtime Size Mean Error StdDev Ratio Allocated
FrequencyFlat .NET 6.0 32768 38.22 ns 0.018 ns 0.017 ns 1.00 -
FrequencyFlatAvx .NET 6.0 32768 38.18 ns 0.015 ns 0.014 ns 1.00 -
FrequencyBlock .NET 6.0 32768 25.92 ns 0.014 ns 0.013 ns 0.68 -
FrequencyBlockAvxNotPinned .NET 6.0 32768 28.09 ns 0.112 ns 0.105 ns 0.73 -
FrequencyBlockAvxPinned .NET 6.0 32768 22.19 ns 0.003 ns 0.002 ns 0.58 -
FrequencyBlockAvxPinned512 .NET 6.0 32768 24.35 ns 0.054 ns 0.048 ns 0.64 -
FrequencyFlat .NET 8.0 32768 17.28 ns 0.003 ns 0.003 ns 1.00 -
FrequencyFlatAvx .NET 8.0 32768 17.31 ns 0.003 ns 0.002 ns 1.00 -
FrequencyBlock .NET 8.0 32768 20.61 ns 0.006 ns 0.006 ns 1.19 -
FrequencyBlockAvxNotPinned .NET 8.0 32768 19.21 ns 0.009 ns 0.009 ns 1.11 -
FrequencyBlockAvxPinned .NET 8.0 32768 16.05 ns 0.003 ns 0.002 ns 0.93 -
FrequencyBlockAvxPinned512 .NET 8.0 32768 17.80 ns 0.006 ns 0.006 ns 1.03 -
FrequencyFlat .NET 9.0 32768 16.13 ns 0.004 ns 0.003 ns 1.00 -
FrequencyFlatAvx .NET 9.0 32768 16.16 ns 0.004 ns 0.004 ns 1.00 -
FrequencyBlock .NET 9.0 32768 21.07 ns 0.009 ns 0.009 ns 1.31 -
FrequencyBlockAvxNotPinned .NET 9.0 32768 19.99 ns 0.076 ns 0.071 ns 1.24 -
FrequencyBlockAvxPinned .NET 9.0 32768 15.93 ns 0.003 ns 0.002 ns 0.99 -
FrequencyBlockAvxPinned512 .NET 9.0 32768 17.90 ns 0.007 ns 0.006 ns 1.11 -
FrequencyFlat .NET 6.0 524288 73.66 ns 1.467 ns 1.959 ns 1.00 -
FrequencyFlatAvx .NET 6.0 524288 72.76 ns 1.013 ns 0.898 ns 0.99 -
FrequencyBlock .NET 6.0 524288 51.06 ns 0.964 ns 0.805 ns 0.69 -
FrequencyBlockAvxNotPinned .NET 6.0 524288 64.36 ns 1.268 ns 2.589 ns 0.87 -
FrequencyBlockAvxPinned .NET 6.0 524288 51.99 ns 1.003 ns 1.405 ns 0.71 -
FrequencyBlockAvxPinned512 .NET 6.0 524288 58.34 ns 1.129 ns 1.255 ns 0.79 -
FrequencyFlat .NET 8.0 524288 52.27 ns 1.045 ns 2.678 ns 1.00 -
FrequencyFlatAvx .NET 8.0 524288 51.07 ns 0.970 ns 2.622 ns 0.98 -
FrequencyBlock .NET 8.0 524288 47.73 ns 0.947 ns 1.848 ns 0.92 -
FrequencyBlockAvxNotPinned .NET 8.0 524288 50.64 ns 0.742 ns 1.016 ns 0.97 -
FrequencyBlockAvxPinned .NET 8.0 524288 36.97 ns 0.686 ns 0.762 ns 0.71 -
FrequencyBlockAvxPinned512 .NET 8.0 524288 51.10 ns 0.983 ns 1.132 ns 0.98 -
FrequencyFlat .NET 9.0 524288 49.57 ns 0.989 ns 2.481 ns 1.00 -
FrequencyFlatAvx .NET 9.0 524288 47.64 ns 0.944 ns 1.413 ns 0.96 -
FrequencyBlock .NET 9.0 524288 47.22 ns 0.858 ns 0.760 ns 0.96 -
FrequencyBlockAvxNotPinned .NET 9.0 524288 54.75 ns 1.084 ns 1.927 ns 1.11 -
FrequencyBlockAvxPinned .NET 9.0 524288 37.79 ns 0.724 ns 0.678 ns 0.76 -
FrequencyBlockAvxPinned512 .NET 9.0 524288 52.60 ns 1.041 ns 2.495 ns 1.06 -
FrequencyFlat .NET 6.0 8388608 99.03 ns 2.476 ns 7.301 ns 1.01 -
FrequencyFlatAvx .NET 6.0 8388608 97.94 ns 2.441 ns 7.197 ns 0.99 -
FrequencyBlock .NET 6.0 8388608 78.16 ns 2.012 ns 5.934 ns 0.79 -
FrequencyBlockAvxNotPinned .NET 6.0 8388608 112.47 ns 2.363 ns 6.929 ns 1.14 -
FrequencyBlockAvxPinned .NET 6.0 8388608 76.69 ns 1.657 ns 4.886 ns 0.78 -
FrequencyBlockAvxPinned512 .NET 6.0 8388608 109.02 ns 2.202 ns 6.459 ns 1.11 -
FrequencyFlat .NET 8.0 8388608 72.30 ns 1.951 ns 5.753 ns 1.01 -
FrequencyFlatAvx .NET 8.0 8388608 71.47 ns 1.876 ns 5.503 ns 1.00 -
FrequencyBlock .NET 8.0 8388608 72.92 ns 2.140 ns 6.311 ns 1.02 -
FrequencyBlockAvxNotPinned .NET 8.0 8388608 113.58 ns 3.929 ns 11.585 ns 1.58 -
FrequencyBlockAvxPinned .NET 8.0 8388608 63.84 ns 1.853 ns 5.464 ns 0.89 -
FrequencyBlockAvxPinned512 .NET 8.0 8388608 100.34 ns 2.204 ns 6.499 ns 1.40 -
FrequencyFlat .NET 9.0 8388608 69.13 ns 1.789 ns 5.276 ns 1.01 -
FrequencyFlatAvx .NET 9.0 8388608 70.01 ns 1.976 ns 5.764 ns 1.02 -
FrequencyBlock .NET 9.0 8388608 70.88 ns 1.549 ns 4.568 ns 1.03 -
FrequencyBlockAvxNotPinned .NET 9.0 8388608 102.95 ns 2.057 ns 5.969 ns 1.50 -
FrequencyBlockAvxPinned .NET 9.0 8388608 63.16 ns 1.797 ns 5.300 ns 0.92 -
FrequencyBlockAvxPinned512 .NET 9.0 8388608 102.38 ns 2.200 ns 6.488 ns 1.49 -
FrequencyFlat .NET 6.0 134217728 208.48 ns 3.770 ns 3.526 ns 1.00 -
FrequencyFlatAvx .NET 6.0 134217728 210.55 ns 4.146 ns 4.774 ns 1.01 -
FrequencyBlock .NET 6.0 134217728 180.17 ns 3.503 ns 4.555 ns 0.86 -
FrequencyBlockAvxNotPinned .NET 6.0 134217728 193.68 ns 3.332 ns 2.954 ns 0.93 -
FrequencyBlockAvxPinned .NET 6.0 134217728 171.13 ns 3.415 ns 5.706 ns 0.82 -
FrequencyBlockAvxPinned512 .NET 6.0 134217728 188.50 ns 2.899 ns 2.570 ns 0.90 -
FrequencyFlat .NET 8.0 134217728 184.16 ns 1.339 ns 1.045 ns 1.00 -
FrequencyFlatAvx .NET 8.0 134217728 185.28 ns 3.439 ns 3.048 ns 1.01 -
FrequencyBlock .NET 8.0 134217728 171.66 ns 3.417 ns 5.008 ns 0.93 -
FrequencyBlockAvxNotPinned .NET 8.0 134217728 183.36 ns 1.997 ns 1.668 ns 1.00 -
FrequencyBlockAvxPinned .NET 8.0 134217728 165.15 ns 3.192 ns 3.278 ns 0.90 -
FrequencyBlockAvxPinned512 .NET 8.0 134217728 181.27 ns 1.955 ns 1.920 ns 0.98 -
FrequencyFlat .NET 9.0 134217728 183.10 ns 3.178 ns 2.972 ns 1.00 -
FrequencyFlatAvx .NET 9.0 134217728 181.88 ns 1.724 ns 1.440 ns 0.99 -
FrequencyBlock .NET 9.0 134217728 173.77 ns 3.460 ns 4.119 ns 0.95 -
FrequencyBlockAvxNotPinned .NET 9.0 134217728 184.07 ns 3.414 ns 3.353 ns 1.01 -
FrequencyBlockAvxPinned .NET 9.0 134217728 163.86 ns 3.226 ns 4.627 ns 0.90 -
FrequencyBlockAvxPinned512 .NET 9.0 134217728 187.41 ns 3.671 ns 3.770 ns 1.02 -
Increment
Method Runtime Size Mean Error StdDev Ratio Allocated
IncFlat .NET 6.0 32768 22.84 ns 0.022 ns 0.021 ns 1.00 -
IncFlatAvx .NET 6.0 32768 22.84 ns 0.004 ns 0.004 ns 1.00 -
IncBlock .NET 6.0 32768 20.03 ns 0.007 ns 0.007 ns 0.88 -
IncBlockAvxNotPinned .NET 6.0 32768 19.37 ns 0.007 ns 0.007 ns 0.85 -
IncBlockAvxPinned .NET 6.0 32768 13.79 ns 0.004 ns 0.004 ns 0.60 -
IncBlockAvxPinned512 .NET 6.0 32768 13.69 ns 0.002 ns 0.001 ns 0.60 -
IncFlat .NET 8.0 32768 12.13 ns 0.014 ns 0.013 ns 1.00 -
IncFlatAvx .NET 8.0 32768 12.10 ns 0.009 ns 0.008 ns 1.00 -
IncBlock .NET 8.0 32768 17.01 ns 0.021 ns 0.020 ns 1.40 -
IncBlockAvxNotPinned .NET 8.0 32768 16.86 ns 0.017 ns 0.016 ns 1.39 -
IncBlockAvxPinned .NET 8.0 32768 11.34 ns 0.004 ns 0.003 ns 0.93 -
IncBlockAvxPinned512 .NET 8.0 32768 12.38 ns 0.002 ns 0.002 ns 1.02 -
IncFlat .NET 9.0 32768 12.05 ns 0.013 ns 0.012 ns 1.00 -
IncFlatAvx .NET 9.0 32768 11.99 ns 0.002 ns 0.002 ns 1.00 -
IncBlock .NET 9.0 32768 17.20 ns 0.030 ns 0.028 ns 1.43 -
IncBlockAvxNotPinned .NET 9.0 32768 17.11 ns 0.034 ns 0.032 ns 1.42 -
IncBlockAvxPinned .NET 9.0 32768 11.50 ns 0.001 ns 0.001 ns 0.95 -
IncBlockAvxPinned512 .NET 9.0 32768 12.95 ns 0.019 ns 0.016 ns 1.07 -
IncFlat .NET 6.0 524288 57.27 ns 0.982 ns 1.344 ns 1.00 -
IncFlatAvx .NET 6.0 524288 56.64 ns 1.013 ns 1.664 ns 0.99 -
IncBlock .NET 6.0 524288 45.69 ns 0.910 ns 1.246 ns 0.80 -
IncBlockAvxNotPinned .NET 6.0 524288 48.16 ns 0.858 ns 1.203 ns 0.84 -
IncBlockAvxPinned .NET 6.0 524288 29.27 ns 0.575 ns 0.591 ns 0.51 -
IncBlockAvxPinned512 .NET 6.0 524288 28.65 ns 0.324 ns 0.253 ns 0.50 -
IncFlat .NET 8.0 524288 34.49 ns 0.688 ns 0.845 ns 1.00 -
IncFlatAvx .NET 8.0 524288 34.70 ns 0.576 ns 0.640 ns 1.01 -
IncBlock .NET 8.0 524288 41.47 ns 0.761 ns 0.712 ns 1.20 -
IncBlockAvxNotPinned .NET 8.0 524288 40.71 ns 0.643 ns 0.537 ns 1.18 -
IncBlockAvxPinned .NET 8.0 524288 23.68 ns 0.453 ns 0.354 ns 0.69 -
IncBlockAvxPinned512 .NET 8.0 524288 27.18 ns 0.538 ns 0.598 ns 0.79 -
IncFlat .NET 9.0 524288 34.29 ns 0.291 ns 0.258 ns 1.00 -
IncFlatAvx .NET 9.0 524288 31.31 ns 0.617 ns 0.844 ns 0.91 -
IncBlock .NET 9.0 524288 41.22 ns 0.641 ns 0.535 ns 1.20 -
IncBlockAvxNotPinned .NET 9.0 524288 40.52 ns 0.723 ns 0.641 ns 1.18 -
IncBlockAvxPinned .NET 9.0 524288 25.17 ns 0.192 ns 0.160 ns 0.73 -
IncBlockAvxPinned512 .NET 9.0 524288 27.31 ns 0.533 ns 0.920 ns 0.80 -
IncFlat .NET 6.0 8388608 136.95 ns 6.262 ns 18.464 ns 1.02 -
IncFlatAvx .NET 6.0 8388608 135.67 ns 6.962 ns 20.526 ns 1.01 -
IncBlock .NET 6.0 8388608 113.30 ns 5.736 ns 16.914 ns 0.85 -
IncBlockAvxNotPinned .NET 6.0 8388608 91.95 ns 7.767 ns 22.901 ns 0.69 -
IncBlockAvxPinned .NET 6.0 8388608 52.28 ns 5.102 ns 15.044 ns 0.39 -
IncBlockAvxPinned512 .NET 6.0 8388608 38.78 ns 0.961 ns 2.727 ns 0.29 -
IncFlat .NET 8.0 8388608 47.20 ns 4.437 ns 12.944 ns 1.06 -
IncFlatAvx .NET 8.0 8388608 40.78 ns 1.239 ns 3.654 ns 0.92 -
IncBlock .NET 8.0 8388608 62.09 ns 1.383 ns 4.077 ns 1.40 -
IncBlockAvxNotPinned .NET 8.0 8388608 61.79 ns 1.685 ns 4.941 ns 1.39 -
IncBlockAvxPinned .NET 8.0 8388608 32.76 ns 0.618 ns 0.661 ns 0.74 -
IncBlockAvxPinned512 .NET 8.0 8388608 33.95 ns 0.658 ns 1.220 ns 0.76 -
IncFlat .NET 9.0 8388608 37.49 ns 0.995 ns 2.902 ns 1.01 -
IncFlatAvx .NET 9.0 8388608 36.86 ns 0.849 ns 2.491 ns 0.99 -
IncBlock .NET 9.0 8388608 63.20 ns 1.954 ns 5.761 ns 1.70 -
IncBlockAvxNotPinned .NET 9.0 8388608 58.93 ns 1.289 ns 3.781 ns 1.58 -
IncBlockAvxPinned .NET 9.0 8388608 32.77 ns 0.628 ns 0.839 ns 0.88 -
IncBlockAvxPinned512 .NET 9.0 8388608 33.70 ns 0.674 ns 0.692 ns 0.90 -
IncFlat .NET 6.0 134217728 188.73 ns 2.793 ns 2.181 ns 1.00 -
IncFlatAvx .NET 6.0 134217728 190.12 ns 3.407 ns 3.187 ns 1.01 -
IncBlock .NET 6.0 134217728 167.86 ns 3.340 ns 5.488 ns 0.89 -
IncBlockAvxNotPinned .NET 6.0 134217728 166.53 ns 3.183 ns 3.269 ns 0.88 -
IncBlockAvxPinned .NET 6.0 134217728 79.92 ns 1.571 ns 2.711 ns 0.42 -
IncBlockAvxPinned512 .NET 6.0 134217728 80.89 ns 1.603 ns 3.519 ns 0.43 -
IncFlat .NET 8.0 134217728 96.57 ns 1.165 ns 1.033 ns 1.00 -
IncFlatAvx .NET 8.0 134217728 95.91 ns 1.268 ns 1.059 ns 0.99 -
IncBlock .NET 8.0 134217728 145.32 ns 2.880 ns 5.817 ns 1.50 -
IncBlockAvxNotPinned .NET 8.0 134217728 142.88 ns 2.847 ns 5.206 ns 1.48 -
IncBlockAvxPinned .NET 8.0 134217728 80.75 ns 1.580 ns 2.215 ns 0.84 -
IncBlockAvxPinned512 .NET 8.0 134217728 81.10 ns 1.617 ns 2.657 ns 0.84 -
IncFlat .NET 9.0 134217728 96.29 ns 1.695 ns 1.502 ns 1.00 -
IncFlatAvx .NET 9.0 134217728 96.15 ns 1.899 ns 2.261 ns 1.00 -
IncBlock .NET 9.0 134217728 146.08 ns 2.899 ns 5.988 ns 1.52 -
IncBlockAvxNotPinned .NET 9.0 134217728 145.51 ns 2.898 ns 5.721 ns 1.51 -
IncBlockAvxPinned .NET 9.0 134217728 79.89 ns 1.597 ns 2.239 ns 0.83 -
IncBlockAvxPinned512 .NET 9.0 134217728 79.17 ns 1.563 ns 2.974 ns 0.82 -
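
As a reading aid for the Ratio column: it is approximately each method's Mean divided by the Mean of the baseline (the Flat variant) for the same runtime and size. The helper below is a hypothetical illustration, not BenchmarkDotNet code, using values from the first Frequency rows above. On noisy rows the reported Ratio can differ slightly from this simple quotient.

```python
def ratio(mean_ns: float, baseline_ns: float) -> float:
    """Approximate the Ratio column: method mean over baseline mean, 2 decimals."""
    return round(mean_ns / baseline_ns, 2)


# .NET 6.0, size 32768: FrequencyFlat is the baseline at 38.22 ns.
baseline = 38.22
for name, mean in [("FrequencyBlock", 25.92), ("FrequencyBlockAvxPinned", 22.19)]:
    print(f"{name}: {ratio(mean, baseline):.2f}")
```

A value below 1.00 means the method is faster than the flat baseline on that runtime and size.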

coveralls commented Nov 26, 2024

Coverage Status

coverage: 99.228% (+0.01%) from 99.218%
when pulling 7780042 on users/alexpeck/neon
into d8ac6f4 on main.

bitfaster (Owner, Author) commented Nov 26, 2024

BenchmarkDotNet v0.14.0, macOS Sonoma 14.5 (23F79) [Darwin 23.5.0]
Apple M2, 1 CPU, 8 logical and 8 physical cores
.NET SDK 9.0.100
  [Host]   : .NET 6.0.30 (6.0.3024.21525), Arm64 RyuJIT AdvSIMD
  .NET 6.0 : .NET 6.0.30 (6.0.3024.21525), Arm64 RyuJIT AdvSIMD
  .NET 8.0 : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), Arm64 RyuJIT AdvSIMD

[Chart: Lfu SketchFrequency column chart]

[Chart: Lfu SketchIncrement column chart]

Frequency
Method Runtime Size Mean Error StdDev Ratio Allocated
FrequencyFlat .NET 6.0 32768 22.059 ns 0.0827 ns 0.0690 ns 1.00 -
FrequencyFlatAvx .NET 6.0 32768 22.101 ns 0.1068 ns 0.0892 ns 1.00 -
FrequencyBlock .NET 6.0 32768 19.297 ns 0.0380 ns 0.0337 ns 0.87 -
FrequencyBlockAvxNotPinned .NET 6.0 32768 15.872 ns 0.1189 ns 0.1112 ns 0.72 -
FrequencyBlockAvxPinned .NET 6.0 32768 10.762 ns 0.0353 ns 0.0330 ns 0.49 -
FrequencyBlockAvxPinned512 .NET 6.0 32768 10.983 ns 0.0389 ns 0.0363 ns 0.50 -
FrequencyFlat .NET 8.0 32768 11.356 ns 0.0285 ns 0.0253 ns 1.00 -
FrequencyFlatAvx .NET 8.0 32768 11.365 ns 0.0395 ns 0.0351 ns 1.00 -
FrequencyBlock .NET 8.0 32768 13.604 ns 0.0480 ns 0.0449 ns 1.20 -
FrequencyBlockAvxNotPinned .NET 8.0 32768 11.175 ns 0.0201 ns 0.0168 ns 0.98 -
FrequencyBlockAvxPinned .NET 8.0 32768 8.508 ns 0.0287 ns 0.0254 ns 0.75 -
FrequencyBlockAvxPinned512 .NET 8.0 32768 9.308 ns 0.0340 ns 0.0318 ns 0.82 -
FrequencyFlat .NET 9.0 32768 10.644 ns 0.0095 ns 0.0074 ns 1.00 -
FrequencyFlatAvx .NET 9.0 32768 10.647 ns 0.0095 ns 0.0079 ns 1.00 -
FrequencyBlock .NET 9.0 32768 13.655 ns 0.0414 ns 0.0387 ns 1.28 -
FrequencyBlockAvxNotPinned .NET 9.0 32768 11.125 ns 0.0442 ns 0.0413 ns 1.05 -
FrequencyBlockAvxPinned .NET 9.0 32768 8.428 ns 0.0317 ns 0.0281 ns 0.79 -
FrequencyBlockAvxPinned512 .NET 9.0 32768 9.236 ns 0.0508 ns 0.0424 ns 0.87 -
FrequencyFlat .NET 6.0 524288 22.093 ns 0.0817 ns 0.0638 ns 1.00 -
FrequencyFlatAvx .NET 6.0 524288 22.083 ns 0.0632 ns 0.0560 ns 1.00 -
FrequencyBlock .NET 6.0 524288 20.046 ns 0.0693 ns 0.0579 ns 0.91 -
FrequencyBlockAvxNotPinned .NET 6.0 524288 16.548 ns 0.0591 ns 0.0524 ns 0.75 -
FrequencyBlockAvxPinned .NET 6.0 524288 11.156 ns 0.1230 ns 0.1151 ns 0.50 -
FrequencyBlockAvxPinned512 .NET 6.0 524288 11.491 ns 0.0742 ns 0.0620 ns 0.52 -
FrequencyFlat .NET 8.0 524288 11.818 ns 0.0295 ns 0.0276 ns 1.00 -
FrequencyFlatAvx .NET 8.0 524288 11.826 ns 0.0249 ns 0.0220 ns 1.00 -
FrequencyBlock .NET 8.0 524288 14.051 ns 0.0393 ns 0.0367 ns 1.19 -
FrequencyBlockAvxNotPinned .NET 8.0 524288 12.193 ns 0.0395 ns 0.0330 ns 1.03 -
FrequencyBlockAvxPinned .NET 8.0 524288 8.796 ns 0.1091 ns 0.1020 ns 0.74 -
FrequencyBlockAvxPinned512 .NET 8.0 524288 9.950 ns 0.0604 ns 0.0565 ns 0.84 -
FrequencyFlat .NET 9.0 524288 11.075 ns 0.0349 ns 0.0309 ns 1.00 -
FrequencyFlatAvx .NET 9.0 524288 11.081 ns 0.0583 ns 0.0487 ns 1.00 -
FrequencyBlock .NET 9.0 524288 14.135 ns 0.0504 ns 0.0447 ns 1.28 -
FrequencyBlockAvxNotPinned .NET 9.0 524288 11.608 ns 0.0325 ns 0.0304 ns 1.05 -
FrequencyBlockAvxPinned .NET 9.0 524288 8.663 ns 0.0476 ns 0.0422 ns 0.78 -
FrequencyBlockAvxPinned512 .NET 9.0 524288 9.672 ns 0.0228 ns 0.0190 ns 0.87 -
FrequencyFlat .NET 6.0 8388608 103.039 ns 0.3225 ns 0.2859 ns 1.00 -
FrequencyFlatAvx .NET 6.0 8388608 103.361 ns 0.5517 ns 0.4607 ns 1.00 -
FrequencyBlock .NET 6.0 8388608 61.787 ns 0.1551 ns 0.1375 ns 0.60 -
FrequencyBlockAvxNotPinned .NET 6.0 8388608 66.924 ns 0.0626 ns 0.0489 ns 0.65 -
FrequencyBlockAvxPinned .NET 6.0 8388608 42.063 ns 0.0124 ns 0.0104 ns 0.41 -
FrequencyBlockAvxPinned512 .NET 6.0 8388608 52.921 ns 0.1088 ns 0.0965 ns 0.51 -
FrequencyFlat .NET 8.0 8388608 58.984 ns 0.1613 ns 0.1260 ns 1.00 -
FrequencyFlatAvx .NET 8.0 8388608 58.865 ns 0.2277 ns 0.2019 ns 1.00 -
FrequencyBlock .NET 8.0 8388608 57.932 ns 0.0222 ns 0.0174 ns 0.98 -
FrequencyBlockAvxNotPinned .NET 8.0 8388608 51.801 ns 0.0230 ns 0.0192 ns 0.88 -
FrequencyBlockAvxPinned .NET 8.0 8388608 33.021 ns 0.0227 ns 0.0189 ns 0.56 -
FrequencyBlockAvxPinned512 .NET 8.0 8388608 44.814 ns 0.0409 ns 0.0319 ns 0.76 -
FrequencyFlat .NET 9.0 8388608 52.521 ns 0.1665 ns 0.1558 ns 1.00 -
FrequencyFlatAvx .NET 9.0 8388608 56.248 ns 0.8676 ns 0.7244 ns 1.07 -
FrequencyBlock .NET 9.0 8388608 58.269 ns 0.1008 ns 0.0893 ns 1.11 -
FrequencyBlockAvxNotPinned .NET 9.0 8388608 50.008 ns 0.0206 ns 0.0161 ns 0.95 -
FrequencyBlockAvxPinned .NET 9.0 8388608 29.572 ns 0.0735 ns 0.0652 ns 0.56 -
FrequencyBlockAvxPinned512 .NET 9.0 8388608 43.279 ns 0.0349 ns 0.0292 ns 0.82 -
FrequencyFlat .NET 6.0 134217728 120.381 ns 0.4951 ns 0.4631 ns 1.00 -
FrequencyFlatAvx .NET 6.0 134217728 120.602 ns 0.8782 ns 0.8215 ns 1.00 -
FrequencyBlock .NET 6.0 134217728 72.510 ns 0.0615 ns 0.0513 ns 0.60 -
FrequencyBlockAvxNotPinned .NET 6.0 134217728 71.019 ns 0.0212 ns 0.0188 ns 0.59 -
FrequencyBlockAvxPinned .NET 6.0 134217728 47.984 ns 0.0424 ns 0.0376 ns 0.40 -
FrequencyBlockAvxPinned512 .NET 6.0 134217728 56.089 ns 0.0887 ns 0.0741 ns 0.47 -
FrequencyFlat .NET 8.0 134217728 66.739 ns 0.0646 ns 0.0573 ns 1.00 -
FrequencyFlatAvx .NET 8.0 134217728 66.757 ns 0.0824 ns 0.0731 ns 1.00 -
FrequencyBlock .NET 8.0 134217728 68.743 ns 0.0116 ns 0.0091 ns 1.03 -
FrequencyBlockAvxNotPinned .NET 8.0 134217728 54.986 ns 0.0419 ns 0.0350 ns 0.82 -
FrequencyBlockAvxPinned .NET 8.0 134217728 37.282 ns 0.0212 ns 0.0177 ns 0.56 -
FrequencyBlockAvxPinned512 .NET 8.0 134217728 47.147 ns 0.0315 ns 0.0263 ns 0.71 -
FrequencyFlat .NET 9.0 134217728 60.642 ns 0.0666 ns 0.0556 ns 1.00 -
FrequencyFlatAvx .NET 9.0 134217728 60.637 ns 0.0622 ns 0.0551 ns 1.00 -
FrequencyBlock .NET 9.0 134217728 68.890 ns 0.0244 ns 0.0190 ns 1.14 -
FrequencyBlockAvxNotPinned .NET 9.0 134217728 55.173 ns 0.0366 ns 0.0306 ns 0.91 -
FrequencyBlockAvxPinned .NET 9.0 134217728 37.273 ns 0.0219 ns 0.0183 ns 0.61 -
FrequencyBlockAvxPinned512 .NET 9.0 134217728 47.174 ns 0.0407 ns 0.0340 ns 0.78 -
Increment
Method Runtime Size Mean Error StdDev Ratio Allocated
IncFlat .NET 6.0 32768 13.873 ns 0.0028 ns 0.0022 ns 1.00 -
IncFlatAvx .NET 6.0 32768 13.867 ns 0.0020 ns 0.0016 ns 1.00 -
IncBlock .NET 6.0 32768 13.710 ns 0.0127 ns 0.0099 ns 0.99 -
IncBlockAvxNotPinned .NET 6.0 32768 13.457 ns 0.0370 ns 0.0346 ns 0.97 -
IncBlockAvxPinned .NET 6.0 32768 7.138 ns 0.0905 ns 0.0847 ns 0.51 -
IncBlockAvxPinned512 .NET 6.0 32768 7.073 ns 0.0850 ns 0.0795 ns 0.51 -
IncFlat .NET 8.0 32768 6.981 ns 0.0365 ns 0.0305 ns 1.00 -
IncFlatAvx .NET 8.0 32768 7.022 ns 0.0877 ns 0.0820 ns 1.01 -
IncBlock .NET 8.0 32768 10.222 ns 0.0205 ns 0.0171 ns 1.46 -
IncBlockAvxNotPinned .NET 8.0 32768 10.296 ns 0.0136 ns 0.0114 ns 1.47 -
IncBlockAvxPinned .NET 8.0 32768 6.027 ns 0.0102 ns 0.0079 ns 0.86 -
IncBlockAvxPinned512 .NET 8.0 32768 7.104 ns 0.0157 ns 0.0139 ns 1.02 -
IncFlat .NET 9.0 32768 6.939 ns 0.0680 ns 0.0603 ns 1.00 -
IncFlatAvx .NET 9.0 32768 6.929 ns 0.0503 ns 0.0470 ns 1.00 -
IncBlock .NET 9.0 32768 10.248 ns 0.0233 ns 0.0195 ns 1.48 -
IncBlockAvxNotPinned .NET 9.0 32768 10.319 ns 0.0173 ns 0.0145 ns 1.49 -
IncBlockAvxPinned .NET 9.0 32768 5.870 ns 0.0102 ns 0.0095 ns 0.85 -
IncBlockAvxPinned512 .NET 9.0 32768 7.057 ns 0.0165 ns 0.0146 ns 1.02 -
IncFlat .NET 6.0 524288 14.624 ns 0.0767 ns 0.0717 ns 1.00 -
IncFlatAvx .NET 6.0 524288 14.593 ns 0.0971 ns 0.0908 ns 1.00 -
IncBlock .NET 6.0 524288 16.352 ns 0.0878 ns 0.0686 ns 1.12 -
IncBlockAvxNotPinned .NET 6.0 524288 16.141 ns 0.2599 ns 0.2432 ns 1.10 -
IncBlockAvxPinned .NET 6.0 524288 7.353 ns 0.0637 ns 0.0596 ns 0.50 -
IncBlockAvxPinned512 .NET 6.0 524288 7.365 ns 0.0535 ns 0.0501 ns 0.50 -
IncFlat .NET 8.0 524288 7.613 ns 0.0582 ns 0.0545 ns 1.00 -
IncFlatAvx .NET 8.0 524288 7.628 ns 0.0625 ns 0.0585 ns 1.00 -
IncBlock .NET 8.0 524288 13.015 ns 0.0431 ns 0.0403 ns 1.71 -
IncBlockAvxNotPinned .NET 8.0 524288 13.110 ns 0.2030 ns 0.1799 ns 1.72 -
IncBlockAvxPinned .NET 8.0 524288 6.100 ns 0.0218 ns 0.0204 ns 0.80 -
IncBlockAvxPinned512 .NET 8.0 524288 7.210 ns 0.0978 ns 0.0915 ns 0.95 -
IncFlat .NET 9.0 524288 7.479 ns 0.0195 ns 0.0163 ns 1.00 -
IncFlatAvx .NET 9.0 524288 7.496 ns 0.0138 ns 0.0123 ns 1.00 -
IncBlock .NET 9.0 524288 12.706 ns 0.0715 ns 0.0669 ns 1.70 -
IncBlockAvxNotPinned .NET 9.0 524288 12.810 ns 0.1111 ns 0.1039 ns 1.71 -
IncBlockAvxPinned .NET 9.0 524288 5.909 ns 0.0572 ns 0.0535 ns 0.79 -
IncBlockAvxPinned512 .NET 9.0 524288 7.147 ns 0.0158 ns 0.0132 ns 0.96 -
IncFlat .NET 6.0 8388608 60.861 ns 0.2889 ns 0.2561 ns 1.00 -
IncFlatAvx .NET 6.0 8388608 60.658 ns 0.0395 ns 0.0350 ns 1.00 -
IncBlock .NET 6.0 8388608 52.722 ns 0.0395 ns 0.0330 ns 0.87 -
IncBlockAvxNotPinned .NET 6.0 8388608 52.394 ns 0.0464 ns 0.0411 ns 0.86 -
IncBlockAvxPinned .NET 6.0 8388608 27.460 ns 0.0598 ns 0.0560 ns 0.45 -
IncBlockAvxPinned512 .NET 6.0 8388608 27.413 ns 0.0733 ns 0.0686 ns 0.45 -
IncFlat .NET 8.0 8388608 33.872 ns 0.0271 ns 0.0212 ns 1.00 -
IncFlatAvx .NET 8.0 8388608 33.870 ns 0.0298 ns 0.0249 ns 1.00 -
IncBlock .NET 8.0 8388608 38.457 ns 0.0526 ns 0.0439 ns 1.14 -
IncBlockAvxNotPinned .NET 8.0 8388608 38.444 ns 0.0388 ns 0.0324 ns 1.13 -
IncBlockAvxPinned .NET 8.0 8388608 25.408 ns 0.0687 ns 0.0609 ns 0.75 -
IncBlockAvxPinned512 .NET 8.0 8388608 26.769 ns 0.0354 ns 0.0296 ns 0.79 -
IncFlat .NET 9.0 8388608 33.790 ns 0.0251 ns 0.0209 ns 1.00 -
IncFlatAvx .NET 9.0 8388608 33.953 ns 0.1603 ns 0.1499 ns 1.00 -
IncBlock .NET 9.0 8388608 37.733 ns 0.0348 ns 0.0308 ns 1.12 -
IncBlockAvxNotPinned .NET 9.0 8388608 38.233 ns 0.0137 ns 0.0114 ns 1.13 -
IncBlockAvxPinned .NET 9.0 8388608 21.088 ns 0.0823 ns 0.0730 ns 0.62 -
IncBlockAvxPinned512 .NET 9.0 8388608 26.732 ns 0.0146 ns 0.0114 ns 0.79 -
IncFlat .NET 6.0 134217728 68.095 ns 0.0825 ns 0.0731 ns 1.00 -
IncFlatAvx .NET 6.0 134217728 68.141 ns 0.0526 ns 0.0411 ns 1.00 -
IncBlock .NET 6.0 134217728 63.127 ns 0.0191 ns 0.0149 ns 0.93 -
IncBlockAvxNotPinned .NET 6.0 134217728 62.820 ns 0.0723 ns 0.0604 ns 0.92 -
IncBlockAvxPinned .NET 6.0 134217728 31.550 ns 0.0290 ns 0.0227 ns 0.46 -
IncBlockAvxPinned512 .NET 6.0 134217728 31.612 ns 0.1224 ns 0.1145 ns 0.46 -
IncFlat .NET 8.0 134217728 39.336 ns 0.0513 ns 0.0480 ns 1.00 -
IncFlatAvx .NET 8.0 134217728 39.342 ns 0.0894 ns 0.0747 ns 1.00 -
IncBlock .NET 8.0 134217728 44.146 ns 0.0190 ns 0.0148 ns 1.12 -
IncBlockAvxNotPinned .NET 8.0 134217728 44.200 ns 0.1783 ns 0.1581 ns 1.12 -
IncBlockAvxPinned .NET 8.0 134217728 29.790 ns 0.5059 ns 0.4484 ns 0.76 -
IncBlockAvxPinned512 .NET 8.0 134217728 30.841 ns 0.0408 ns 0.0362 ns 0.78 -
IncFlat .NET 9.0 134217728 39.018 ns 0.0363 ns 0.0303 ns 1.00 -
IncFlatAvx .NET 9.0 134217728 39.029 ns 0.0241 ns 0.0201 ns 1.00 -
IncBlock .NET 9.0 134217728 43.988 ns 0.0160 ns 0.0125 ns 1.13 -
IncBlockAvxNotPinned .NET 9.0 134217728 43.979 ns 0.0466 ns 0.0364 ns 1.13 -
IncBlockAvxPinned .NET 9.0 134217728 27.975 ns 0.0190 ns 0.0149 ns 0.72 -
IncBlockAvxPinned512 .NET 9.0 134217728 30.880 ns 0.0324 ns 0.0253 ns 0.79 -

@bitfaster bitfaster marked this pull request as ready for review November 29, 2024 00:10
@bitfaster bitfaster merged commit 5669d38 into main Dec 1, 2024
13 checks passed
@bitfaster bitfaster deleted the users/alexpeck/neon branch December 1, 2024 00:45
@bitfaster bitfaster mentioned this pull request Jan 9, 2025