@PrzemyslawKlys commented Aug 17, 2025

I saw your comment in my PR and thought I would do some testing. What you see below is:

  • the results-original folder is what we have with the original dsijson

BenchmarkDotNet v0.15.2, Windows 11 (10.0.26100.4652/24H2/2024Update/HudsonValley)
Unknown processor
.NET SDK 9.0.304
  [Host]     : .NET 8.0.19 (8.0.1925.36514), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  DefaultJob : .NET 8.0.19 (8.0.1925.36514), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI


| Method            | Mean        | Error     | StdDev      | Median      | Gen0    | Gen1    | Gen2    | Allocated |
|-------------------|------------:|----------:|------------:|------------:|--------:|--------:|--------:|----------:|
| STJ_String_Small  | 279.8 ns    | 4.15 ns   | 3.88 ns     | 279.0 ns    | 0.0448  | -       | -       | 752 B     |
| STJ_Bytes_Small   | 342.9 ns    | 6.37 ns   | 5.96 ns     | 342.5 ns    | 0.1063  | 0.0005  | -       | 1,784 B   |
| STJ_String_Medium | 631.9 ns    | 28.32 ns  | 83.50 ns    | 649.3 ns    | 0.2584  | 0.0038  | -       | 4,336 B   |
| STJ_Bytes_Medium  | 777.3 ns    | 16.94 ns  | 48.33 ns    | 763.0 ns    | 0.7496  | 0.0334  | -       | 12,536 B  |
| STJ_String_Large  | 4,043.0 ns  | 66.74 ns  | 52.11 ns    | 4,041.0 ns  | 3.9215  | 0.7782  | -       | 65,776 B  |
| STJ_Bytes_Large   | 36,013.7 ns | 743.78 ns | 2,193.05 ns | 35,435.2 ns | 41.6565 | 41.6565 | 41.6565 | 196,884 B |
  • the results-improvements folder is what we have for the 2nd version

BenchmarkDotNet v0.15.2, Windows 11 (10.0.26100.4652/24H2/2024Update/HudsonValley)
Unknown processor
.NET SDK 9.0.304
  [Host]     : .NET 8.0.19 (8.0.1925.36514), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  DefaultJob : .NET 8.0.19 (8.0.1925.36514), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI


| Method            | Mean       | Error     | StdDev    | Gen0   | Gen1   | Allocated |
|-------------------|-----------:|----------:|----------:|-------:|-------:|----------:|
| STJ_String_Small  | 287.1 ns   | 3.48 ns   | 3.25 ns   | 0.0448 | -      | 752 B     |
| STJ_Bytes_Small   | 247.2 ns   | 2.79 ns   | 2.18 ns   | 0.0448 | -      | 752 B     |
| STJ_String_Medium | 499.8 ns   | 8.72 ns   | 6.81 ns   | 0.2584 | 0.0038 | 4,336 B   |
| STJ_Bytes_Medium  | 512.3 ns   | 13.51 ns  | 38.77 ns  | 0.2589 | 0.0038 | 4,336 B   |
| STJ_String_Large  | 4,610.5 ns | 114.34 ns | 337.12 ns | 3.9215 | 0.7782 | 65,776 B  |
| STJ_Bytes_Large   | 3,421.0 ns | 72.74 ns  | 208.71 ns | 3.9215 | 0.7820 | 65,776 B  |

This comes with some optimizations, but also some issues.

Benchmark Comparison – Original vs New JSON Parser

📊 Results

| Scenario             | Original (mean) | New (mean) | Change                   | Allocations           |
|----------------------|----------------:|-----------:|--------------------------|-----------------------|
| Small JSON (string)  | ~280 ns         | ~287 ns    | ≈ same (slightly slower) | 752 B → 752 B         |
| Small JSON (bytes)   | ~343 ns         | ~247 ns    | ≈ 28% faster             | 1,784 B → 752 B       |
| Medium JSON (string) | ~632 ns         | ~500 ns    | ≈ 20% faster             | 4,336 B → 4,336 B     |
| Medium JSON (bytes)  | ~777 ns         | ~512 ns    | ≈ 34% faster             | 12,536 B → 4,336 B    |
| Large JSON (string)  | ~4,043 ns       | ~4,610 ns  | ≈ 14% slower             | 65,776 B → 65,776 B   |
| Large JSON (bytes)   | ~36,014 ns      | ~3,421 ns  | ≈ 10× faster             | 196,884 B → 65,776 B  |

✅ Improvements

  • Byte input path is dramatically faster, especially for large payloads (≈10× speedup).
  • Allocations for byte input dropped significantly (no extra intermediate buffers).
  • Medium string path also improved (~20% faster).

⚠️ Regressions

  • Large string path is slower (~14%) since the focus shifted to optimizing UTF-8 byte input.

TL;DR

  • Small JSON: essentially unchanged.
  • Medium JSON: ~20–34% faster.
  • Large JSON: bytes ~10× faster, strings ~14% slower.

If the hot path is byte parsing (LDAP blobs, AD dumps, LAPS JSON), the new implementation is clearly better overall.
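The byte-path speedup above comes down to parsing UTF-8 input directly instead of round-tripping through a UTF-16 string. A minimal sketch of that idea, assuming plain System.Text.Json (the `Computer` record is illustrative, not a type from this repo):

```csharp
using System;
using System.Text;
using System.Text.Json;

public record Computer(string Name, string Dns);

public static class Demo
{
    public static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes("{\"Name\":\"PC01\",\"Dns\":\"pc01.contoso.com\"}");

        // Slow path: decode the bytes into a UTF-16 string first, then parse.
        // The intermediate string roughly doubles the transient allocation.
        Computer? viaString = JsonSerializer.Deserialize<Computer>(Encoding.UTF8.GetString(utf8));

        // Fast path: parse the UTF-8 bytes directly; no intermediate string.
        Computer? viaBytes = JsonSerializer.Deserialize<Computer>(utf8.AsSpan());

        Console.WriteLine(viaBytes!.Name); // PC01
    }
}
```

Both calls produce the same object; only the fast path scales well with payload size, which matches the Large-bytes row in the table.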

…lable types

* Changed return type of `DeserializeLenient<T>` methods to `T?` for better null handling.
* Improved handling of JSON input by adding checks for empty or whitespace strings.
* Enhanced UTF-8 and UTF-16 decoding paths for more robust JSON processing.
* Added helper methods to streamline BOM trimming and single-quote detection.
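The BOM-trimming helper mentioned above could look roughly like this; the type and method names here are illustrative, not the actual code from the PR:

```csharp
using System;

internal static class JsonInput
{
    // UTF-8 byte order mark: EF BB BF.
    private static ReadOnlySpan<byte> Utf8Bom => new byte[] { 0xEF, 0xBB, 0xBF };

    // Return a view of the buffer with any leading BOM sliced off.
    // No copy is made; the result aliases the original memory.
    public static ReadOnlySpan<byte> TrimBom(ReadOnlySpan<byte> utf8)
        => utf8.StartsWith(Utf8Bom) ? utf8.Slice(Utf8Bom.Length) : utf8;
}
```

Because the helper only slices, it composes with the byte-input fast path without adding allocations.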
@MichaelGrafnetter (Owner) commented:
Hi, thanks. Please send any changes to the parser in a standalone PR, as I do not want to get the benchmarks and results into the repo.

My question still remains unanswered, though: what impact would migration from byte[] to Memory<> have on reading 100K or 1M computers from an AD DB? Would the decreased number of allocations even be noticeable? Similarly with parsing secrets and supplemental credentials.

@PrzemyslawKlys (Contributor, Author) commented:

I submitted another PR which seems to have better results than this one. I will clean the PR of the code you don't need once (and if) you decide to take it. I wanted to make sure you have the full picture from the tests and don't just blindly trust the results.

The summary below is for the new version, not this one; it works differently and required a test change to accommodate it.

Impact of Migrating from byte[] to Span<>/Memory<>

🔑 What changes

  • Operate on slices of the original buffer instead of allocating new byte[].
  • Use APIs that accept ReadOnlySpan<byte> (JSON, crypto, ASN.1).
  • Allocate only when materializing final strings or arrays.
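The slicing idea in the first bullet can be sketched as follows; the record layout and names are illustrative assumptions, not the repo's actual format:

```csharp
using System;

internal static class BlobReader
{
    // Read a length-prefixed BLOB out of a larger record buffer
    // without copying it (hypothetical layout: 4-byte length, then data).
    public static ReadOnlyMemory<byte> ReadBlob(ReadOnlyMemory<byte> record, int offset)
    {
        int length = BitConverter.ToInt32(record.Span.Slice(offset, 4));

        // Before: byte[] copy = new byte[length]; ... // allocates per field
        // After: a zero-copy view over the original buffer.
        return record.Slice(offset + 4, length);
    }
}
```

Returning `ReadOnlyMemory<byte>` rather than `byte[]` lets callers pass the slice onward to span-accepting APIs, so the only allocations left are the final materialized strings or arrays.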

📊 Impact at scale

Even modest per-record savings add up:

| Per-record saved | 100K records | 1M records   |
|-----------------:|-------------:|-------------:|
| ~1 KB            | ~100 MB less | ~1 GB less   |
| ~8 KB            | ~0.8 GB less | ~8 GB less   |
| ~128 KB          | ~12 GB less  | ~122 GB less |
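The table is simple multiplication; a quick back-of-the-envelope check (binary GiB, hence ~122 rather than 128 for the last row):

```csharp
using System;

public static class SavingsCheck
{
    public static void Main()
    {
        // Convert a byte count to binary gibibytes.
        static double GiB(double bytes) => bytes / (1024.0 * 1024.0 * 1024.0);

        Console.WriteLine(GiB(1.0 * 1024 * 100_000));     // ~0.095 GiB  (~100 MB, row 1 / 100K)
        Console.WriteLine(GiB(8.0 * 1024 * 1_000_000));   // ~7.6 GiB    (~8 GB,   row 2 / 1M)
        Console.WriteLine(GiB(128.0 * 1024 * 1_000_000)); // ~122 GiB    (row 3 / 1M)
    }
}
```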
  • Benchmarks already show 2/3 fewer allocations and up to 10× faster for large UTF-8 JSON.
  • Less GC pressure → fewer pauses, smoother throughput.

🧩 Where it helps most

  • AD exports / KeyCredential parsing: slice BLOBs, avoid ReadBytes.
  • LAPS JSON / device keys: deserialize directly from ReadOnlySpan<byte>.
  • Secrets & supplemental credentials: span-based crypto imports (no .ToArray()).
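The last bullet refers to the span-accepting key-import overloads in System.Security.Cryptography; a hedged sketch (the buffer layout is assumed, and the method shown is the real `ImportSubjectPublicKeyInfo` span overload):

```csharp
using System;
using System.Security.Cryptography;

internal static class KeyImport
{
    // Import key material straight from a slice of the source buffer,
    // avoiding the .ToArray() copy of the key bytes.
    public static RSA ImportPublicKey(ReadOnlyMemory<byte> blob, int offset, int length)
    {
        var rsa = RSA.Create();
        rsa.ImportSubjectPublicKeyInfo(blob.Span.Slice(offset, length), out _);
        return rsa;
    }
}
```

The crypto provider still does its own internal copying, which is why the wall-clock win here is smaller than on the JSON paths; the saving is the managed `byte[]` per key that no longer exists for the GC to collect.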

At 100K–1M objects:

  • Allocation reduction is absolutely noticeable (hundreds of MB to >100 GB depending on payload size).
  • For copy-heavy JSON paths, throughput can improve multiplicatively.
  • For crypto-heavy paths, wall-clock wins are smaller but GC pressure still drops.
