
jsonschema: add memoization and cycle detection (#77) #97


Open: wants to merge 2 commits into base: main


albertsundjaja (Contributor) commented Jul 5, 2025

This PR adds:

  1. Cycle detection: return an error instead of waiting for a stack overflow*
  2. Memoization: a performance improvement**

*This is a fix for #77.
The root cause of the issue was that the type being processed is deeply nested and contains a cycle. As a result, heap memory grew faster than the stack, and the process ran out of memory before the stack overflow panic could kick in.
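To illustrate the cycle detection idea, here is a minimal sketch, not the code in this PR. It assumes detection works by tracking the struct types on the current traversal path and returning an error when one is seen again. The names `checkCycle`, `typeCycle`, `Tree`, `Leaf`, and `Node` are hypothetical, and map key types are ignored for brevity.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical example types. Tree reuses Leaf twice but is not cyclic;
// Node refers back to itself and is cyclic.
type Leaf struct{ Value int }
type Tree struct{ Left, Right *Leaf }
type Node struct{ Next *Node }

// checkCycle walks t depth-first. inProgress holds only the struct types on
// the current path, so a type that appears in two sibling fields is fine.
func checkCycle(t reflect.Type, inProgress map[reflect.Type]bool) error {
	// Unwrap containers to reach the element type they hold.
	for t.Kind() == reflect.Pointer || t.Kind() == reflect.Slice ||
		t.Kind() == reflect.Array || t.Kind() == reflect.Map {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	if inProgress[t] {
		return fmt.Errorf("cycle detected for type %v", t)
	}
	inProgress[t] = true
	defer delete(inProgress, t) // leaving the path: siblings may revisit t
	for i := 0; i < t.NumField(); i++ {
		if err := checkCycle(t.Field(i).Type, inProgress); err != nil {
			return err
		}
	}
	return nil
}

// typeCycle reports an error if the type of v contains a cycle.
func typeCycle(v any) error {
	return checkCycle(reflect.TypeOf(v), map[reflect.Type]bool{})
}

func main() {
	fmt.Println("Tree:", typeCycle(Tree{}))
	fmt.Println("Node:", typeCycle(Node{}))
}
```

Because the check returns as soon as a type reappears on the path, the recursion depth is bounded by the number of distinct struct types, so neither the heap nor the stack can grow without bound.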

**I posted a perf test in the issue thread for #77.
Copying it here for visibility:

// Without memoization
1. Testing SimpleStruct:
Running 10000 iterations...
Performance: 22.360083ms for 10000 iterations
Average per iteration: 2.236µs

2. Testing NestedStruct:
Running 10000 iterations...
Performance: 79.656125ms for 10000 iterations
Average per iteration: 7.965µs

// Memoization with deep copy -- implemented in this PR
1. Testing SimpleStruct:
Running 10000 iterations...
Performance: 14.800583ms for 10000 iterations
Average per iteration: 1.48µs

2. Testing NestedStruct:
Running 10000 iterations...
Performance: 50.574417ms for 10000 iterations
Average per iteration: 5.057µs

// With memoization using gob
1. Testing SimpleStruct:
Running 10000 iterations...
Performance: 515.872416ms for 10000 iterations
Average per iteration: 51.587µs

2. Testing NestedStruct:
Running 10000 iterations...
Performance: 646.161875ms for 10000 iterations
Average per iteration: 64.616µs

Arguably, the benefit of the memoization is minimal. In addition, if the caller invokes For with a struct that contains a cycle, some of the fields in that struct might already be cached, which incurs unnecessary memory cost. I'm happy to remove the caching and keep only the cycle detector.
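For readers unfamiliar with the "memoization with deep copy" variant measured above, here is a rough sketch of the idea, not the PR's implementation. It assumes a cache keyed by `reflect.Type` that hands out deep copies so callers cannot mutate the cached value. `Schema` is a heavily simplified, hypothetical stand-in, and `schemaFor`, `build`, `schemaOf`, and `Person` are illustrative names only. The sketch also assumes acyclic types.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// Schema is a hypothetical, simplified stand-in for a JSON Schema node.
type Schema struct {
	Type       string
	Properties map[string]*Schema
}

// clone returns a deep copy of s.
func (s *Schema) clone() *Schema {
	if s == nil {
		return nil
	}
	c := &Schema{Type: s.Type}
	if s.Properties != nil {
		c.Properties = make(map[string]*Schema, len(s.Properties))
		for name, prop := range s.Properties {
			c.Properties[name] = prop.clone()
		}
	}
	return c
}

var cache sync.Map // reflect.Type -> *Schema

// schemaFor returns a copy of the cached schema for t, building it on a miss.
func schemaFor(t reflect.Type) *Schema {
	if cached, ok := cache.Load(t); ok {
		return cached.(*Schema).clone()
	}
	s := build(t)
	cache.Store(t, s)
	return s.clone() // never hand out the cached pointer itself
}

// build handles only a few kinds; everything else gets an empty Type.
func build(t reflect.Type) *Schema {
	switch t.Kind() {
	case reflect.Struct:
		s := &Schema{Type: "object", Properties: map[string]*Schema{}}
		for i := 0; i < t.NumField(); i++ {
			f := t.Field(i)
			s.Properties[f.Name] = schemaFor(f.Type)
		}
		return s
	case reflect.String:
		return &Schema{Type: "string"}
	case reflect.Int, reflect.Int64:
		return &Schema{Type: "integer"}
	}
	return &Schema{}
}

// Person is a hypothetical input type.
type Person struct {
	Name string
	Age  int
}

// schemaOf is a convenience wrapper that takes a value instead of a type.
func schemaOf(v any) *Schema { return schemaFor(reflect.TypeOf(v)) }

func main() {
	first := schemaOf(Person{})
	first.Properties["Name"].Type = "mutated" // affects only this copy
	second := schemaOf(Person{})
	fmt.Println(second.Properties["Name"].Type)
}
```

The deep copy on every hit is what keeps the cache safe to share, and it is also part of why the measured gain is modest: each call still allocates a full schema tree.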

jba (Contributor) commented Jul 7, 2025

Thanks. I agree, let's drop the caching and keep the cycle detection.

@@ -39,15 +40,18 @@ import (
// The types must not have cycles.

Let's also change this to "It also returns an error if there is a cycle in the types."
