`TreeInterpreter` creates reference cycle, causing GC pressure #291
We recently noticed that a heavy JMESPath workload was triggering a large number of garbage collection runs. We are using `jmespath.compile()`, and we tracked this down to the `jmespath.visitor.TreeInterpreter` that is created on every call to `ParsedResult.search()`:
jmespath.py/jmespath/parser.py, line 508 in bbe7300
It appears that `TreeInterpreter` creates a reference cycle, which leads to the GC being triggered frequently to clean up the cycles. As far as I can tell, the problem comes from the `Visitor._method_cache`:
jmespath.py/jmespath/visitor.py, lines 91 to 93 in bbe7300
...which store references to methods that are bound to `self` in a member of `self`.
Possible solution
We worked around the problem by monkey patching `ParsedResult` so that it (1) caches a `default_interpreter` for use when `options=None`, and (2) uses it in `search()`. If I understand correctly, we could go further and use a global `TreeInterpreter` for all `ParsedResult` instances. The `TreeInterpreter` seems to be stateless apart from `self._method_cache`, and that implementation seems to be thread-safe (with only the risk of multiple lookups for the same method in a multithreaded case).
I'd be happy to contribute a PR for either version if this would be welcome.
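For illustration, here is a minimal sketch of the pattern described above and of why reusing one interpreter avoids it. It does not use jmespath: `CachingVisitor`, `count_cyclic_garbage`, and the two `search_with_*` functions are hypothetical stand-ins that only mimic the idea of caching bound methods in a dict stored on `self`. The exact counts will vary between Python versions, so none are shown.

```python
import gc


class CachingVisitor:
    """Hypothetical stand-in: caches its own bound methods in a dict on self."""

    def __init__(self):
        self._method_cache = {}

    def visit(self, node_type, value):
        method = self._method_cache.get(node_type)
        if method is None:
            # getattr returns a bound method that references self, so storing
            # it on self creates the cycle: self -> dict -> bound method -> self.
            method = getattr(self, "visit_" + node_type)
            self._method_cache[node_type] = method
        return method(value)

    def visit_identity(self, value):
        return value


def count_cyclic_garbage(work, iterations=1000):
    """Call work() repeatedly with automatic GC off, then report how many
    unreachable objects a manual collection finds."""
    gc.collect()  # clear any garbage left over from earlier code
    gc.disable()
    try:
        for _ in range(iterations):
            work()
        return gc.collect()
    finally:
        gc.enable()


def search_with_new_visitor():
    # Mimics creating a fresh interpreter on every search call.
    return CachingVisitor().visit("identity", 42)


_shared_visitor = CachingVisitor()


def search_with_shared_visitor():
    # Mimics the workaround: one cached interpreter reused across calls.
    return _shared_visitor.visit("identity", 42)


per_call = count_cyclic_garbage(search_with_new_visitor)
shared = count_cyclic_garbage(search_with_shared_visitor)
print("unreachable objects, new visitor per call:", per_call)
print("unreachable objects, shared visitor:", shared)
```

In this sketch, each throwaway visitor can only be reclaimed by the cycle collector, whereas the shared visitor still contains a cycle but is created once, so repeated calls add no new cyclic garbage.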
How to reproduce
The following reproducer shows the problem:
...where the output contains one million repetitions of something like: