Skip to content

Reduce deepCopy calls #40

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jayanderson opened this issue Nov 22, 2016 · 7 comments
Open

Reduce deepCopy calls #40

jayanderson opened this issue Nov 22, 2016 · 7 comments

Comments

@jayanderson
Copy link

When there are many operations to run successively and when the document is large performing deepCopy for each operation becomes extremely expensive.

Below is an example of the change I'm talking about with only 'replace':

diff --git a/src/main/java/com/github/fge/jsonpatch/JsonPatch.java b/src/main/java/com/github/fge/jsonpatch/JsonPatch.java
index 178ab86..9f0a3d4 100644
--- a/src/main/java/com/github/fge/jsonpatch/JsonPatch.java
+++ b/src/main/java/com/github/fge/jsonpatch/JsonPatch.java
@@ -142,7 +142,7 @@ public final class JsonPatch
         throws JsonPatchException
     {
         BUNDLE.checkNotNull(node, "jsonPatch.nullInput");
-        JsonNode ret = node;
+        JsonNode ret = node.deepCopy();
         for (final JsonPatchOperation operation: operations)
             ret = operation.apply(ret);
 
diff --git a/src/main/java/com/github/fge/jsonpatch/ReplaceOperation.java b/src/main/java/com/github/fge/jsonpatch/ReplaceOperation.java
index 72405a6..fca2fab 100644
--- a/src/main/java/com/github/fge/jsonpatch/ReplaceOperation.java
+++ b/src/main/java/com/github/fge/jsonpatch/ReplaceOperation.java
@@ -69,7 +69,7 @@ public final class ReplaceOperation
         final JsonNode replacement = value.deepCopy();
         if (path.isEmpty())
             return replacement;
-        final JsonNode ret = node.deepCopy();
+        final JsonNode ret = node;
         final JsonNode parent = path.parent().get(ret);
         final String rawToken = Iterables.getLast(path).getToken().getRaw();
         if (parent.isObject())

Instead of performing the copy in each operation it is performed once at the top level. The above will fail tests since they expect ReplaceOperation to copy the node. Beyond that are there any edge cases this type of change would break? Are there other ways to reduce copying? Thanks.

@jayanderson
Copy link
Author

jayanderson@f6ecd0d - one possibility for a fix. This keeps the apply api the same (performing a deepCopy) while adding a applyMutating which can be called internal when a copy isn't necessary.

@davidpolberger
Copy link

jayanderson@f6ecd0d has a huge effect on performance. With a 440 kB patch, containing around 2,000 changes, applied to a 3.5 MB model (both minified), a test program using the unpatched version of the library takes 22 seconds to run on my computer. With the patched version of the library, the entire run takes 20 milliseconds. For comparison, the zjsonpatch library takes 37 milliseconds to run. All three runs produced identical output.

@huggsboson
Copy link
Member

I'm down to merge something here, but I'm not sure if I understand why the deepCopy was implemented in the first place and I'm somewhat fearful of introducing a regression for existing users.

@huggsboson
Copy link
Member

Looks like the @jayanderson suggestion could be workable.

@huggsboson
Copy link
Member

Any chance you can submit it as a PR? (Using applyMutable)

@davidpolberger
Copy link

My experience with this library is limited, so you may want to take this with a grain of salt, but the patch looks safe to me. Instead of needlessly cloning the entire tree once for every visited node (which is very computationally expensive), jayanderson/json-patch@f6ecd0d clones the tree once and then mutates this copy in-place during the patch operation.

Making a single copy of the tree in JsonPatchOperation is prudent, though, because applying a patch may fail, in which case an instance of JsonPatchOperation is thrown. Had a defensive copy of the tree not been made, the tree would have been left in an inconsistent state. A patch should, in my opinion, either be applied in its entirety or not be applied at all.

For what it's worth, we've been running this patch in production for ten days now with no ill effects.

@davidpolberger
Copy link

An update on my last comment: we've now been running the patch in production for almost three months and everything works well, with substantially better performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants