Skip to content

EDN List, Vector types indistinguishable due to common RandomAccess interface #32

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
abernard opened this issue Jun 12, 2013 · 5 comments
Milestone

Comments

@abernard
Copy link
Contributor

This may be a design decision, but it is not clear so here goes.

The documentation states "Lists "(...)" and vectors "[...]" are both mapped to implementations of java.util.List. A vector maps to a List implementation that also implements the marker interface java.util.RandomAccess."

However, due to this commit which changes the backing of the List type to ArrayList instead of LinkedList, there is no way to distinguish between Lists or Vectors when "roundtripped" through the parser.

This means that "(1, 2, 3)" and "[1, 2, 3]" will parse to identical values. Printing back the parsed List "(1, 2, 3)" to EDN via the printer will yield the Vector "[1, 2, 3]".

This is not caught in the unit tests because the method assertEquals merely tests that the parsed objects are equal (they are, as Vectors), but does not compare a second round of printing with the original string.

@bpsm
Copy link
Owner

bpsm commented Jun 12, 2013

Yes, that's a bug, though what this commit says is true: ArrayLists are indeed better than LinkedLists in that they use less memory and are faster to iterate.

Hanging the whole List/Vector distinction on the presence or absence of the RandomAccess marker interface is a bit of a hack. I'm not very happy with it and find myself wishing for Clojure-like meta-data. This strategy also presents a problem for edn-java-guava, since Guava's immutable list classes all implement RandomAccess.

Two alternatives that might be better than reverting the commit above to return java.util.LinkedList implementations:

  1. Return a java.util.List which merely wraps a java.util.ArrayList, but does not itself implement RandomAccess.
  2. Return a custom implementation of java.util.List which chunks by actually being a list of fixed-size chunks under the hood.

The downside of doing nothing is what this issue is about. The downside of 1 is the additional indirection introduced for delegation. (Though Iteration can be direct.) The downside of 2 is the implementation effort and probability of introducing bugs.

I'm thinking the first approach sounds like it's worth it; the second is unlikely to be worth it.

What do you think?

@bpsm
Copy link
Owner

bpsm commented Jun 13, 2013

@mikera
Copy link
Contributor

mikera commented Jun 13, 2013

Why not just do a lightweight custom implementation of ArrayList? I don't think chunking actually buys us anything here, indeed it's probably going to add more overhead than an extra layer of indirection.

If you are interested in that approach, I think I've got some old code hanging around that would get us 80% there....

@abernard
Copy link
Contributor Author

Ben,

I agree with your rationale. LinkedList has poor performance characteristics and option 2 seems like asking for trouble. Your provided solution is quite clean.

On the issue of this handling Guava collections, I wonder if the code to determine List or Vector should be separated out into an interface. This would be something like:

interface SequenceTypeSelector {
    boolean isVector(Object o);
}

The selector could be attached to the ProtocolBuilder for the Printer (with a default implementation provided of course). Extending SequenceTypeSelector would allow custom dispatch for types, allowing the simple if-else select on java.util.RandomAccess, or a Map lookup for more complex type hierarchies.

@bpsm
Copy link
Owner

bpsm commented Jun 13, 2013

I'm going to release https://github.com/bpsm/edn-java/commits/feature/issue32 as 0.4.2. Making distinguishing between list and vector pluggable as been moved to issue #33 and tentatively assigned to milestone 0.5.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants