Skip to content

Removing toLowerCase #161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Removing toLowerCase #161

wants to merge 1 commit into from

Conversation

shortstuffsushi
Copy link
Contributor

Is there a reason all of the header fields are being lower cased? This actually caused requests I was making to a server I don't have control over to fail.

Is there a reason all of the header fields are being lower cased? This actually caused requests I was making to a server I don't have control over to fail.
@chrisirhc
Copy link

Header field names are supposed to be case insensitive as per RFC (http://www.ietf.org/rfc/rfc2616.txt). If you rely on the case of the field names, perhaps you should use a custom subclassed implementation of the handshake?

@shortstuffsushi
Copy link
Contributor Author

I agree they are supposed to be insensitive, but relying on all server implementations to respect that seems like a bad idea (as evidenced by this example).

It seems that not forcing lower case would make more sense, so that servers like this one that don't respect the protocol don't cause this framework to fail.

It also seems questionable, perhaps inappropriate even, that this framework should modify headers passed to it.

@shortstuffsushi
Copy link
Contributor Author

It also appears that the toLowerCase was only added as a fix for issue #98, which mentions the implementation being case sensitive.

@chrisirhc
Copy link

You're right that this framework shouldn't be modifying the headers passed to it.
In my opinion, a better approach would be to store the header field names in their original case (preserve the case) but allow hasFieldValue and getFieldValue to operate in a case-insensitive manner.

One way to do this is to keep an additional map to get the original key with the preserved case. (lowercasekey -> originalKey)

This ensures compatibility with the specification as well as preserves case of header fields.

@shortstuffsushi
Copy link
Contributor Author

Interesting idea. I think that seems pretty reasonable, so I'll go ahead and modify my patch to do that.

@shortstuffsushi
Copy link
Contributor Author

Oops, accidentally closed this request. I'll be submitting another one soon. I was hoping for a slightly more graceful solution that two, duplicate dictionaries, so I changed the implementation of the of the map to be a TreeMap. The TreeMap allows for case insensitive puts and gets.

My only remaining question is, was there an original reason for using LinkedHashMap? Will this change potentially break any functionality?

@chrisirhc
Copy link

It's not really a more graceful solution but I was referring to having another dictionary that gets the original key from the lowercased key.

For example:

keyMap["content-length"] -> "Content-Length"

So the hasFieldValue becomes:

return keyMap.containsKey( name.toLowerCase( Locale.ENGLISH ) );

getFieldValue becomes:

String s = map.get( keyMap[ name.toLowerCase( Locale.ENGLISH ) ] );

@shortstuffsushi
Copy link
Contributor Author

That is the other solution I had in mind. However, using TreeMap avoids the second dictionary entirely --

hasFieldValue remains

return map.containsKey( name );

and getFieldValue remains

String s = map.get( name );

You don't have to call toLowerCase on any of the parameters, you dont have to modify anything as it comes in, and you don't have to keep a second dictionary for anything.

If you'd truly rather have it have both, I can refactor it again to use that, but it seems like using a single, case-insensitive map would be simpler.

@chrisirhc
Copy link

Are you overriding the methods in TreeMap to do this? I couldn't find out how TreeMap is case-insensitive.

@shortstuffsushi
Copy link
Contributor Author

I'll push my solution, which uses it. We can further discuss on the new pull request. Or, if you just want to run a quick local test, check out this Gist -- https://gist.github.com/shortstuffsushi/5149244

@chrisirhc
Copy link

I see. That's an interesting hack using the String.CASE_INSENSITIVE_ORDER Comparator to fool TreeMap to do case-insensitive comparisons. It's pretty cool but I'm worried that, as stated in the documentation, it no longer behaves like a Map as the Comparator is no longer consistent with the equals operation.

It does make things more compact, but it's not really a map. I'm not sure about this.

@shortstuffsushi
Copy link
Contributor Author

In a sense, no, it doesn't maintain its contract. What I take this to mean is that, if you were to insert two headers named the same, but cased differently, the latter would replace the former. Isn't that the expected behavior in this case though, because the headers are supposed to be case insensitive, such that if you shouldn't be able to pass two with the same name but separate casing?

@chrisirhc
Copy link

Yep it is the intended expected behavior for our use case. I'm just wondering about the implementation.
We'll have to see what the maintainer says about this.

I prefer to have the special behavior clearly stated in the code (or at least in comments), showing that a header with a field name of a different case will replace a previous header.

I was also hoping to keep everything in HashMaps to ensure constant time in retrieval while TreeMaps are implemented differently. This may not actually matter that much as there many not be a large number of headers for performance to really matter.

That being said, I like how the TreeMaps implementation is so clean. Let's see what the maintainer says then. 👍

@shortstuffsushi
Copy link
Contributor Author

Alright, I'll wait to hear what he says. Oh, and TreeMaps are a red/black tree, so they're O log(n)... Not too bad, especially given, as you mention, the general size of the header map :)

@Davidiusdadi
Copy link
Collaborator

@shortstuffsushi I like how compact your solution is. In terms of performance there should not be a relevant difference. Maybe the dropped out toLowerCase alone compensates for the "slower" lookup when talking about a map of the size ~10.

Also i think that barely anyone except @shortstuffsushi will need this "fix" i still think that is is a general improvement of compatibility which is why i think that we should merge.

Can we agree on merging? ( I would add an explain comment afterwards into the code.)

@chrisirhc
Copy link

I think merging this (#162) is important. 👍
I recently got bitten by this on another codebase of mine that relied on looking up a header (Content-Length) and different servers returned that header with a different case (e.g. Content-length, content-length, ...).

@Davidiusdadi
Copy link
Collaborator

i also meant #162 :)
i will merge it then.

Thank you both for your improving and supporting the library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants