Removing toLowerCase #161

shortstuffsushi · 2013-03-13T01:18:14Z

Is there a reason all of the header fields are being lower cased? This actually caused requests I was making to a server I don't have control over to fail.

chrisirhc · 2013-03-13T01:47:08Z

Header field names are supposed to be case insensitive as per RFC (http://www.ietf.org/rfc/rfc2616.txt). If you rely on the case of the field names, perhaps you should use a custom subclassed implementation of the handshake?

shortstuffsushi · 2013-03-13T01:57:57Z

I agree they are supposed to be insensitive, but relying on all server implementations to respect that seems like a bad idea (as evidenced by this example).

It seems that not forcing lower case would make more sense, so that servers like this one that don't respect the protocol don't cause this framework to fail.

It also seems questionable, perhaps inappropriate even, that this framework should modify headers passed to it.

shortstuffsushi · 2013-03-13T02:02:29Z

It also appears that the toLowerCase was only added as a fix for issue #98, which mentions the implementation being case sensitive.

chrisirhc · 2013-03-13T02:35:10Z

You're right that this framework shouldn't be modifying the headers passed to it.
In my opinion, a better approach would be to store the header field names in their original case (preserve the case) but allow hasFieldValue and getFieldValue to operate in a case-insensitive manner.

One way to do this is to keep an additional map to get the original key with the preserved case. (lowercasekey -> originalKey)

This ensures compatibility with the specification as well as preserves case of header fields.

shortstuffsushi · 2013-03-13T02:38:41Z

Interesting idea. I think that seems pretty reasonable, so I'll go ahead and modify my patch to do that.

shortstuffsushi · 2013-03-13T03:22:14Z

Oops, accidentally closed this request. I'll be submitting another one soon. I was hoping for a slightly more graceful solution that two, duplicate dictionaries, so I changed the implementation of the of the map to be a TreeMap. The TreeMap allows for case insensitive puts and gets.

My only remaining question is, was there an original reason for using LinkedHashMap? Will this change potentially break any functionality?

chrisirhc · 2013-03-13T03:31:30Z

It's not really a more graceful solution but I was referring to having another dictionary that gets the original key from the lowercased key.

For example:

keyMap["content-length"] -> "Content-Length"

So the hasFieldValue becomes:

return keyMap.containsKey( name.toLowerCase( Locale.ENGLISH ) );

getFieldValue becomes:

String s = map.get( keyMap[ name.toLowerCase( Locale.ENGLISH ) ] );

shortstuffsushi · 2013-03-13T03:39:37Z

That is the other solution I had in mind. However, using TreeMap avoids the second dictionary entirely --

hasFieldValue remains

return map.containsKey( name );

and getFieldValue remains

String s = map.get( name );

You don't have to call toLowerCase on any of the parameters, you dont have to modify anything as it comes in, and you don't have to keep a second dictionary for anything.

If you'd truly rather have it have both, I can refactor it again to use that, but it seems like using a single, case-insensitive map would be simpler.

chrisirhc · 2013-03-13T03:42:00Z

Are you overriding the methods in TreeMap to do this? I couldn't find out how TreeMap is case-insensitive.

shortstuffsushi · 2013-03-13T03:47:01Z

I'll push my solution, which uses it. We can further discuss on the new pull request. Or, if you just want to run a quick local test, check out this Gist -- https://gist.github.com/shortstuffsushi/5149244

chrisirhc · 2013-03-13T03:53:46Z

I see. That's an interesting hack using the String.CASE_INSENSITIVE_ORDER Comparator to fool TreeMap to do case-insensitive comparisons. It's pretty cool but I'm worried that, as stated in the documentation, it no longer behaves like a Map as the Comparator is no longer consistent with the equals operation.

It does make things more compact, but it's not really a map. I'm not sure about this.

shortstuffsushi · 2013-03-13T04:03:30Z

In a sense, no, it doesn't maintain its contract. What I take this to mean is that, if you were to insert two headers named the same, but cased differently, the latter would replace the former. Isn't that the expected behavior in this case though, because the headers are supposed to be case insensitive, such that if you shouldn't be able to pass two with the same name but separate casing?

chrisirhc · 2013-03-13T06:35:26Z

Yep it is the intended expected behavior for our use case. I'm just wondering about the implementation.
We'll have to see what the maintainer says about this.

I prefer to have the special behavior clearly stated in the code (or at least in comments), showing that a header with a field name of a different case will replace a previous header.

I was also hoping to keep everything in HashMaps to ensure constant time in retrieval while TreeMaps are implemented differently. This may not actually matter that much as there many not be a large number of headers for performance to really matter.

That being said, I like how the TreeMaps implementation is so clean. Let's see what the maintainer says then. 👍

shortstuffsushi · 2013-03-13T12:20:57Z

Alright, I'll wait to hear what he says. Oh, and TreeMaps are a red/black tree, so they're O log(n)... Not too bad, especially given, as you mention, the general size of the header map :)

Davidiusdadi · 2013-03-15T18:47:07Z

@shortstuffsushi I like how compact your solution is. In terms of performance there should not be a relevant difference. Maybe the dropped out toLowerCase alone compensates for the "slower" lookup when talking about a map of the size ~10.

Also i think that barely anyone except @shortstuffsushi will need this "fix" i still think that is is a general improvement of compatibility which is why i think that we should merge.

Can we agree on merging? ( I would add an explain comment afterwards into the code.)

chrisirhc · 2013-03-15T19:11:29Z

I think merging this (#162) is important. 👍
I recently got bitten by this on another codebase of mine that relied on looking up a header (Content-Length) and different servers returned that header with a different case (e.g. Content-length, content-length, ...).

Davidiusdadi · 2013-03-15T20:07:41Z

i also meant #162 :)
i will merge it then.

Thank you both for your improving and supporting the library.

Removing toLowerCase

0ddee5f

Is there a reason all of the header fields are being lower cased? This actually caused requests I was making to a server I don't have control over to fail.

shortstuffsushi closed this Mar 13, 2013

Davidiusdadi mentioned this pull request May 16, 2013

Solves issues with the CONNECT response being misinterpreted and others #179

Closed

mingot mentioned this pull request Sep 5, 2013

Java WebSocket still implements toLowerCase BrendanAndrade/web-matlab-bridge#4

Closed

Removing toLowerCase #161

Removing toLowerCase #161

Uh oh!

Conversation

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

chrisirhc commented Mar 13, 2013

Uh oh!

shortstuffsushi commented Mar 13, 2013

Uh oh!

Davidiusdadi commented Mar 15, 2013

Uh oh!

chrisirhc commented Mar 15, 2013

Uh oh!

Davidiusdadi commented Mar 15, 2013

Uh oh!

Uh oh!