Skip to content

Performance fix (Python 2.7 only): Make sure a buffer is added to socket file object #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

dbasden
Copy link
Contributor

@dbasden dbasden commented Oct 14, 2013

By default httplib doesn't use buffering when converting sockets to file objects for reading.

httplib also reads in headers from the HTTP response with one read() per byte in the headers.

With the default behaviour of httplib, this means that there can be (for example under Linux) one recvfrom() syscall per byte in the header. The overhead of using the default unbuffered sockets is that it can add around 1 millisecond of time per byte in the headers to do a request with httplib.

This blows out the time for any riak call over HTTP by roughly 400ms just in python's httplib. riak itself is sending all HTTP response headers in one TCP frame, so none of this is latency on the wire.

This is covered by python bug http://bugs.python.org/issue4879

To resolve this, in Python 2.7 you can now specify 'buffering=True' to set a buffer when making a file object from the socket.

Unfortunately for Python 2.6 or earlier this is going to fail, and httplib is hardcoded to use an unbuffered file object. From the HTTPResponse constructor in python2.6/httplib.py:330

self.fp = sock.makefile('rb', 0)

so there isn't really a clean solution that I can see using the built-in httplib library with earlier python versions. To allow it to at least be faster in python2.7, surrounding it with a try/catch for NameError would allow the code to at least run on earlier versions (no slower than it is now)

@shuhaowu
Copy link
Contributor

if sys.version_info[:2] >= (2, 7):
    # 2.7 code
else:
    # 2.6 and below

Also watch out for py3k

@seancribbs
Copy link

@dbasden This is fine, but I agree with @shuhaowu that a version-switch would be appropriate.

@seancribbs
Copy link

Interestingly, that option is not documented (even though it is definitely available).

@seancribbs seancribbs added this to the 2.0.1 milestone Jul 1, 2014
@lukebakken lukebakken self-assigned this Apr 30, 2016
lukebakken pushed a commit that referenced this pull request May 2, 2016
lukebakken pushed a commit that referenced this pull request May 2, 2016
@lukebakken
Copy link
Contributor

lukebakken commented May 2, 2016

Incorporated in #456 (CLIENTS-24)

@lukebakken lukebakken closed this May 2, 2016
lukebakken referenced this pull request May 2, 2016
Use buffering for retrieval of HTTP headers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants