Image upload sometimes stalls with HTTP/2 #3559

@luqmana

This is the bug @david-crespo was running into, where image upload fails on Firefox/macOS.

I've looked into it some more and, from what I can tell, at some point the browser stalls while uploading some chunks.
On the console side we split the file into 384 KiB chunks and try to upload 6 at a time (to stay under browser concurrency limits). It doesn't happen every time, but every so often there will be a chunk or two where the browser appears to have made the request but never gets a response (at least according to the Network tab in the browser's Web Developer Tools).
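For anyone not familiar with the upload scheme, here is a rough Python sketch of "384 KiB chunks, 6 at a time". It is not the console code (which isn't Python); `upload_chunk` is a hypothetical coroutine standing in for whatever actually sends one chunk, and only the chunk size and the limit of 6 come from the description above.

```python
import asyncio

CHUNK_SIZE = 384 * 1024  # 384 KiB, as described above
MAX_IN_FLIGHT = 6        # at most 6 chunk uploads running at once


def chunk_ranges(total_size: int, chunk_size: int = CHUNK_SIZE) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that together cover total_size bytes."""
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


async def upload_all(data: bytes, upload_chunk, max_in_flight: int = MAX_IN_FLIGHT) -> None:
    """Upload every chunk, never running more than max_in_flight at once.

    upload_chunk is a hypothetical coroutine: upload_chunk(offset, payload).
    """
    limit = asyncio.Semaphore(max_in_flight)

    async def one(offset: int, length: int) -> None:
        # The semaphore is what keeps us at or below the concurrency limit.
        async with limit:
            await upload_chunk(offset, data[offset:offset + length])

    await asyncio.gather(*(one(o, n) for o, n in chunk_ranges(len(data))))
```

With a scheme like this, a single stalled chunk permanently occupies one of the 6 slots, which matches the "a chunk or two never gets a response" symptom.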

I changed the console to add a query parameter to each individual chunk upload, and for the stalled chunks I saw no mention of any such request in the Nexus logs. Next I tried a packet capture with Wireshark and, after setting up SSLKEYLOGFILE (because this only repros over https; we'll get back to that), I did see the browser making those requests. In fact, I could see it sending data until at some point it just stops, still with no response.
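The tagging trick is easy to reproduce. A minimal sketch, assuming a made-up parameter name `chunk` (the actual name I used isn't important) and making no assumptions about the log format beyond the request URL showing up somewhere in each line:

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_chunk_url(url: str, chunk_index: int, param: str = "chunk") -> str:
    """Append a debugging query parameter identifying the chunk.

    The parameter name is hypothetical; any name the server ignores will do.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(chunk_index)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def unseen_chunks(log_lines, total_chunks: int, param: str = "chunk") -> list[int]:
    """Return the chunk indices whose tag never appears in the given log lines."""
    pattern = re.compile(rf"[?&]{re.escape(param)}=(\d+)")
    seen = {int(m.group(1)) for line in log_lines for m in pattern.finditer(line)}
    return [i for i in range(total_chunks) if i not in seen]
```

Any index returned by `unseen_chunks` is a request the browser believes it made but the server never logged, which is the mismatch that sent me to the packet capture.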

After not being able to repro without TLS and then seeing the packet capture, I realized we're using HTTP/2 for compatible clients. The browser maintains a single connection and uses multiple HTTP/2 streams to make the different requests. OK, so is something else blocking our image upload somehow? Cue some more reading about HTTP/2: it definitely has a concept of flow control, which each peer maintains separately.

Basically, for each side, there's a connection-level flow-control window as well as a per-stream window. Every byte of DATA payload sent decrements the available space in both windows. Once the sender has exhausted a window, it must not send any more until it receives a WINDOW_UPDATE from the peer telling it there's more space.
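To make the accounting concrete, here is a toy model of the sender's view. It is only a sketch: it uses the spec default of 65,535 bytes for both the connection and every stream (a real peer may well advertise something larger), it ignores framing entirely, and `SenderWindows` is a name I made up, not anything from hyper or the browser.

```python
DEFAULT_WINDOW = 65_535  # HTTP/2 spec default initial window size (assumed here)


class SenderWindows:
    """Sender-side view of HTTP/2 flow control, counting DATA payload bytes only."""

    def __init__(self, initial: int = DEFAULT_WINDOW) -> None:
        self.initial = initial
        self.connection = initial          # shared by all streams
        self.streams: dict[int, int] = {}  # stream id -> remaining window

    def open_stream(self, stream_id: int) -> None:
        self.streams[stream_id] = self.initial

    def sendable(self, stream_id: int) -> int:
        """Bytes that may be sent right now: capped by BOTH windows."""
        return max(0, min(self.connection, self.streams[stream_id]))

    def send(self, stream_id: int, wanted: int) -> int:
        """Send up to `wanted` bytes; return how many were actually allowed."""
        n = min(wanted, self.sendable(stream_id))
        self.connection -= n
        self.streams[stream_id] -= n
        return n

    def window_update(self, stream_id: int, increment: int) -> None:
        """Apply a WINDOW_UPDATE; stream id 0 targets the connection window."""
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] += increment
```

Under these assumptions, opening streams 1, 3 and 5 and trying to send a 384 KiB chunk on each lets stream 1 send 65,535 bytes and leaves the other two at zero, because the connection window is already spent. A connection-level WINDOW_UPDATE unblocks streams 3 and 5, but stream 1 stays stuck until it gets its own stream-level update. If either update never arrives, the sender just sits there, which looks a lot like what the capture shows.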

I need to look into it some more, but it seems like the browser might think it has exhausted the window while no window update ever arrives from the Nexus side. I hacked in a tracing subscriber to see if hyper logs anything useful, and I do see mentions of the stalled streams, but I'm not familiar enough with hyper to decode them yet. (hyperium/hyper#2899 seems relevant, maybe?)

Labels: known issue (to include in customer documentation and training)
