
Conversation

@mkeeter mkeeter commented Sep 3, 2024

Right now, backpressure goes to infinity as you approach the max value. However, it's possible to exceed max if you (1) start below max and (2) send a single job that pushes over the limit (e.g. a large write). If this happens, then we special-case a 1 hour backpressure delay.

All of this is kinda awkward.

This PR cleans it up:

  • There's an explicit delay_max stored in the backpressure configuration, which is both a clamp and the delay used if our counter exceeds max_value
  • Backpressure now has an explicit Saturated value, which is used if we exceed max_value
  • If backpressure is saturated, then we return an error to the guest immediately (instead of delaying the IO for 1 hour)
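
As a rough illustration of the shape described above (this is not the PR's actual code: the field names, the backpressure curve, and the numbers are assumptions; only `delay_max`, `max_value`, and the `Saturated` variant come from the description and the snippets quoted below), a clamped backpressure calculation might look like:

```rust
use std::time::Duration;

/// Hypothetical backpressure configuration (fields are assumed).
struct BackpressureConfig {
    /// Counter value at which backpressure saturates
    max_value: u64,
    /// Clamp on the computed delay; past `max_value` we report
    /// `Saturated` instead of delaying further
    delay_max: Duration,
}

/// Backpressure is either a bounded delay or saturated
enum BackpressureAmount {
    Duration(Duration),
    Saturated,
}

impl BackpressureConfig {
    fn get_backpressure(&self, counter: u64) -> BackpressureAmount {
        if counter >= self.max_value {
            // Exceeding the max is a distinct state, not a huge delay
            BackpressureAmount::Saturated
        } else {
            // Illustrative quadratic curve, clamped to `delay_max`
            let frac = counter as f64 / self.max_value as f64;
            let delay = self.delay_max.mul_f64(frac * frac);
            BackpressureAmount::Duration(delay.min(self.delay_max))
        }
    }
}
```

The key change is that exceeding `max_value` is reported as an explicit state rather than being mapped onto an arbitrary one-hour delay.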

@faithanalog faithanalog (Contributor) left a comment

Requested some changes, but the main logic of the PR makes sense to me.

@mkeeter mkeeter force-pushed the mkeeter/backpressure-cleanup branch from dc3130f to 97d1fdd on September 4, 2024 at 15:35

```rust
BackpressureAmount::Saturated => {
    let err = "write queue is saturated";
    error!(self.log, "{err}");
    Err(CrucibleError::IoError(err.to_owned()))
```

A contributor commented on this code:

I wonder if this should have a separate CrucibleError variant, but maybe that only makes sense if there's some meaningful NVMe error type that Propolis can return to the guest?
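
Purely as an illustration of the question (nothing here is in the PR, and Crucible's real error enum has many more variants; the `QueueSaturated` name is hypothetical), a dedicated variant could look like:

```rust
/// Hypothetical sketch of the error enum; only `IoError(String)` is
/// evidenced by the excerpt above, the rest is assumed.
#[derive(Debug)]
enum CrucibleError {
    IoError(String),
    /// The upstairs write queue is saturated; the IO is rejected
    /// rather than delayed indefinitely.
    QueueSaturated,
}
```

Whether a distinct variant pays off presumably depends on whether it can be mapped to a meaningful guest-visible NVMe status, as noted above.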

@leftwo leftwo (Contributor) left a comment

I'm concerned about returning errors rather than retrying forever (albeit with longer and longer delays), so I thought I would at least document my concerns to allow for more discussion around them.

At some point we don't have the memory to keep taking IOs in from the guest, so I guess we can't just keep queueing forever. That being said, what would it take for a guest to generate enough traffic to get us saturated? I suppose it depends on how fast we can process incoming IOs, and on what happens if the overall system is so busy that Crucible isn't getting the resources it needs.

```rust
match bp {
    BackpressureAmount::Saturated => {
        let err = "write queue is saturated";
        error!(self.log, "{err}");
```

A contributor commented on this code:

I wonder a few things here.

  1. If we are saturated, what impact will we see from logging a bunch of messages? Will it further impact our ability to process IOs, or are things so bad at this point that it doesn't matter that we're spending time logging?

  2. I do like the idea of knowing this has happened; if we don't log a message, then bumping some internal counter would be good, since it could help us explain why some VM got a bunch of errors yesterday even though things look okay now.

  3. And, finally, how difficult is it to trigger this condition from a VM?

The PR author (@mkeeter) replied:

  1. In the saturated case, perhaps we should still sleep for delay_max here before returning the error code? That would limit us to one error every 30 seconds, which seems perfectly reasonable. We could also log an error and then delay some long amount of time (e.g. 1 hour), which would match our previous behavior but seems fraught; I worry that delaying a guest by 1 hour would be more confusing than just returning an error. (See the sketch after this list.)

  2. Agreed, we should definitely record it somewhere if not in an error message

  3. From a VM, you should only see this error condition if writes take > 30 seconds to execute in Crucible. The VM can't really affect this (assuming it's already spamming IOs as quickly as possible); the saturated error indicates that even at maximum backpressure, IOs are not being answered fast enough.
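
A minimal sketch of the rate-limiting idea from item 1, assuming a tokio-based async context; only `delay_max` and the error string come from the discussion above, everything else is illustrative:

```rust
use std::time::Duration;

/// Sketch: when the write queue is saturated, wait out the maximum
/// backpressure delay before surfacing an error, so a guest that keeps
/// spamming IO sees at most one error per `delay_max` (e.g. every 30 s).
async fn reject_saturated_io(delay_max: Duration) -> Result<(), String> {
    tokio::time::sleep(delay_max).await;
    Err("write queue is saturated".to_owned())
}
```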

@leftwo leftwo (Contributor) left a comment

I think we have a pile of backpressure work in flight, so there will be further updates in this area; I think this can go in while we iterate.

@mkeeter mkeeter force-pushed the mkeeter/backpressure-cleanup branch from ea5568e to f0eb4f1 on September 5, 2024 at 15:54
@mkeeter mkeeter merged commit 2869ea2 into main Sep 6, 2024
18 checks passed
@mkeeter mkeeter deleted the mkeeter/backpressure-cleanup branch September 6, 2024 14:11
mkeeter added a commit that referenced this pull request Sep 6, 2024
staged on top of #1442 

Right now, we're manually tracking backpressure bytes by incrementing /
decrementing a counter any time a job becomes active / inactive. This
can be error-prone!

This PR makes tracking backpressure values automatic, through the power
of RAII:

- We add a new `struct BackpressureCounters`, which contains the three
flavors of counters (write bytes, total bytes, job count)
- `BackpressureCounters::increment` increments counters and hands you a
`BackpressureGuard`
- When dropped, the `BackpressureGuard` decrements the counters
automatically
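
A minimal sketch of the RAII scheme described in that commit message; the atomic field types, the `Arc` plumbing, and the exact `increment` signature are assumptions, but the increment-on-create / decrement-on-drop shape is the point:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Shared counters that feed the backpressure calculation
#[derive(Default)]
struct BackpressureCounters {
    write_bytes: AtomicU64,
    total_bytes: AtomicU64,
    jobs: AtomicU64,
}

/// RAII guard handed out when a job becomes active; restores the
/// counters automatically when dropped
struct BackpressureGuard {
    counters: Arc<BackpressureCounters>,
    write_bytes: u64,
    total_bytes: u64,
}

impl BackpressureCounters {
    /// Bump the counters for a new job and return a guard that undoes
    /// the bump when the job is retired
    fn increment(
        counters: &Arc<Self>,
        write_bytes: u64,
        total_bytes: u64,
    ) -> BackpressureGuard {
        counters.write_bytes.fetch_add(write_bytes, Ordering::Relaxed);
        counters.total_bytes.fetch_add(total_bytes, Ordering::Relaxed);
        counters.jobs.fetch_add(1, Ordering::Relaxed);
        BackpressureGuard {
            counters: Arc::clone(counters),
            write_bytes,
            total_bytes,
        }
    }
}

impl Drop for BackpressureGuard {
    fn drop(&mut self) {
        let c = &self.counters;
        c.write_bytes.fetch_sub(self.write_bytes, Ordering::Relaxed);
        c.total_bytes.fetch_sub(self.total_bytes, Ordering::Relaxed);
        c.jobs.fetch_sub(1, Ordering::Relaxed);
    }
}
```

A job then just holds its `BackpressureGuard` alongside its other state, and the decrement happens wherever the job is dropped, which is what removes the manual bookkeeping.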