Fine-tuning backpressure clamping, and API cleanups #1442
Conversation
Requested some changes, but the main logic of the PR makes sense to me.
```rust
BackpressureAmount::Saturated => {
    let err = "write queue is saturated";
    error!(self.log, "{err}");
    Err(CrucibleError::IoError(err.to_owned()))
}
```
I wonder if this should have a separate CrucibleError variant, but maybe that only makes sense if there's some meaningful NVMe error type that Propolis can return to the guest?
I'm concerned about returning errors rather than retrying forever (albeit with longer and longer delays), so I thought I would at least document my concerns to allow for more discussion around them.

At some point we don't have the memory to keep taking IOs in from the guest, so I guess we can't just keep queueing forever. That being said, what would it take for a guest to generate enough traffic to get us saturated? I suppose it all depends on how fast we can process incoming IOs, and on what happens if the overall system is so busy that Crucible is not getting the resources it needs.
```rust
match bp {
    BackpressureAmount::Saturated => {
        let err = "write queue is saturated";
        error!(self.log, "{err}");
```
I wonder a few things here.

- If we are saturated, what impact will we see from logging a bunch of messages? Will it further impact our ability to process IOs, or are things so bad at this point that it does not matter that we are spending time logging things?
- I do like the idea of knowing this has happened; if we don't log a message, then bumping some internal counter would be good, as it could help us explain why some VM got a bunch of errors yesterday but now things look okay.
- And, finally, how difficult is it to trigger this condition from a VM?
- In the saturated case, perhaps we should still sleep for `delay_max` here before returning the error code (a sketch follows this list)? That would limit us to one error every 30 seconds, which seems perfectly reasonable. We could also log an error and then delay for some long amount of time (e.g. 1 hour), which would match our previous behavior but seems fraught; I worry that delaying a guest by 1 hour would be more confusing than just returning an error.
- Agreed, we should definitely record it somewhere, if not in an error message.
- From a VM, you should only see this error condition if writes take > 30 seconds to execute in Crucible. The VM can't really affect this (assuming it's already spamming IOs as quickly as possible); the saturated error indicates that even at maximum backpressure, IOs are not being answered fast enough.
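A minimal sketch of that first suggestion, assuming a tokio runtime; apart from the `Saturated` name and the error string taken from the diff above, the types and the `apply_backpressure` helper are simplified stand-ins for illustration, not Crucible's actual API:

```rust
use std::time::Duration;

/// Simplified stand-in for the real type: only `Saturated` is taken from
/// the diff above; the `Duration` variant is assumed for illustration.
enum BackpressureAmount {
    Duration(Duration),
    Saturated,
}

/// Hypothetical helper: even when saturated, sleep for `delay_max` before
/// surfacing the error, so a saturated guest sees at most one error (and
/// one log line) per `delay_max` interval.
async fn apply_backpressure(
    bp: BackpressureAmount,
    delay_max: Duration,
) -> Result<(), String> {
    match bp {
        BackpressureAmount::Duration(d) => {
            // Normal case: just delay the caller by the computed amount.
            tokio::time::sleep(d).await;
            Ok(())
        }
        BackpressureAmount::Saturated => {
            // Wait out the maximum delay first, then report the error.
            tokio::time::sleep(delay_max).await;
            Err("write queue is saturated".to_owned())
        }
    }
}
```

Sleeping before returning rate-limits the saturated path to one error per `delay_max`, which also bounds the volume of log messages raised in the earlier comment.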
I think we have a pile of backpressure work in flight, so there will be further updates in this area; I think this can go in while we iterate.
Staged on top of #1442

Right now, we're manually tracking backpressure bytes by incrementing / decrementing a counter any time a job becomes active / inactive. This can be error-prone! This PR makes tracking backpressure values automatic, through the power of RAII (a rough sketch follows the list):

- We add a new `struct BackpressureCounters`, which contains the three flavors of counters (write bytes, total bytes, job count)
- `BackpressureCounters::increment` increments counters and hands you a `BackpressureGuard`
- When dropped, the `BackpressureGuard` decrements the counters automatically
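A rough sketch of that RAII shape using atomic counters; the field names, `Ordering` choice, and `increment` signature here are illustrative assumptions, not the PR's exact code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Three counters feeding the backpressure calculation (write bytes,
/// total IO bytes, job count), matching the description above; the exact
/// field layout is an assumption.
#[derive(Default)]
struct BackpressureCounters {
    write_bytes: AtomicU64,
    io_bytes: AtomicU64,
    jobs: AtomicU64,
}

/// Handed out by `increment`; undoes the increment when dropped, so a job
/// that goes out of scope can never leak backpressure accounting.
struct BackpressureGuard {
    counters: Arc<BackpressureCounters>,
    write_bytes: u64,
    io_bytes: u64,
}

impl BackpressureCounters {
    /// Bump the counters for a new job and return a guard tied to it.
    fn increment(
        this: &Arc<Self>,
        write_bytes: u64,
        io_bytes: u64,
    ) -> BackpressureGuard {
        this.write_bytes.fetch_add(write_bytes, Ordering::Relaxed);
        this.io_bytes.fetch_add(io_bytes, Ordering::Relaxed);
        this.jobs.fetch_add(1, Ordering::Relaxed);
        BackpressureGuard {
            counters: Arc::clone(this),
            write_bytes,
            io_bytes,
        }
    }
}

impl Drop for BackpressureGuard {
    fn drop(&mut self) {
        let c = &self.counters;
        c.write_bytes.fetch_sub(self.write_bytes, Ordering::Relaxed);
        c.io_bytes.fetch_sub(self.io_bytes, Ordering::Relaxed);
        c.jobs.fetch_sub(1, Ordering::Relaxed);
    }
}

// Usage sketch:
//   let counters = Arc::new(BackpressureCounters::default());
//   let _guard = BackpressureCounters::increment(&counters, 4096, 4096);
//   // ... job is active while the guard lives ...
//   // guard drops here; counters decrement automatically
```

The point of the guard is that the decrement happens in `Drop`, so any path that retires a job (success, error, or early return) releases its backpressure contribution without a manual bookkeeping call.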
Right now, backpressure goes to infinity as you approach the `max` value. However, it's possible to exceed `max` if you (1) start below `max` and (2) send a single job that pushes over the limit (e.g. a large write). If this happens, then we special-case a 1 hour backpressure delay. All of this is kinda awkward.

This PR cleans it up (a rough sketch follows the list):

- A `delay_max` stored in the backpressure configuration, which is both a clamp and the delay used if our counter exceeds `max_value`
- A new `Saturated` value, which is used if we exceed `max_value`
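A hedged sketch of how that clamping might fit together; the quadratic ramp, the `start_value` field, and the `get_backpressure` signature are assumptions for illustration, not the PR's exact curve:

```rust
use std::time::Duration;

/// Simplified stand-in for the backpressure result described above.
enum BackpressureAmount {
    Duration(Duration),
    Saturated,
}

/// Hypothetical config shape; the real structure may differ.
struct BackpressureConfig {
    start_value: u64,    // counter value where backpressure begins (assumed)
    max_value: u64,      // counter value where we report saturation
    delay_max: Duration, // clamp on the delay
}

impl BackpressureConfig {
    /// Map the current counter to a delay bounded by `delay_max`,
    /// switching to `Saturated` once the counter exceeds `max_value`.
    fn get_backpressure(&self, counter: u64) -> BackpressureAmount {
        if counter > self.max_value {
            BackpressureAmount::Saturated
        } else if counter <= self.start_value {
            BackpressureAmount::Duration(Duration::ZERO)
        } else {
            let frac = (counter - self.start_value) as f64
                / (self.max_value - self.start_value) as f64;
            // Example ramp: quadratic growth up to delay_max (the exact
            // curve in the PR may differ); frac is in (0, 1], so the
            // result is already bounded by delay_max.
            BackpressureAmount::Duration(self.delay_max.mul_f64(frac * frac))
        }
    }
}
```

The key change relative to the old behavior is that the delay can never exceed `delay_max`, and overshooting `max_value` yields an explicit `Saturated` result instead of a special-cased 1 hour delay.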