-
-
Notifications
You must be signed in to change notification settings - Fork 51
audio: change "low latency" buffer parameters #140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This helps a lot when debugging various underruns.
FYI @DemiMarie I've added logging I used for graphs here. Since it proved useful, I'll make it opt-in in default binary too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am quite concerned about a latency regression here. I would prefer to set .maxlength
to 4096, while keeping .tlength
at 4096.
It isn't regression at all. maxlength used to be server default, which IIUC is about 4MB. PA agent used smaller vchan buffer, so it didn't manage to fill the whole PA's buffer, but it was still 8k. |
And generally (if that would be the case, which isn't) no underruns takes priority over low latency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer attribute changes look fine, but if pa_stream_get_latency()
fails the code will log either wrong data or uninitialized stack rubble.
if (pa_stream_get_latency(s, &latency, &negative)) | ||
pacat_log("pa_stream_get_latency() failed"); | ||
|
||
pacat_log("process_playback_data(): vchan data %d max_length %d latency %llu", space_in_vchan, max_length, latency); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (pa_stream_get_latency(s, &latency, &negative)) | |
pacat_log("pa_stream_get_latency() failed"); | |
pacat_log("process_playback_data(): vchan data %d max_length %d latency %llu", space_in_vchan, max_length, latency); | |
if (pa_stream_get_latency(s, &latency, &negative)) | |
pacat_log("pa_stream_get_latency() failed"); | |
else | |
pacat_log("process_playback_data(): vchan data %d max_length %d latency %llu", space_in_vchan, max_length, latency); |
otherwise we might log uninitialized stack rubble
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but a) I still want other values logged, and b) thanks to PA_STREAM_INTERPOLATE_TIMING
that happens basically only once at the very start.
379caf4
to
da6af33
Compare
@@ -210,6 +215,16 @@ static void process_playback_data(struct userdata *u, pa_stream *s, size_t max_l | |||
return; | |||
} | |||
|
|||
if (verbose > 1) { | |||
pa_usec_t latency = 0; | |||
int negative; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int negative; | |
int negative = 0; |
avoid logging junk from the stack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't use the value at all, so it doesn't matter.
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024040413-4.2&flavor=pull-requests New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024031904-4.2&flavor=update
Failed tests78 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/94176#dependencies 34 fixed
Unstable tests
|
This is a regression. Those tests are about emulated sound card for HVM, so PipeWire agent is not involved. But PulseAudio agent running in a stubdomain is. Depending on a test, both playback and recording failed. I'll inspect logs more closely and experiment with some parameters, but I suspect changing maxlength for this case is a mistake. But it could be also related to migration to PipeWire in dom0, not this PR. |
The HVM playback was problem in a test (yet another difference between Pulseaudio and PipeWire). But HVM recording seems to be legit issue. |
Want me to make the changes conditional on buffer sizes? The PipeWire agent uses much bigger buffers. |
If that will be necessary, I can also add an option for that (qvm-start-daemon knows very well if given pacat process is for the domain itself or its stubdomain). |
I thought the problem was with PulseAudio vs PipeWire in the guest. |
I have an alternative idea - instead of changing buffer parameters, change how much data is read from the vchan, using this:
|
@marmarek The code already does this for reads initiated by vchan activity. |
Hmm, I'm trying to get what is the size (nbytes) parameter to the write callback. In our code it's called |
Checking the source indicates that it is the same value that |
Ok, then it wouldn't change anything. |
But in that case, my change tlength->maxlength shouldn't matter for playback. But as the failing recording test shows, it does matter for recording, in negative way. I'll rollback the change, but maybe make tlength configurable? Oh, and also, fragsize 4096 is incompatible with PA's agent 2048 vchan size, and it works only due to blocking vchan calls. I'll try changing to 2048 and see if that helps. |
or better change PA's agent vchan to 4096 ? |
In any case, this shows rather clearly that prebuf/tlength handling in Pulseaudio is broken... |
Log timing information about playback stream, used for debugging underruns.
da6af33
to
ffb51fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strtoul()
is a terrible API, sorry…
pulse/pacat-simple-vchan.c
Outdated
case 't': | ||
bufattr = &custom_bufattr; | ||
errno = 0; | ||
custom_bufattr.tlength = strtoul(optarg, &endptr, 0); | ||
if (*endptr || errno == ERANGE) | ||
err(1, "Invalid -t argument: %s", optarg); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case 't': | |
bufattr = &custom_bufattr; | |
errno = 0; | |
custom_bufattr.tlength = strtoul(optarg, &endptr, 0); | |
if (*endptr || errno == ERANGE) | |
err(1, "Invalid -t argument: %s", optarg); | |
break; | |
case 't': | |
bufattr = &custom_bufattr; | |
errno = 0; | |
unsigned long p = strtoul(optarg, &endptr, 0); | |
if (*endptr) | |
errx(1, "Invalid -t argument: %s", optarg); | |
if (p > UINT32_MAX) | |
errno = ERANGE; | |
if (errno) | |
err(1, "Invalid -t argument: %s", optarg); | |
break; |
ffb51fa
to
6dce3bf
Compare
PipeWire audio agent synchronizes to pacat-simple-vchan rate of taking data out of vchan and wants to ensure it never ends empty. Pulseaudio and PipeWire seems to handle them differently (looks like Pulseaudio is less accurate in this regard). With PipeWire in dom0, it is closer to what is configured, and theoretically could lead to underruns. Make it configurable (beyond just disabling low latency mode), so it's easier to mitigate issues. Note also that PA_STREAM_ADJUST_LATENCY is enabled, so the tlength avalue includes hardware buffer size, which may differ from system to system. And also, tlength affects default prebuf value. While at it, reduce fragsize for recording. Pulseaudio agent (used in older distributions, and in a stubdomain) uses vchan of 2048 bytes, so record stream chunked into 4096 doesn't fit. It works only because pacat-simple-vchan uses blocking vchan writes, but that may negatively impact parallel playback stream. And also, smaller fragsize may reduce latency (although it was good enough already). QubesOS/qubes-issues#8955
6dce3bf
to
0be59e2
Compare
Or maybe another issue with tests? I found this in the log:
So, recording was enabled quite late in the test, no wonder it recorded shorter sound than intended (or nothing at all). But also, there was a few underruns, might be relevant. |
The HVM recording on debian-12 seems to be a pipewire bug (see QubesOS/qubes-issues#8590). It's fixed in newer pipewire (and so the test passes on fedora template). It was "working" before because the test was broken (and I got it fixed in QubesOS/qubes-core-admin#581). I'll exclude it for now. Remaining thing to check is recording on whonix-workstation template - which runs pulseaudio (not pipewire-pulse). |
I'm struggling with this case. It looks like with pipewire in dom0 it fails, but with pulseaudio in dom0 it works. More specifically, the failing setup is:
I've tried changing various parameters, like fragsize (both down and up), and also updated PA agent inside stubdomain (so it uses 8KiB vchan buffer instead of 2KiB). Nothing helped. What does help is switching dom0 audio daemon back to pulseaudio. The test passes also with (new, fedora-38) pipewire running inside the qube. The result is some missing samples, mostly in the middle. And the test fails with: The specific setup above is unusual enough (especially the nested virt) that it shouldn't matter in theory. But very obvious issues in this setup usually correlated with more subtle issues on more usual configurations. And it's usually easier to debug more obvious issue. What I worry here, is that switching audio daemon in dom0 will regress audio recording for non-Linux HVMs (or even Linux, but without qubes packages). Any ideas what else I can try? |
|
That is not the point... I fear what we see here is a symptom of broken audio in HVM in general, that will manifest in other cases, maybe even not Linux.
The former I don't like, due to significantly higher memory usage for PipeWire, which is even more critical in stubdomain. The latter might be an option in theory, but given how "well" it went with pipewire agent implementation, I'm skeptical we will improve the situation this way. |
Oh, and if I retry the test few more times, it fails with new pipewire in the VM too. But after switching to pulseaudio in dom0 it works every time. |
As far as Qubes-Whonix is concerned, yes, very much so. This is unspecific to Qubes-Whonix. (Non-Qubes-Whonix is already on PipeWire.) qubes-whonix |
So I think the test matrix is:
Which of these 32 possibilities are good and which are bad? Is there an openQA run where I can look at the results of all of these? My understanding is:
Is this accurate? I’m somewhat confused as to what is happening here, which is why I am asking for the results in all 32 possible cases. |
I think it should be possible to collect all the combinations. Take a most recent run: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024022504-4.2&flavor=pull-requests click on vm_qrexec_gui tests
"rec" or "play" in test name
recent runs use pipewire in dom0; for pulseaudio in dom0 you need an earlier run, for example this: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024021900-4.2&groupid=11; but note the audio_rec_unmuted_hvm test wasn't fixed at that time, so it's skipped in many cases)
the one with "@hw1" is on bare hardware, others are nested virt
"hvm" in test name is emulated, others are qubes pv
template:
Especially for nested virt, check few jobs (the whole vm_qrexec_gui suite is ran several times with different storage, so you have quite a few data points) - a single passing instance doesn't necessarily mean it's good. Failures are collected in qubesos-bot message in relevant PR (this included), you can check edits history for older reports - that's usually quicker way than manually clicking on all the jobs. Oh, and one more thing, failure "only silence detected" is not interesting - it's either buggy test, or the known issue in the old pipewire.
For PV audio - yes, both bare hw and nested virt is good. And the single underrun at the start is not a problem (looks like an artifact prebuf kicking in), so that PR isn't needed IMO.
Large latencies and a lot of underruns for several seconds after starting playback on nested virt (and some less on bare hw, but still problematic). Neither of those problem affect dom0 PA + domU PA setup, only dom0 PA + domU PW is affected.
Yes.
Recording using emulated audio is affected. And on the very last run, I see a single failure of emulated playback (with pulseaudio in domU). |
I’d still like to eliminate the underrun, mostly because it can hide real underruns (not due to prebuf) which indicate actual problems. I vaguely remember this being a problem when developing the PipeWire agent. |
What I don't like about that PR is second-guessing what dom0 part is doing, including some variable parameters (like buffer size to fill in). That will lead to issues sooner or latter (for example, if we'd reduce pacat buffer sizes at some point in the future, the prefill PR will cause overruns at start). If pacat needs to delay uncork until at least some amount of data is collected, it's the job of pacat to do that, not domU agent. And the prebuf feature does exactly that. I'm not sure why I haven't managed to get it working with real Pulseaudio (it could be a bug in PA, but very well it could be also some other bug in pacat), but with PipeWire in dom0 it seems to work correctly. |
Related to QubesOS/qubes-issues#8955