Skip to content

Conversation

@jc-kynesim
Copy link
Contributor

The deinterlacer requires a height alignment of 16 so add it to the list of roles that get fixed up.

Signed-off-by: John Cox [email protected]

The deinterlacer requires a height alignment of 16 so add it to the list
of roles that get fixed up.

Signed-off-by: John Cox <[email protected]>
@popcornmix
Copy link
Collaborator

How is this failing without the change?
Does it reject the non-aligned height, or does it deinterlace with a geometry you are not expecting?
Are you having the issue on the input or output side?

I have a suspicion that ffmpeg produces non-16 aligned height images and we are currently deinterlacing those fine, and this change may stop that working. But I'd need to check to be sure.

@6by9
Copy link
Contributor

6by9 commented Sep 20, 2022

I'll let John confirm, but he was saying that setting 1920x1080 was failing as V4L2 accepted it at S_FMT, but the firmware then rejected it.

Checking the firmware, it does claim to only need an alignment to a multiple of 2 lines, so I guess something else is at play.
There's a patch from popcornmix that changes the advertised alignment.

   if(!(width > 0 &&
        height > 0 &&
        ((*slice & 15) == 0 || *slice >= ALIGN_UP(height, 16)) /*&&
        (*stride) == pitch*/))
      fmt = NULL;

So the slice height must either be a multiple of 16, or greater than the aligned up full height.
1080 doesn't comply with either requirement, and so is rejected.

@popcornmix
Copy link
Collaborator

The commit you reference says:

image_fx: Relax the vertical pitch requirement

Vector code often uses REP 16, so we traditionally require vertical pitch to be a multiple of 16, but any value higher is also fine. ffmpeg actually rounds height up to 16 then adds 2 for some codecs

So I think this commit is breaking this case.
I think the aligned height should only be rounded up if it is < ALIGN_UP(unaligned_height, 16) which should fix the firmware check, and not break ffmpeg's ALIGN_UP(height, 16) + 2 height.

@jc-kynesim
Copy link
Contributor Author

jc-kynesim commented Sep 21, 2022

I found the issue as I'd built an ISP resize component and the easiest way was to common it with the deinterlace code (apart from some extra parameter setting for the ISP the code is all but identical for the two operations). This (fairly obviously) required me to have the ability to set different sizes on src & dest. The old code copied src geometry to dest geometry, the new simply sets the desired dest size and relies on S_FMT to fix up anything that is wrong (which, as I understand it, is the V4L2 way - and makes sense too). Src is set before dest so if the filter needs identical src & dest then it has the opportunity to enforce that.

The S_FMTs both work - and so do the S_SELECTIONs which define the actual active area within the frame geometry. The call that fails is the REQBUFS on the dest (which I do before src but I can't see why that would matter). It fails with an EINVAL because (according to the V4L2 logging) the firmware rejects the src port format set. This shouldn't happen. S_FMT should not be able to create a format that won't be accepted by the firmware.

I've tested the above fix against my new code and it does work.

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

The call that fails is the REQBUFS on the dest (which I do before src but I can't see why that would matter). It fails with an EINVAL because (according to the V4L2 logging) the firmware rejects the src port format set. This shouldn't happen. S_FMT should not be able to create a format that won't be accepted by the firmware.

REQBUFS calls bcm2835_codec_queue_setup, which creates the component and sets up the ports.
The V4L2 S_FMT handling is meant to match that of the firmware, but as it doesn't necessarily have a component/port to set it has to replicate that logic accurately, and it does vary between components.
In this cases it seems that is not an accurate duplicate of the logic, so at the point that the format is set on the firmware, it fails.

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

	if (ctx->dev->role == DECODE ||
	    ctx->dev->role == ENCODE_IMAGE) {
		f->fmt.pix_mp.height = ALIGN(f->fmt.pix_mp.height, 16);
	} else if (ctx->dev->role == INTERLACE) {
		if (f->fmt.pix_mp.height < ALIGN(f->fmt.pix_mp.height, 16)
			f->fmt.pix_mp.height = ALIGN(f->fmt.pix_mp.height, 16);
	}

I think that caters for your case, and still allows FFmpeg's ALIGN_UP(height, 16) + 2.

@jc-kynesim
Copy link
Contributor Author

Ummm... I may be being thick but surely if f->fmt.pix_mp.height is a multiple of 16 then ALIGN(f->fmt.pix_mp.height, 16) == f->fmt.pix_mp.height and that is the only case where the "if" will not be true and so the above code is equivalent to doing the ALIGN unconditionally?

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

Ah, I want to be looking at the selection height, not the buffer height.

Except that of course comes in through a totally different API (s_selection). How to rationalise that isn't clear (you could trim down the selection during s_selection, but that then leaves a very odd situation).

@jc-kynesim
Copy link
Contributor Author

That is, at best, tricky. S_FMT must, I think, happen before S_SELECTION and must return the numbers that are actually going to be used.

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

That is, at best, tricky. S_FMT must, I think, happen before S_SELECTION and must return the numbers that are actually going to be used.

Indeed.
The best that can be done is to mangle the selection, except then userspace has to know to pad the S_FMT in order to get the S_SELECTION to work at the desired resolution.

@jc-kynesim
Copy link
Contributor Author

If you do S_FMT without S_SELECTION you will expect the passed width/height to be valid. S_SELECTION is not required before REQBUFS.

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

If you do S_FMT without S_SELECTION you will expect the passed width/height to be valid. S_SELECTION is not required before REQBUFS.

Indeed.
S_FMT is allowed to modify S_SELECTION too, so it would need to do that.
ie S_FMT 1920x1080 would result in fmt of 1920x1080, but selection of 1920x1072. Calling S_SELECTION 1920x1080 after that S_FMT would also return 1920x1072. Ugly, ugly, ugly!

@jc-kynesim
Copy link
Contributor Author

Going back to the original thing. I am not aware of any general requirement that ffmpeg frames are over allocated as align(height,16)+2, we certainly don't do that on h/w decode for either stateful or stateless. Some s/w decoders do that on their dest frames 'cos it is convenient to have a line of edge replication for inter prediction (I did that ages ago). The current deinterlacer doesn't do it either. So if all of that works then I think we are inventing problems.

@popcornmix
Copy link
Collaborator

See https://ffmpeg.org/doxygen/trunk/libavcodec_2utils_8c_source.html#l00141
for AV_PIX_FMT_YUV420P:
h_align = 16 * 2; // interlaced needs 2 macroblocks height
and
*height = FFALIGN(*height, h_align);
for H264:
*height += 2;

@jc-kynesim
Copy link
Contributor Author

The other thing that is worth mentioning is that as it stands v4l2m2m deinterlace & scale are drmprime only src & dest so their interaction with s/w components is quite limited. (deinterlace advertised s/w YUV420 support but didn't actually do it).

@jc-kynesim
Copy link
Contributor Author

That function is for returning appropriate buffer sizes s.t. that an external allocator can feed the correct sizes into an existing s/w decoders which often require greater allocs - they don't prescribe the sizes of anything else.
Now you could create a semi-plausible argument that I should use something similar for creation of dest buffers, but this is a filter and doesn't have an AVCodecContext. Briefly ignoring that and assuming that we did need to create such buffers, this is fixed by feeding the enlarged size into S_FMT and the actual size into S_SELECTION. V4L2 should not need to worry about ffmpeg oddities.

@popcornmix
Copy link
Collaborator

popcornmix commented Sep 21, 2022

I've just tested standard rpios bullseye, with kodi installed through apt and hw decode disabled.
Many interlaced files on TestMedia no longer play with this PR (e.g. Paw Patrol.ts, MONSTA_deinterlace_jaggies.mkv, h264-interlaced-sd.ts), but do play without this PR.

Note when playing successfully, kodi reports:

Video decoder: ff-h264-drm_prime (SW)
Pixel format: yuv420p
Deinterlace Method: deinterlace_v4l2m2m

@jc-kynesim
Copy link
Contributor Author

OK fair enough - how does decoder frame alloc occur in this scenario? I'm pretty sure that the deinterlace doesn't have an actual s/w frame import and requires a drmprime frame.
Maybe we could have different rules on src & dest? (I don't know how closely they need to match for deinterlace).

@popcornmix
Copy link
Collaborator

We use avcodec_align_dimensions2 to create the geometry of the frames ffmpeg decodes into (from a get_buffer2 callback).
That is the buffer that has the aligned_height+2 requirement and what is fed to the deinterlace component.

I don't believe I specify or care about the geometry of the deinterlace output - from kodi's view that is just a handle that just gets displayed.

@jc-kynesim
Copy link
Contributor Author

Well I don't know what the correct answer is. The current code is wrong - you shouldn't be able to have a successful S_FMT that causes REQBUFS to fail. Having selection limits that are less than S_FMT width/height is definitely unexpected even if legal.
Maybe the correct answer is for me to fix my ffmpeg code s.t. it works (I have no real issue with rounding up my dest sizes & I half thought it would be a good idea anyway), stick our fingers in our ears, go "la, la, la" and ignore the problem.

@6by9
Copy link
Contributor

6by9 commented Sep 21, 2022

The only fix to avoid REQBUFS failing is to "correct" the selection in S_SELECTION.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants