Skip to content

Conversation

dscho
Copy link
Member

@dscho dscho commented Aug 13, 2025

As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted objects. But that's exactly what we want to do in VFS for Git. This is even tested for in the functional tests of VFS for Git, which has been identified as a regression in microsoft/VFSForGit#1853.

@dscho dscho requested a review from mjcheetham August 13, 2025 13:46
@dscho dscho self-assigned this Aug 13, 2025
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
I am letting GitHub Actions build a test version for that PR as I am
writing this.

Signed-off-by: Johannes Schindelin <[email protected]>
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
Let's let it run on the `tmp` branch.

Signed-off-by: Johannes Schindelin <[email protected]>
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
I am letting GitHub Actions build a test version for that PR as I am
writing this.

Signed-off-by: Johannes Schindelin <[email protected]>
derrickstolee
derrickstolee previously approved these changes Aug 13, 2025
Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is there any way we can test this without a hop to microsoft/vfsforgit?

@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

LGTM. Is there any way we can test this without a hop to microsoft/vfsforgit?

Hmm. Valid point. The sad news is that the VFS for Git test case was introduced in microsoft/VFSForGit@2db0c03, which is one of those absurdly big and lamentably under-documented commits that I frequently use as examples why code changes really need to be crafted with skill.

But I realize that I even failed to target the correct commit with the fixup! in this PR, as there simply is no commit that actually deals with corrupt loose objects. Will reconsider.

If the `read-object` hook is not found, the code currently would fail
with the rather obscure message:

	fatal: Out of memory, strdup failed

The reason is because without a check whether the hook was found,
eventually `strvec_push()` would try to call `xstrdup()` with `cmd`
(which is `NULL` if the hook was not found) which would fail and print
that rather misleading message.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho force-pushed the do-re-download-corrupt-objects-with-gvfs branch from fd34fe0 to cda6dd6 Compare August 13, 2025 19:29
@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

@derrickstolee thank you for helping me be a better engineer. I thought that I had a really good reproducer, but I hadn't; My initial draft of a test case reused the corrupt_byte function of t1060, which replaces the 10th byte by a NUL, but that would not trigger the same code path as VFS for Git's previously-failing test case, and would actually not require any fix to let the read-object hook re-download the corrupt object. Then I tried to replace the loose object by a 0-sized file, but same thing: The code path was lenient enough to give the read-object hook a job and everything worked without the fix. In the end, I painstakingly imitated the VFS for Git approach, where the loose object file is truncated by 8 bytes. That did it.

This experience of trying to find a way to trigger the same code path that does require the fix, however, kind of makes me think that 9e59b38, despite the claims in its commit message, is doing an incomplete job of reporting corrupted objects. Oh well. Maybe some day, when I have some free time...

In the meantime, @derrickstolee, could I solicit another review, please?

@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

Ach, darn it. macOS' sed strikes again.

As of 9e59b38 (object-file: emit corruption errors when detected,
2022-12-14), Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git via the
`read-object` hook, as per the `GitCorruptObjectTests` code
added in microsoft/VFSForGit@2db0c030eb25 (New
features: [...] -  GVFS can now recover from corrupted git object files
[...] , 2018-02-16).

So let's support precisely that, and add a regression test that ensures
that re-downloading corrupt objects via the `read-object` hook works.

While at it, avoid the XOR operator to flip the bits, when we actually
want to make sure that they are turned off: Use the AND-NOT operator for
that purpose.

Helped-by: Matthew John Cheetham <[email protected]>
Helped-by: Derrick Stolee <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho force-pushed the do-re-download-corrupt-objects-with-gvfs branch from cda6dd6 to db6f8f4 Compare August 13, 2025 20:05
@dscho dscho marked this pull request as ready for review August 13, 2025 22:09
@dscho dscho merged commit b65c8e1 into vfs-2.50.1 Aug 13, 2025
180 of 182 checks passed
@dscho dscho deleted the do-re-download-corrupt-objects-with-gvfs branch August 13, 2025 22:09
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
@dscho dscho mentioned this pull request Aug 13, 2025
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
@dscho dscho mentioned this pull request Aug 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants