You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When directly fetching a ref without specifying a destination ref. E.g. repo.remotes.origin.fetch(["master"]) instead of repo.remotes.origin.fetch(["refs/heads/master:refs/remotes/origin/master"]) git will produce a different amount of ref-update lines on stderr than in .git/FETCH_HEAD. For this example it would produce both these lines:
.git/FETCH_HEAD however will (kind of obviously) not contain anything to match the -> FETCH_HEAD lines (there can be multiple, more on that later).
037d62a9966743cf7130193fa08d5182df251b27 branch 'master' of https://github.com/gitpython-developers/GitPython
This mismatch causes the truncation code from 1537aab to kick in. Unfortunately, because the mismatched output (extra -> FETCH_HEAD lines that don't appear in the FETCH_HEAD file itself) appear in the front this truncation code leaves the stderr lines and .git/FETCH_HEAD lines mismatched. In this particular use case that's easily remedied by truncating the front of the list instead of the back:
diff --git a/git/remote.py b/git/remote.py
index 65916614..72c15ddc 100644
--- a/git/remote.py+++ b/git/remote.py@@ -698,9 +698,9 @@ class Remote(LazyMixin, Iterable):
log.debug("info lines: " + str(fetch_info_lines))
log.debug("head info : " + str(fetch_head_info))
if l_fil < l_fhi:
- fetch_head_info = fetch_head_info[:l_fil]+ fetch_head_info = fetch_head_info[l_fhi - l_fil :]
else:
- fetch_info_lines = fetch_info_lines[:l_fhi]+ fetch_info_lines = fetch_info_lines[l_fil - l_fhi :]
# end truncate correct list
# end sanity check + sanitization
But I'm unsure about submitting this patch as a PR because it might end up causing problems in case that I'm unaware of, where truncating from the end is the correct thing to do (or maybe fetch_head_info should be truncated at the back and only fetch_info_lines at the front...).
Additionally the -> FETCH_HEAD lines seem to appear on stderr once for every refspec without a dst segment (given the refspec syntax [+]<src>[:<dst>] from git help fetch). But, when fetching the same refspec without dst twice, the corresponding dd3cdfc9..037d62a9 master -> origin/master line that signals the update of the local remote ref (.git/refs/remotes/*) appears only once on stderr while it does generate two lines in .git/FETCH_HEAD.
As a result of this I'm convinced that the attempt to try to match the stderr output to the produced .git/FETCH_HEAD is unlikely to be made to work in every scenario...
FYI: for me it's more important that the return value of remote.fetch has each of its .ref members correspond one-to-one with FETCH_HEAD. The above patch would accomplish that ifflen(fetch_head_info) <= len(fetch_info_lines). So it's at least an improvement for my needs as I would never get the wrong info/mismatched info. But I might still miss info if there are less lines discovered on stderr than in .git/FETCH_HEAD. So my preference would be to still keep processing .git/FETCH_HEAD lines, even when we've ran out of stderr lines.
So in short: my naive expectation of remote.fetch is that it returns one FetchInfo entry per ref entry in .git/FETCH_HEAD. All the extra metadata about those ref entries I see as nice to have and it would be acceptable for me to not have access to that under some circumstances.
The text was updated successfully, but these errors were encountered:
Thanks so much for taking the time to investigate the implementation!
I couldn't agree more that there isn't always a perfect solution to obtain information on which refs got updated, so the only thing one could hope here is to make it a little less imperfect.
Even though I would definitely love to see this improvement in mainline, I also agree with the sentiment that it might break in unpredictable ways.
Maybe there is a middle ground - would it be possible to factor the ref handling into its own implementation and allow others to inject their own? Or to make it a little less complex, provide another method to fetch/pull but with a different algorithm to allow testing in the wild?
That way the existing algorithm could remain stable for now without being in the way of progress.
I like your idea for making it possible to replace the ref-handling implementation. That should allow some experimentation with this to incrementally get a better implementation instead of having to get it right immediately.
I think that it's better to parametrize the ref-handling implementation than to add an extra method.
Great to hear! Yes, going ahead with parameterizing the method should do just fine. Once the new way of doing the work is proven to work, it could even become the default one day.
When directly fetching a ref without specifying a destination ref. E.g.
repo.remotes.origin.fetch(["master"])
instead ofrepo.remotes.origin.fetch(["refs/heads/master:refs/remotes/origin/master"])
git will produce a different amount of ref-update lines on stderr than in.git/FETCH_HEAD
. For this example it would produce both these lines:.git/FETCH_HEAD
however will (kind of obviously) not contain anything to match the-> FETCH_HEAD
lines (there can be multiple, more on that later).This mismatch causes the truncation code from 1537aab to kick in. Unfortunately, because the mismatched output (extra
-> FETCH_HEAD
lines that don't appear in theFETCH_HEAD
file itself) appear in the front this truncation code leaves thestderr
lines and.git/FETCH_HEAD
lines mismatched. In this particular use case that's easily remedied by truncating the front of the list instead of the back:But I'm unsure about submitting this patch as a PR because it might end up causing problems in case that I'm unaware of, where truncating from the end is the correct thing to do (or maybe
fetch_head_info
should be truncated at the back and onlyfetch_info_lines
at the front...).Additionally the
-> FETCH_HEAD
lines seem to appear onstderr
once for every refspec without adst
segment (given the refspec syntax[+]<src>[:<dst>]
fromgit help fetch
). But, when fetching the same refspec withoutdst
twice, the correspondingdd3cdfc9..037d62a9 master -> origin/master
line that signals the update of the local remote ref (.git/refs/remotes/*
) appears only once onstderr
while it does generate two lines in.git/FETCH_HEAD
.As a result of this I'm convinced that the attempt to try to match the
stderr
output to the produced.git/FETCH_HEAD
is unlikely to be made to work in every scenario...FYI: for me it's more important that the return value of
remote.fetch
has each of its.ref
members correspond one-to-one withFETCH_HEAD
. The above patch would accomplish that ifflen(fetch_head_info) <= len(fetch_info_lines)
. So it's at least an improvement for my needs as I would never get the wrong info/mismatched info. But I might still miss info if there are less lines discovered onstderr
than in.git/FETCH_HEAD
. So my preference would be to still keep processing.git/FETCH_HEAD
lines, even when we've ran out ofstderr
lines.So in short: my naive expectation of
remote.fetch
is that it returns oneFetchInfo
entry perref
entry in.git/FETCH_HEAD
. All the extra metadata about thoseref
entries I see as nice to have and it would be acceptable for me to not have access to that under some circumstances.The text was updated successfully, but these errors were encountered: