Hey @Byron,
just another UnicodeDecodeError here :) this time in the filename of a diff object. Originally opened here.
This is the culprit.
You can repro with this code:
from git import Repo
r = Repo("/tmp/brackets")
commit = r.commit("8b3ae041041dfeecd059c2b19c72e76223e501d3")
diff = commit.parents[0].diff(commit, create_patch=True)
The error:
  File ".../tmp.py", line 13, in <module>
    diff = commit.parents[0].diff(commit, create_patch=True)
  File ".../venv/lib/python3.8/site-packages/git/diff.py", line 145, in diff
    index = diff_method(self.repo, proc)
  File ".../venv/lib/python3.8/site-packages/git/diff.py", line 455, in _index_from_patch_format
    index.append(Diff(repo,
  File ".../venv/lib/python3.8/site-packages/git/diff.py", line 282, in __init__
    if submodule.path == a_rawpath.decode("utf-8"):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 54: invalid continuation byte
The error is on this line: we don't decode the filename correctly.
There are several possible fixes. Which one do you prefer?
I saw that the same file already uses the 'replace' error handler in many places (like here). I can open a PR with this change if you think it's a good idea.
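To make the proposal concrete, here is a minimal sketch of the difference between strict decoding and the 'replace' error handler. The filename is a made-up stand-in (a Latin-1 encoded "café"), not the actual path from the brackets repository, and this is not GitPython's code:

```python
# Hypothetical raw path: "café" encoded as Latin-1, so the byte 0xe9 is not valid UTF-8.
raw_path = b"docs/caf\xe9.txt"

# Strict decoding (the default) fails, as in the traceback above.
try:
    raw_path.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

# With the 'replace' handler the undecodable byte becomes U+FFFD and no exception is raised.
lenient = raw_path.decode("utf-8", "replace")

print(strict_error)
print(lenient)
```

The trade-off is that the result is lossy: the original byte cannot be recovered from the replaced string, so the decoded path may no longer match the file on disk.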
A PR with a similar fix would be appreciated as at least it would be consistent. However, if there is a better way (without breaking backwards compatibility), that should certainly be considered as well. To my mind the right way to deal with this is to not actually assume any encoding but to work with bytes only. gitoxide is getting that right as you can imagine :D.
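As a rough illustration of the bytes-only idea, one could compare paths in the bytes domain instead of decoding the raw path. The helper name `paths_match` and the sample filenames below are hypothetical and only meant as a sketch, assuming the known path is a `str` that should be compared as UTF-8:

```python
def paths_match(known_path: str, raw_path: bytes) -> bool:
    """Hypothetical helper: encode the known str path rather than decoding raw bytes."""
    # Encoding a normal str to UTF-8 succeeds for any valid text,
    # so the raw bytes from git are never decoded and cannot raise here.
    return known_path.encode("utf-8") == raw_path


utf8_raw = "docs/café.txt".encode("utf-8")
latin1_raw = "docs/café.txt".encode("latin-1")  # contains the byte 0xe9

print(paths_match("docs/café.txt", utf8_raw))
print(paths_match("docs/café.txt", latin1_raw))
```

A path in a different encoding simply compares as unequal rather than raising, which is the behaviour a bytes-only design would give throughout.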
Indeed, working with bytes instead of Unicode strings would be best, especially now that we have dropped py2 support. Also, git returns bytes, I think, so instead of decoding them we could just hand the bytes back to users.
Though I think this would break pretty much everything, tests and users alike (adding a 'b' in front of every string in every test might not be super complicated, but still... 😄). For now I will just add the "replace" flag to the decoding.
Great, thanks for the fix.
Indeed, changing everything to bytes would be best, but it would probably also force quite a lot of rework on every user of GitPython, which makes me hesitant to go down that road.
If everything breaks, maybe replace GitPython with something considerably better.