Encoding problems with non-ASCII characters with Git 2.44.0 #4851
I can reproduce this, and work around it by using
$ MSYS=disable_pcon git add -p
diff --git a/test.txt b/test.txt
index a3ea8e6..0e54281 100644
--- a/test.txt
+++ b/test.txt
@@ -1 +1,2 @@
äöü
+change
(1/1) Stage this hunk [y,n,q,a,d,e,?]?
$ MSYS=enable_pcon git add -p
diff --git a/test.txt b/test.txt
index a3ea8e6..0e54281 100644
--- a/test.txt
+++ b/test.txt
@@ -1 +1,2 @@
äöü
+change
(1/1) Stage this hunk [y,n,q,a,d,e,?]?
I can also work around it by calling
$ cmd //c chcp 65001
Active code page: 65001
me@work MINGW64 ~/repros/umlauts-4851 (main)
$ MSYS=enable_pcon git add -p
diff --git a/test.txt b/test.txt
index a3ea8e6..0e54281 100644
--- a/test.txt
+++ b/test.txt
@@ -1 +1,2 @@
äöü
+change
(1/1) Stage this hunk [y,n,q,a,d,e,?]?
I guess that we'll want to always change the code page to 65001 in the |
@dscho
$ MSYS=disable_pcon git add -p
diff --git a/file.txt b/file.txt
index a3ea8e6..d88e171 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
├ñ├Â├╝
+Ôé¼
(1/1) Stage this hunk [y,n,q,a,d,e,?]?
|
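The garbled hunk above matches what UTF-8 bytes look like when they are decoded with a legacy OEM code page. The following is a minimal sketch, assuming the console's active code page was 850; the thread does not state which code page was in effect, and `misread_utf8` is a hypothetical helper name used only for illustration.

```python
# Assumption: the console's active code page was 850 (not stated in the thread).
ASSUMED_CODE_PAGE = "cp850"


def misread_utf8(text: str, code_page: str = ASSUMED_CODE_PAGE) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were in code_page."""
    return text.encode("utf-8").decode(code_page)


if __name__ == "__main__":
    for original in ("äöü", "€"):
        # Prints the mojibake a console would show for each string.
        print(f"{original!r} -> {misread_utf8(original)!r}")
    # 'äöü' -> '├ñ├Â├╝'
    # '€' -> 'Ôé¼'
```

Under that assumption the result reproduces both garbled lines of the hunk, which suggests that Git's UTF-8 output was being interpreted in the console's code page rather than as UTF-8.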
I added |
In Chinese, VS Code & CMD, same problem!!! |
Same behavior with Git 2.45.0. |
@inosik I have to be honest: due to shifts in priorities at my day job, I am stretched a little too thin to work on this. Maybe you can? It would require a little bit of C++ knowledge (not C# or F#) to work on the MSYS2 runtime, which is a bit tricky to navigate, but I'd try my best to assist with guidance. |
I never did any C++ coding, but I could give it a shot. Can you tell me where I should start looking? |
@inosik the first thing would not even be C++, but to verify that dropping the |
Moving |
In #4700, I introduced a change in Git for Windows' behavior where it would favor recent Windows 10 versions' native ANSI sequence processing over [Git for Windows' home-grown one](https://github.com/git-for-windows/git/blob/v2.45.1.windows.1/compat/winansi.c#L362-L439).

What I missed was that the home-grown processing _also_ ensured that text written to the Win32 Console was carefully converted from UTF-8 to UTF-16 encoding, while the native ANSI sequence processing would respect the currently-set code page. However, Git for Windows does not use the current code page at all, always using UTF-8 encoded text internally.

So let's make sure that the code page is `CP_UTF8` when Git for Windows uses the native ANSI sequence processing.

This fixes #4851.
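The idea described above, making sure the console output code page is UTF-8 before UTF-8 bytes are written to it, can be sketched outside of Git. This is not the actual patch, which lives in Git for Windows' C code. It is a rough Python illustration that calls the Win32 functions `GetConsoleOutputCP` and `SetConsoleOutputCP` through `ctypes`; the function name `ensure_utf8_console_output` is hypothetical.

```python
import ctypes
import sys

CP_UTF8 = 65001  # Win32 identifier of the UTF-8 code page


def ensure_utf8_console_output() -> bool:
    """Try to switch the console output code page to UTF-8.

    Returns True if the output code page is UTF-8 afterwards, and False
    if the switch failed or the platform has no Win32 console.
    """
    if sys.platform != "win32":
        return False  # console code pages only exist on Windows
    kernel32 = ctypes.windll.kernel32
    if kernel32.GetConsoleOutputCP() == CP_UTF8:
        return True  # already UTF-8, nothing to change
    # SetConsoleOutputCP returns nonzero on success.
    return bool(kernel32.SetConsoleOutputCP(CP_UTF8))


if __name__ == "__main__":
    print("UTF-8 console output:", ensure_utf8_console_output())
```

This sketch only changes the output code page and does not restore the previous one on exit, whereas `chcp 65001` changes the input code page as well. A real program may need to handle both.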
When Git for Windows v2.44.0 introduced the ability [to use native Win32 Console ANSI sequence processing](git-for-windows/git#4700), an inadvertent fallout was that in this instance, [non-ASCII characters were no longer printed correctly unless the current code page was set to 65001](git-for-windows/git#4851). This bug [has been fixed](git-for-windows/git#4968).

Signed-off-by: gitforwindowshelper[bot] <[email protected]>
I've downloaded the snapshot from yesterday (2024-05-26), and this bug seems to be fixed for me. Thank you @dscho! I've also tried it inside a Windows 10 VM, which is a somewhat different setup from my actual machine. In the VM, the bug still seems to be reproducible, but that might be caused by something else. Here's a screenshot: |
Looks like the diff suggests ISO-8859-1 encoding. You may need to specify the |
There's something wrong w.r.t. encoding in Git Bash. Non-ASCII characters (German Umlauts in my case) appear garbled in the terminal output.
Git 2.43.0:
Git 2.44.0:
Setup
defaults?
I installed Git using Scoop, which just extracts PortableGit-2.44.0-64-bit.7z.exe.
to the issue you're seeing?
Nothing I can think of.
Details
Bash inside of the Windows Terminal App. But this is also reproducible using CMD, PowerShell and git-bash.exe.
Minimal, Complete, and Verifiable example
this will help us understand the issue.
Non-ASCII characters should be displayed properly.
Non-ASCII characters are garbled.
URL to that repository to help us with testing?
No specific repository, but here's a Gist with the test file.