Non-Ascii files get renamed #365

Closed
jakobkogler opened this issue Nov 16, 2015 · 3 comments

@jakobkogler

I have to work with a Git repository that contains some files with non-ASCII filenames, and this causes problems with GitPython. I suspect the behavior I'm about to describe is a bug.

Let's assume we have a Git repository. It contains one file named Äbc.txt, which is already staged and committed, and another file named test.txt, which is not yet staged.

The command git status prints the following:

On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    test.txt

When I add the file test.txt to the index with

from git import Repo
repo = Repo()
repo.index.add(['test.txt'])

and afterwards call git status, I receive the following message:

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   test.txt
    renamed:    "\303\204bc.txt" -> "\303\204bc.tx"

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    deleted:    "\303\204bc.tx"

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    "\303\204bc.txt"

What happened here? This doesn't occur if I call git add test.txt. Why does a file that I didn't add to the index get renamed?
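For readers puzzled by the quoted names in the git status output: git prints non-ASCII path bytes as octal escapes when core.quotePath is enabled. The sketch below decodes such a name back to text. The helper name unquote_git_path is made up for illustration, and it assumes the path is UTF-8 and handles only octal escapes (not \", \\ or \t).

```python
import re

# Matches one octal byte escape such as \303 (values 000 to 377).
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unquote_git_path(quoted: str) -> str:
    """Decode a path as printed by `git status` with core.quotePath on.

    Hypothetical helper: assumes UTF-8 paths and only octal escapes.
    """
    if not (quoted.startswith('"') and quoted.endswith('"')):
        return quoted  # unquoted paths are printed verbatim
    raw = quoted[1:-1].encode("ascii")
    # Replace each \NNN escape with the single byte it stands for.
    raw = _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(unquote_git_path(r'"\303\204bc.txt"'))
    print(unquote_git_path(r'"\303\204bc.tx"'))
```

So "\303\204bc.txt" is the committed file Äbc.txt, and "\303\204bc.tx" is the same name with its last character missing.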

@Byron
Member

Byron commented Nov 30, 2015

This seems to be an encoding issue, and I wonder what your OS and locale are. On OSX, you can get the locale like so:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

I could imagine something like that happening once there are characters that can't be represented in the ASCII encoding. A capital A can easily be encoded using that method, and should end up being a single byte in UTF-8, which is the default encoding.
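As a side note on the encoding point: the file in the report is named Äbc.txt, and Ä (unlike a plain A) takes two bytes in UTF-8. The sketch below shows that cutting the encoded name to its character count drops the final byte, which matches the "Äbc.tx" seen in the report. This is only one plausible reading of the symptom, not something confirmed in this thread, and the helper name truncate_to_char_count is invented for illustration.

```python
def truncate_to_char_count(name: str) -> str:
    """Cut the UTF-8 bytes of `name` to its length in characters.

    Hypothetical illustration of a character/byte length mix-up;
    not taken from GitPython's code.
    """
    encoded = name.encode("utf-8")
    # len(name) counts characters, but the slice below counts bytes.
    return encoded[: len(name)].decode("utf-8", errors="replace")


if __name__ == "__main__":
    name = "\u00c4bc.txt"  # "Äbc.txt" with a precomposed Ä
    print(len(name))                     # number of characters
    print(len(name.encode("utf-8")))     # number of bytes
    print(truncate_to_char_count(name))  # name with one byte cut off
```

A pure-ASCII name such as test.txt has equal character and byte lengths, so it would pass through such a mix-up unchanged.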

Additionally, please be sure you use the latest version of GitPython, which is the master branch of this repository.

@jakobkogler
Author

I installed the version from the master branch and it fixes this problem. I guess you can close this issue. Sorry for wasting your time.

In case you're interested, my system runs Ubuntu 14.04 (trusty) with the following locale:

$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_AT.UTF-8
LC_TIME=de_AT.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_AT.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_AT.UTF-8
LC_NAME=de_AT.UTF-8
LC_ADDRESS=de_AT.UTF-8
LC_TELEPHONE=de_AT.UTF-8
LC_MEASUREMENT=de_AT.UTF-8
LC_IDENTIFICATION=de_AT.UTF-8
LC_ALL=
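For comparison with the shell output above, the encodings that Python itself derives from the locale can be inspected directly. A minimal sketch using only the standard library; the helper name report_encodings is made up for illustration, and the values it returns depend on the machine it runs on.

```python
import locale
import sys


def report_encodings() -> dict:
    """Collect the encodings Python uses for filenames and text I/O."""
    return {
        # Encoding used to convert filenames between str and bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding the locale suggests for text data.
        "preferred": locale.getpreferredencoding(False),
        # Encoding of str literals when no codec is given.
        "default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for key, value in report_encodings().items():
        print(f"{key}: {value}")
```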

@Byron
Member

Byron commented Dec 7, 2015

Great to hear, and thanks for letting me know!
