init: fails on 2nd run with NTFS on Linux (gitdb lib bug) #1880

Closed
villasv opened this issue Apr 12, 2019 · 18 comments
Assignees
Labels
bug (Did we break something?), help wanted, p3-nice-to-have (It should be done this or next sprint)

Comments

@villasv
Contributor

villasv commented Apr 12, 2019

DVC Version: 0.35.7

lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

Steps to reproduce:

git init
dvc init
rm -rf .dvc
dvc init
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.
ERROR: unexpected error - [Errno 13] Permission denied: '/mnt/c/Users/villasv/Projects/dvc-talk/.git/objects/obj80fx1hnj' -> '/mnt/c/Users/villasv/Projects/dvc-talk/.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391'

Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!
@efiop
Contributor

efiop commented Apr 12, 2019

@villasv Hm, very interesting. I am not able to reproduce this myself 🙁 Does this script still reproduce that for you?

#!/bin/bash

set -e
set -x

rm -rf myrepo
mkdir myrepo
cd myrepo

git init
dvc init
rm -rf .dvc
dvc init -v

Notice I've added -v to the last dvc init so we could hopefully have some more traceback.

Is there anything special about your scenario?

Thanks,
Ruslan

@villasv
Contributor Author

villasv commented Apr 15, 2019

I'm using Bash on WSL, so maybe something funny is happening at the filesystem level?
I was able to reproduce (by accident) with an old version as well, dvc==0.9.5 I think, so it's probably something with gitpython.

+ rm -rf myrepo
+ mkdir myrepo
+ cd myrepo
+ git init
Initialized empty Git repository in /mnt/c/Users/villasv/Projects/myrepo/.git/
+ dvc init
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|              https://dvc.org/doc/user-guide/analytics               |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: https://dvc.org/doc
- Get help and share ideas: https://dvc.org/chat
- Star us on GitHub: https://github.com/iterative/dvc
+ rm -rf .dvc
+ dvc init -v
ERROR: you are not inside of a dvc repository (checked up to mount point '/mnt/c')

Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!ERROR: you are not inside of a dvc repository (checked up to mount point '/mnt/c')

Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!

Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.
DEBUG: Trying to spawn '['/usr/bin/python3', '-m', 'dvc', 'daemon', '-q', 'updater']'
DEBUG: Spawned '['/usr/bin/python3', '-m', 'dvc', 'daemon', '-q', 'updater']'
ERROR: unexpected error - [Errno 13] Permission denied: '/mnt/c/Users/villasv/Projects/myrepo/.git/objects/obj4wr1vomg' -> '/mnt/c/Users/villasv/Projects/myrepo/.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391'
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/dvc/main.py", line 37, in main
    ret = cmd.run_cmd()
  File "/usr/local/lib/python3.5/dist-packages/dvc/command/init.py", line 22, in run_cmd
    ".", no_scm=self.args.no_scm, force=self.args.force
  File "/usr/local/lib/python3.5/dist-packages/dvc/repo/__init__.py", line 116, in init
    init(root_dir=root_dir, no_scm=no_scm, force=force)
  File "/usr/local/lib/python3.5/dist-packages/dvc/repo/init.py", line 88, in init
    scm.add([config.config_file])
  File "/usr/local/lib/python3.5/dist-packages/dvc/scm/git/__init__.py", line 158, in add
    self.git.index.add(paths)
  File "/usr/local/lib/python3.5/dist-packages/git/index/base.py", line 742, in add
    entries_added.extend(self._entries_for_paths(paths, path_rewriter, fprogress, entries))
  File "/usr/local/lib/python3.5/dist-packages/git/util.py", line 72, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/git/index/util.py", line 91, in set_git_working_dir
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/git/index/base.py", line 628, in _entries_for_paths
    entries_added.append(self._store_path(filepath, fprogress))
  File "/usr/local/lib/python3.5/dist-packages/git/index/base.py", line 597, in _store_path
    istream = self.repo.odb.store(IStream(Blob.type, st.st_size, stream))
  File "/usr/local/lib/python3.5/dist-packages/gitdb/db/loose.py", line 236, in store
    rename(tmp_path, obj_path)
PermissionError: [Errno 13] Permission denied: '/mnt/c/Users/villasv/Projects/myrepo/.git/objects/obj4wr1vomg' -> '/mnt/c/Users/villasv/Projects/myrepo/.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391'
------------------------------------------------------------

Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!
DEBUG: Analytics is enabled.
DEBUG: Trying to spawn '['/usr/bin/python3', '-m', 'dvc', 'daemon', '-q', 'analytics', '/tmp/tmpq0f9nw8i']'
DEBUG: Spawned '['/usr/bin/python3', '-m', 'dvc', 'daemon', '-q', 'analytics', '/tmp/tmpq0f9nw8i']'

@shcheklein
Member

@villasv what happens if you run git add .dvc/config after this exception? Could you also run ls -la on the last two git objects from the error (a link and an object):

'/mnt/c/Users/villasv/Projects/myrepo/.git/objects/obj4wr1vomg' -> '/mnt/c/Users/villasv/Projects/myrepo/.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391'

Do they have a regular user/group?

Is it possible that there are some security settings on Linux (this is Linux, right?) that prevent any process other than git itself from accessing these files? What do you think, @efiop?

@efiop
Contributor

efiop commented Apr 16, 2019

@shcheklein @villasv Judging by /mnt/c/Users, are you mounting your Windows NTFS partition? Or maybe even NFS? If so, something like fs.protected_symlinks is probably to blame. Though regular git should probably throw the same error as well.

@villasv
Contributor Author

villasv commented May 20, 2019

@shcheklein Nothing happens with git add, except the expected.

$ git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   .dvc/.gitignore
        new file:   .dvc/config

When I ls that link, it's not an actual link

$ ls -al .git/objects/obji_90ndko
-rwxrwxrwx 1 villasv villasv 15 May 20 13:54 .git/objects/obji_90ndko

Both files exist and have the same contents, but it's not a link.

@efiop It's a mounted NFS partition, though I'm not familiar with the details. WSL automounts C:/ inside /mnt/c.

@villasv
Contributor Author

villasv commented May 20, 2019

I dug into gitdb and this is the exact point of failure:

            # rename onto existing doesn't work on windows
            if os.name == 'nt':
                if isfile(obj_path):
                    remove(tmp_path)
                else:
                    rename(tmp_path, obj_path)
                # end rename only if needed
            else:
                rename(tmp_path, obj_path)
            # END handle win32

It seems they're attempting to handle NTFS, but doing so with an OS check. This fails for me because I have NTFS but under Linux :-/
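
To illustrate why an OS check cannot stand in for a filesystem check, here is a minimal Python sketch that looks up the filesystem type of a path from /proc/mounts-style text. The function name fs_type_for and the sample mount table are hypothetical and made up for illustration; real /proc/mounts output varies between systems, and the reported type is not necessarily the string "ntfs" even when the underlying volume is NTFS.

```python
def fs_type_for(path: str, mounts_text: str) -> str:
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    "<device> <mount point> <fs type> <options> <dump> <pass>".
    Limitation: mount points with escaped spaces (\\040) are not decoded.
    """
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Hypothetical sample, NOT copied from the reporter's machine.
SAMPLE_MOUNTS = (
    "rootfs / lxfs rw,noatime 0 0\n"
    "C: /mnt/c drvfs rw,noatime 0 0\n"
)

print(fs_type_for("/mnt/c/Users/villasv/Projects/myrepo", SAMPLE_MOUNTS))
print(fs_type_for("/home/user", SAMPLE_MOUNTS))
```

On a real Linux system the text could be read from /proc/mounts, but mapping the reported type back to "behaves like NTFS" would still need a lookup table, which is part of why such a check is awkward.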

@shcheklein
Member

That's sad. And looks like gitdb is not a very active project. I imagine it's not only about dvc init, right? @efiop do we have other operations that call git add internally? Does it make sense as a workaround to create an option that disables it? Or even catch the errors and write a warning if it's not critical?

@villasv
Contributor Author

villasv commented May 21, 2019

I'm still curious as to why this only happens the second time I run dvc init. I've been using DVC a lot and this is the first time I've encountered this gitdb bug, so working around this scenario could be enough.
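
One plausible explanation, consistent with the gitdb snippet quoted earlier in this thread: Git stores each object under a path derived from the SHA-1 of its content, so adding a file whose content was already stored targets an object file that already exists, and the rename then lands on an existing file. The object ID in the error message, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, is the ID Git assigns to an empty blob, which suggests the file being added was empty both times. A minimal Python sketch of the ID computation follows; the helper names are illustrative and are not gitdb's API.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Path of a loose object relative to .git/objects.

    The first two hex digits form the directory name, the rest the file name.
    """
    return object_id[:2] + "/" + object_id[2:]


empty_id = git_blob_id(b"")
print(empty_id)
print(loose_object_path(empty_id))
```

Because the ID depends only on the content, the first dvc init creates the object and any later add of identical content maps to the same path.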

@efiop
Contributor

efiop commented May 21, 2019

@shcheklein

And looks like gitdb is not a very active project

It is just stable 🙂

do we have other operations that call git add internally? Does it make sense as a workaround to create an option that disables it? Or even catch the errors and write a warning if it's not critical?

Currently we don't. If the issue is in that particular spot, it seems feasible to just fix it. Or we could try dulwich instead, but it has problems of its own. I'm not quite sure whether that error is critical; I'm not that familiar with the internals. Of course, we could add something like a --no-add option, but my first choice would be to go ahead and fix it in gitdb.

@shcheklein
Member

We would need to support our own branch for gitdb which is not nice at all.

How about we just ignore this error? Adding files to git is not critical at all for us, and it's the easiest workaround for now.

Re the option - I was thinking more about a config option, not a CLI option.
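
The "ignore this error" workaround could look roughly like the Python sketch below. This is an assumption-laden illustration, not DVC's actual code: add_to_git_best_effort is a hypothetical helper, and the add argument stands in for whatever callable stages files (for example the scm.add call seen in the traceback earlier in this thread).

```python
import logging

logger = logging.getLogger(__name__)


def add_to_git_best_effort(add, paths):
    """Call add(paths); on an OS-level error, log a warning and return False.

    Returns True when staging succeeded. Only OSError is swallowed
    (PermissionError is a subclass of OSError); other exceptions propagate.
    """
    try:
        add(paths)
    except OSError as exc:
        logger.warning("Could not stage %s in Git: %s", paths, exc)
        return False
    return True


def failing_add(paths):
    # Simulates the failure reported in this issue.
    raise PermissionError(13, "Permission denied")


print(add_to_git_best_effort(failing_add, [".dvc/config"]))
print(add_to_git_best_effort(lambda paths: None, [".dvc/config"]))
```

The trade-off is that the user has to run git add manually when the warning appears, which matches the observation in this thread that plain git add works after the exception.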

@efiop
Contributor

efiop commented May 21, 2019

@shcheklein

We would need to support our own branch for gitdb which is not nice at all.

We could temporarily use our branch as a dependency until the fix is officially released, yes, but it is not like the project is completely dead.

How about we just ignore this error? Adding files to git is not critical at all for us, and it's the easiest workaround for now.

Sure.

@efiop efiop added bug Did we break something? help wanted p4-not-important labels Jul 23, 2019
@efiop efiop added p3-nice-to-have It should be done this or next sprint and removed p4 labels Sep 25, 2019
efiop added a commit to efiop/gitdb that referenced this issue Sep 25, 2019
Our user was experiencing issue [1] when using a git repository on NTFS mount running on Linux.
The current check checks if we are running on Windows, but it should really check if we are
on NTFS. That check is not trivial, so it is simpler and better to just always apply NTFS-specific logic.

[1] iterative/dvc#1880 (comment)
efiop added a commit to efiop/gitdb that referenced this issue Sep 25, 2019
Our user was experiencing issue [1] when using a git repository on NTFS mount
running on Linux. The current check checks if we are running on Windows, but
it should really check if we are on NTFS. And since checking fs type is not
that trivial and not efficient, it is simpler and better to just always apply
NTFS-specific logic, since it works on other filesystems as well.

[1] iterative/dvc#1880 (comment)
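
The approach described in this commit message can be sketched in Python as follows. This is not the actual patch; it simply applies the Windows branch of the gitdb snippet quoted earlier in this thread on every platform, and move_into_object_store is a hypothetical name.

```python
import os
import tempfile


def move_into_object_store(tmp_path: str, obj_path: str) -> None:
    """Place a freshly written temp file at its content-addressed path.

    If the destination already exists it holds the same content (the path is
    derived from the content hash), so the temp file is discarded instead of
    being renamed onto the existing file.
    """
    if os.path.isfile(obj_path):
        os.remove(tmp_path)
    else:
        os.rename(tmp_path, obj_path)


# Store the "same object" twice; the second call must not fail.
store_dir = tempfile.mkdtemp()
obj_path = os.path.join(store_dir, "obj")
for _ in range(2):
    fd, tmp_path = tempfile.mkstemp(dir=store_dir)
    os.write(fd, b"same content")
    os.close(fd)
    move_into_object_store(tmp_path, obj_path)

print(os.listdir(store_dir))
```

The existence check and the rename are two separate steps, so a concurrent writer could still create the destination in between; the sketch does not attempt to handle that case.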
@efiop
Contributor

efiop commented Sep 25, 2019

@villasv Sorry for such a huge delay. I've pushed a fix for gitdb to my branch. Could you install it after dvc and give it a try, please? E.g. pip install git+https://github.com/efiop/gitdb.

@iterative iterative deleted a comment from efiop Sep 26, 2019
@jorgeorpinel jorgeorpinel changed the title from "Running dvc init twice on the same repository fails" to "init: fails on 2nd run with NTFS on Linux (gitdb lib bug)" Sep 26, 2019
@jorgeorpinel
Contributor

Hi. So was the partition NTFS or NFS? Has this issue been reported to gitdb? Thanks

@efiop
Contributor

efiop commented Sep 26, 2019

@jorgeorpinel NTFS and yes gitpython-developers/gitdb#52

@villasv
Contributor Author

villasv commented Sep 27, 2019

pip install --force-reinstall git+https://github.com/efiop/gitdb
./repro.sh
+ rm -rf myrepo
+ mkdir myrepo
+ cd myrepo
+ git init
Initialized empty Git repository in /mnt/c/Users/villasv/Projects/myrepo/.git/
+ dvc init
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/tmp' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|              https://dvc.org/doc/user-guide/analytics               |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: https://dvc.org/doc
- Get help and share ideas: https://dvc.org/chat
- Star us on GitHub: https://github.com/iterative/dvc
+ rm -rf .dvc
+ git reset
+ dvc init -v
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/tmp' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.

You can now commit the changes to git.

DEBUG: Analytics is enabled.
+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|              https://dvc.org/doc/user-guide/analytics               |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: https://dvc.org/doc
- Get help and share ideas: https://dvc.org/chat
- Star us on GitHub: https://github.com/iterative/dvc

It seems it works as expected! 🎉

@efiop efiop self-assigned this Sep 27, 2019
@ghost

ghost commented Sep 27, 2019

@efiop, did you forget to close this one or is there still something to be done?

@efiop
Contributor

efiop commented Sep 28, 2019

@MrOutis Waiting for gitpython-developers/gitdb#52 to get merged.

@efiop
Contributor

efiop commented Sep 29, 2019

For the record, gitpython-developers/gitdb#52 got merged, now waiting for the new gitdb release, so we could update our requirements.

@efiop efiop closed this as completed in 449f43f Oct 4, 2019

4 participants