BadObject on repo after git.gc() #61

Closed
nova77 opened this issue Jun 1, 2012 · 6 comments

Comments

@nova77

nova77 commented Jun 1, 2012

Example code:

>>> r = Repo.init('/tmp/foo/', bare=False)
>>> print >> open('/tmp/foo/bar.txt', 'w'), 'some content'
>>> r.index.add(['bar.txt'])
>>> r.index.commit('commit bar.txt')
>>> print r.git.status() # all clear?
# On branch master
nothing to commit (working directory clean)
>>> r.git.gc()
''
>>> r.iter_commits() # kabooom!
...
BadObject: 39154e2f45f96bdbbb86aa4f1ffc72f224b88f31

Tested on GitPython 0.3.2.RC1

@Byron
Member

Byron commented Jun 7, 2012

It seems the repo didn't update its internal state after the gc, which is probably a consequence of a performance optimization in the (default) Python backend.
As a workaround, you might try the GitCmdObjectDB backend, which should cope with the issue.

Either way, this is a bug that should be addressed.

@Byron
Member

Byron commented Jun 7, 2012

With your current backend, you can also call

repo.odb.update_cache()

to read in the newly created packs.
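
As a rough sketch of that workaround, the gc call and the cache refresh can be kept together so that one is never issued without the other. This assumes `repo` is a `git.Repo` using the default caching backend discussed here; the helper name `gc_and_refresh` is hypothetical and not part of GitPython.

```python
def gc_and_refresh(repo):
    """Run `git gc`, then ask the object database to re-read its caches.

    `repo` is expected to be a git.Repo instance. The helper name is
    hypothetical; only repo.git.gc() and repo.odb.update_cache() come
    from the discussion above.
    """
    output = repo.git.gc()  # repacks loose objects into pack files
    # Assumption: a backend that does not cache may not expose
    # update_cache(), so only call it when the object database has it.
    update_cache = getattr(repo.odb, "update_cache", None)
    if callable(update_cache):
        update_cache()
    return output
```

In the example from the original report, `gc_and_refresh(r)` would take the place of `r.git.gc()`.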

@nova77
Author

nova77 commented Jun 7, 2012

Ah, that workaround is good enough for me, thanks :)

@cboylan

cboylan commented Sep 11, 2012

The repo.odb.update_cache() fix appears to work only if pack files were already present when the repo object was created. If only loose objects were present at that point, the repo object doesn't seem to pick up the new pack files when update_cache() is called.

@Byron
Member

Byron commented Sep 11, 2012

I can confirm that. Apparently, on initialization the object database adds handlers only for the storage types it finds, such as loose objects, packs, or alternate info files.
If a storage type is not present at that time, it will never look for it again. The only workaround is to recreate the repository instance.
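
A minimal sketch of that last workaround, assuming `repo` is a `git.Repo` opened with default constructor arguments (the helper name `reopen_after_gc` is made up for illustration): run the gc, then build a new instance for the same path so that it detects the storage types present now.

```python
def reopen_after_gc(repo):
    """Run `git gc` and return a freshly constructed repo object for the same path."""
    # gc may turn loose objects into packs that the old instance never looks for.
    repo.git.gc()
    # working_dir is the working tree for non-bare repositories. Assumption:
    # default constructor arguments are enough; a custom odbt or other
    # options passed to the original instance are not carried over.
    return type(repo)(repo.working_dir)
```

Typical use would be `r = reopen_after_gc(r)`. Objects obtained from the old instance still refer to it, so they are probably best looked up again through the new one.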

@Byron
Member

Byron commented Jan 8, 2015

I have to consider this an issue that can't be fixed automatically, as the only solution would be to watch all commands issued through the git command generator instance. However, as that instance doesn't have access to its parent repository, it can't watch for commands like gc and automatically update caches.

Alternatives are as follows:

  • call repo.odb.update_cache() yourself after such a call
  • initialise your repository with an ODB implementation that doesn't cache, such as GitCmdObjectDB: git.Repo('.', odbt=git.GitCmdObjectDB). This DB uses the git command to stream objects, which could also be faster depending on the task. It would probably also use fewer file handles (see #150, "Merged fd leaks fix from the master to the 0.3 version")
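
The second alternative can be written out as a small sketch. `git.Repo(..., odbt=git.GitCmdObjectDB)` is the call quoted in the list above; the wrapper name `open_uncached` and the `/tmp/foo` path in the usage note are only illustrative.

```python
def open_uncached(path="."):
    """Open a repository whose object database reads objects via the git command."""
    # Imported inside the function so the helper can be defined even where
    # GitPython is not installed; a top-level import works just as well.
    import git

    return git.Repo(path, odbt=git.GitCmdObjectDB)
```

With `r = open_uncached('/tmp/foo')`, a later `r.git.gc()` should not leave the object database looking in stale locations, since lookups go through git itself rather than through a cached view of the object store.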
