Comments (12)

newren commented on July 18, 2024

Very interesting question. It should be possible to do this kind of transition, you'd just need a way to get the necessary data out of LFS.

I actually thought this would make for a good example under contrib/ some months back and started work on it in the 'pu' branch (though I was focusing more on conversion to lfs at first), but got frustrated reading through the git-lfs api docs; some things were documented well but there were some holes. Besides, I hadn't ever used LFS myself and lacked motivation to fully learn it, so I simply punted.

But I can certainly provide a few pointers.

If you or someone else wants to do an lfs-conversion, the contrib/filter-repo-demos/insert-beginning script shows an example of how to add extra files into a commit (in particular, appending to commit.file_changes). You'd need to modify it to change the checks based on number of parents to instead check whether that commit contained any new LFS files that the (first) parent commit did not have, and if so, add it. You may also need to modify existing control files such as .gitattributes (the contrib/filter-repo-demos/lint-history might be helpful as an example of inserting a new Blob into the stream and changing an existing change.blob_id to use it), and you may need to delete other control files such as .lfsconfig (for which contrib/filter-repo-demos/clean-ignore may be a helpful example, at least the bits showing how to strip something out of commit.file_changes).
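
As a rough illustration of the last pointer (stripping something out of commit.file_changes), here is a minimal sketch of a commit callback. It is duck-typed so that it runs without git-filter-repo installed; the function name and the LFS_CONTROL_FILES set are hypothetical, and the wiring shown in the comments assumes the git_filter_repo module's FilteringOptions and RepoFilter with a commit_callback parameter, as the contrib demos use them.

```python
from types import SimpleNamespace

# Hypothetical set of control files to strip. Paths are bytes, as in git-filter-repo.
LFS_CONTROL_FILES = {b".lfsconfig"}

def drop_lfs_control_files(commit, metadata=None):
    """Remove changes that touch LFS control files from a commit-like object."""
    commit.file_changes = [
        change for change in commit.file_changes
        if change.filename not in LFS_CONTROL_FILES
    ]

# Assumed wiring with git-filter-repo (not executed here):
#   import git_filter_repo as fr
#   args = fr.FilteringOptions.parse_args(["--force"])
#   fr.RepoFilter(args, commit_callback=drop_lfs_control_files).run()

# Stand-in objects that only mimic the attributes used above.
demo = SimpleNamespace(file_changes=[
    SimpleNamespace(type=b"M", filename=b".lfsconfig"),
    SimpleNamespace(type=b"M", filename=b"src/main.c"),
])
drop_lfs_control_files(demo)
remaining = [change.filename for change in demo.file_changes]
```

Adding a file instead of removing one would append a new change to the same list, which is the part that insert-beginning demonstrates.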

As for how to get the data out of LFS, though, that I can't help with a whole lot. There are links to git-lfs API documentation in the 'pu' branch which might be helpful.
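
For anyone exploring that side, a possible starting point is recognizing LFS pointer blobs and locating the matching object on disk. This is only a sketch under assumptions: it handles the basic pointer format (version, oid, size) without extension lines, and it assumes the objects were already downloaded (for example with git lfs fetch --all) into the usual local cache layout under .git/lfs/objects. The function names are hypothetical.

```python
import re
from pathlib import Path

# Basic pointer format only; pointers with extra "ext-" lines are not matched.
POINTER_RE = re.compile(
    rb"\Aversion https://git-lfs\.github\.com/spec/v1\n"
    rb"oid sha256:(?P<oid>[0-9a-f]{64})\n"
    rb"size (?P<size>\d+)\n\Z"
)

def parse_lfs_pointer(data: bytes):
    """Return (oid, size) if data is a basic LFS pointer, else None."""
    match = POINTER_RE.match(data)
    if match is None:
        return None
    return match.group("oid").decode("ascii"), int(match.group("size"))

def local_object_path(git_dir, oid: str) -> Path:
    """Assumed local cache location of an already fetched LFS object."""
    return Path(git_dir) / "lfs" / "objects" / oid[:2] / oid[2:4] / oid
```

In a blob callback, a pointer recognized this way could in principle be swapped for the bytes read from that path, provided the file exists and its length matches the recorded size.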

If you come up with something that works and are willing to share, it'd be awesome to add something to contrib -- even if it's not general, only does one side of the lfs conversion, etc.

from git-filter-repo.

benblo commented on July 18, 2024

Turns out my assumption was wrong: git lfs migrate export --everything --include="*" does rewrite the whole history, across all branches, reinjecting all the large files' consecutive versions (see here). Awesome!
Thanks for the info anyway! So far I'm super impressed by filter-repo's speed, I'm pondering if it could be used to replace git-subtree (which for my use is really lacking).

lstrojny commented on July 18, 2024

I needed to import quite a big repository (500K commits) into LFS, and git lfs migrate import was way too slow. BFG, on the other hand, was very fast but has limitations on matching (e.g. it cannot match paths), so I looked into how to do it with git-filter-repo. Here is a working version: https://gist.github.com/lstrojny/6d29aea45179668725f43650fa46c4e7
It takes ~5 minutes for 500K commits, while git lfs migrate import would have taken hours.

Please note that my Python sucks: the script is probably way too complicated, it works exactly for my use case, and it assumes that there is a .gitattributes file in the root of the source repository. Nevertheless, I hope this will save somebody some time and give an idea of how to get started.

rconde01 commented on July 18, 2024

Here's a script based on @lstrojny's, which:

  • imports based on file size above a threshold
  • edits/creates the .gitattributes file to include migrated entries

https://gist.github.com/rconde01/ab93a0edddc5b0abf64ad4c8ac5b6ade
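
For readers who have not looked at the gists, the core of a size-based import can be sketched roughly as follows. This is not the code from either gist: the function names and the threshold parameter are hypothetical, and only the pointer text format (version line, sha256 oid, size) is taken from the git-lfs pointer specification.

```python
import hashlib

LFS_SPEC_LINE = b"version https://git-lfs.github.com/spec/v1\n"

def make_lfs_pointer(data: bytes) -> bytes:
    """Build the pointer text that would replace a large blob in history."""
    oid = hashlib.sha256(data).hexdigest().encode("ascii")
    size = str(len(data)).encode("ascii")
    return LFS_SPEC_LINE + b"oid sha256:" + oid + b"\nsize " + size + b"\n"

def should_migrate(data: bytes, threshold: int) -> bool:
    """Migrate blobs above the threshold, skipping blobs that already look like pointers."""
    return len(data) > threshold and not data.startswith(LFS_SPEC_LINE)

# Small demonstration with an in-memory payload.
payload = b"x" * 2048
pointer = make_lfs_pointer(payload)
```

A real migration would also have to store the original bytes as an LFS object so they can be pushed later; the sketch only covers the pointer that ends up in the rewritten history.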

Unfortunately there's a bug. The .gitattributes file in my repository doesn't exist at the start. So the first N changes are appending a file change to the commits where LFS migrations are introduced. Then .gitattributes is introduced in the history and then the commit is edited. It is all fine until this point, but edits to .gitattributes for future LFS migrations are lost. In a small test repository with the same structure, it works fine :( Unfortunately I can't share my real repo.

@newren Do you see anything wrong in my script? I think this is getting closer to something you could deliver (and is about 1000x faster than the official migrator).
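
One way to reason about the .gitattributes handling described above is to treat the file content as a pure function of the existing content plus the patterns migrated so far, and to recompute it whenever a commit touches the file. The sketch below is an assumption about how that could look, not a diagnosis of the bug; the function names are hypothetical, and the attribute line is the one git lfs track writes. Patterns containing spaces would need escaping, which is not handled here.

```python
def lfs_attribute_line(pattern: bytes) -> bytes:
    """Return the .gitattributes line that marks a pattern as LFS-tracked."""
    return pattern + b" filter=lfs diff=lfs merge=lfs -text\n"

def merge_gitattributes(existing: bytes, patterns) -> bytes:
    """Append missing LFS lines to existing content, keeping all original lines."""
    lines = existing.splitlines(keepends=True)
    # Make sure the last original line is terminated before appending.
    if lines and not lines[-1].endswith(b"\n"):
        lines[-1] += b"\n"
    present = set(lines)
    for pattern in patterns:
        line = lfs_attribute_line(pattern)
        if line not in present:
            lines.append(line)
            present.add(line)
    return b"".join(lines)
```

Because the function is idempotent, it can be applied both to commits where the file does not exist yet (existing is empty) and to later commits that bring their own version of the file.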

newren commented on July 18, 2024

Cool, glad you found a solution to handle your case of migrating out of LFS. I suspect filter-repo could still make things better (e.g. does git lfs update referenced sha1sums in commit messages?), and rolfb is interested in the case of migrating into LFS using filter-repo, so I'll leave this ticket open so people can see my above pointers about how writing an lfs-conversion script based on filter-repo would work.

benblo commented on July 18, 2024

Yeah, migrate export solved my immediate issue, so I moved on; this repo is such a mess that preserving sha1 in commit messages is the least of my problems :)! I have other repos where filter-repo could help solve some of those issues though, so I may be back with more questions in a few days.

newren commented on July 18, 2024

Actually, I think I'll change my mind and close this one out just to keep the issue list tidy. I've got it marked with the contrib-candidate label though to help me and others find it.

ymartin59 commented on July 18, 2024

@newren I propose re-opening this issue for the following use case: I want to migrate a Git repository that has notes on commits to LFS and add ".lfsconfig" to all commits in a single execution (and rewrite the commit hash references in the notes too).

klinki commented on July 18, 2024

@newren Hello, I created a script to do the LFS migration. Unfortunately, there are some manual steps, but it worked well enough for me.

Here is the gist: https://gist.github.com/klinki/3a314ab3e7ab680d16b5e7eb256cafbd

Currently it is just an example, and it would require a lot of polishing (and automating some of the manual steps), but it is good enough as a starting point.

rconde01 commented on July 18, 2024

I found multiple issues with the script - I'll update when I have the fixes.

oryandunn commented on July 18, 2024

@rconde01 I tried using your script (which I think was updated on or after July 11th). Like your repo, mine does not have a .gitattributes file (not at the start, nor at any time). The script runs and seems to properly move all the files it should into LFS, and I get printouts with "Added change to .gitattributes to track additional LFS files.", but .gitattributes never seems to be created in the repo. Do you have any idea what's going wrong? I've looked over your script, and nothing jumped out at me. In my case, I could probably get by with git lfs migrate, but I'd really like the replace refs to be generated, which is why I wanted to use git-filter-repo.

Edit: well, I thought I saw somewhere that the file was updated, but now I think it's your original from July 4th. Do you have the fixes for those issues you found?

exaexa commented on July 18, 2024

Hi all,

just a note about the use of git lfs migrate vs git filter-repo -- I found that git lfs migrate export for some reason rewrites the whole history, even commits that do not have to be rewritten because there are no LFS files anywhere in their history.

Obviously this makes the migration basically impossible if there are other remotes merged into the history etc. that we don't want to (or cannot) rewrite. I guess this might be a great use case for filter-repo, but I have no idea how to implement this right now; any documentation in that regard would be very welcome.
