2019-01-27 by Vladimir Schneider

Migrating a Project Repository to git-lfs

I tried using git lfs two years ago, but it only messed up my repository when I used it from IntelliJ, and I gave up. There was no migrate command, and using lfs from OS X applications required setting up environment variables for those applications, which turned out to be problematic.

It was only for the last Markdown Navigator release that I hit the 100MB file size limit, on the Photoshop file for screenshots I use in documentation. I was suddenly no longer able to push commits to GitHub.

This time around I played it safe and did not trust the tools to work. The first thing I did was create a complete copy of the project directory, with all files, as a backup of the latest changes.

Local revisions which failed to push to the server had to be deleted so the changes could be re-applied after the lfs migration. I used Tower 3 for this but you can use git on the command line to delete the latest revisions which were not pushed to GitHub (see).

I also deleted all unnecessary branches from the local repository to reduce the migration time. In my case this left only the master branch.
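
For readers without a GUI client, the two clean-up steps above can be sketched on the command line. This is a minimal sketch, not what I actually ran: the function names are made up for illustration, and it assumes the remote is called origin, the branch is master, and a backup copy of the project already exists, because both steps discard work.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions (not from the article): remote "origin",
# branch "master", and a full backup of the project directory.

# Discard local commits that were never pushed.
drop_unpushed() {
    local remote_branch="${1:-origin/master}"
    # Show the commits that are about to be thrown away.
    git log --oneline "${remote_branch}..HEAD"
    # Move the current branch and working tree back to the remote state.
    git reset --hard "${remote_branch}"
}

# Delete every local branch except the one to keep.
# The branch to keep must be the one currently checked out.
delete_other_branches() {
    local keep="${1:-master}" branch
    git for-each-ref --format='%(refname:short)' refs/heads |
    while read -r branch; do
        if [ "$branch" != "$keep" ]; then
            git branch -D "$branch"
        fi
    done
}

# Usage, inside the repository:
#   drop_unpushed origin/master
#   delete_other_branches master
```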

  1. Install git-lfs by following the official instructions.

  2. Install git lfs in the repository:

    git lfs install
    
  3. Configure which files should be under lfs for the repository. In my case I wanted to move *.psd, *.zip, *.jar, *.pdf and *.ai files to LFS so I don't need to go through the exercise later. Each pattern is a separate argument:

    git lfs track "*.psd" "*.zip" "*.jar" "*.pdf" "*.ai"
    
  4. Now the fun begins. Using lfs migrate info to see which file types took the most space failed:

    $ git lfs migrate info
    migrate: Fetching remote refs: ..., done                                                                                            
    Error in git rev-list --stdin --reverse --topo-order --do-walk --: exit status 128 fatal: bad revision '^refs/remotes/gitlab/0.1'   
    
    migrate: Sorting commits: ..., done
    

    A long search online only brought up git-lfs issues claiming that this was already resolved, which did not help. In the end what worked was:

    git lfs migrate info --everything
    

    This only confirmed that the first 3 file extensions took the bulk of the space. I was not able to migrate all 5 extensions in one command because it would fail with a Too many open files error.

    Splitting it up into two sets worked fine:

    git lfs migrate import --include-ref=master --include="*.psd,*.zip,*.jar"
    
    git lfs migrate import --include-ref=master --include="*.pdf,*.ai"
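
If you have more extensions than I did, the splitting can be scripted. The sketch below only prints the import commands, one per batch, so you can review them before running anything. The function name and the batch size are my own assumptions; the printed command line is the one used above.

```shell
#!/usr/bin/env bash
# Sketch only: print one "git lfs migrate import" command per batch of
# patterns. print_import_batches is a hypothetical helper, not part of git-lfs.

print_import_batches() {
    local size="$1"; shift
    local batch=() pattern
    for pattern in "$@"; do
        batch+=("$pattern")
        if [ "${#batch[@]}" -eq "$size" ]; then
            # Join the batch with commas inside a subshell so IFS is not leaked.
            (IFS=,; printf 'git lfs migrate import --include-ref=master --include="%s"\n' "${batch[*]}")
            batch=()
        fi
    done
    # Emit whatever is left over as a final, smaller batch.
    if [ "${#batch[@]}" -gt 0 ]; then
        (IFS=,; printf 'git lfs migrate import --include-ref=master --include="%s"\n' "${batch[*]}")
    fi
}

# Usage: quote the patterns so the shell does not expand them.
#   print_import_batches 3 '*.psd' '*.zip' '*.jar' '*.pdf' '*.ai'
```

With a batch size of 3 and my five patterns this prints the same two commands shown above.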
    

Now the repository was migrated but the fun was just starting. If you open the repo with a GUI app you will see a brand new branch which has no revisions in common with the original master branch. History was rewritten completely. Tags were moved to the new rewritten revisions. You will see that it is X000 commits ahead of and behind origin/master.
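
The same picture can be checked without a GUI. This is a small sketch with made-up function names, assuming the migrated branch is still called master and the remote is origin. The first helper reports how far apart the two branches are, the second whether they share any history at all.

```shell
#!/usr/bin/env bash
# Sketch only; ahead_behind and has_common_history are illustrative names.

# Print "<ahead> <behind>": commits only in the first ref, then commits
# only in the second ref.
ahead_behind() {
    local local_ref="${1:-master}" remote_ref="${2:-origin/master}"
    git rev-list --left-right --count "${local_ref}...${remote_ref}" | tr '\t' ' '
}

# Succeeds only if the two refs have a common ancestor.
has_common_history() {
    git merge-base "$1" "$2" >/dev/null
}

# Usage, inside the repository:
#   ahead_behind master origin/master
#   has_common_history master origin/master || echo "history was rewritten"
```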

I wanted the new lfs-migrated branch to become master, with the old branch renamed to master-nolfs and kept for a while until I feel comfortable deleting it.

  1. To do this, rename the local master to master-lfs and stop tracking the origin/master branch:

    git branch -m "master-lfs" 
    git branch --unset-upstream
    
  2. Push the new branch to origin as master-lfs to create a new remote branch. You can do this from a GUI git client or the command line.

    git push origin master-lfs
    
  3. GitHub will not allow renaming of the default branch so another branch will need to be selected as default. On the GitHub repository page, go to Settings/Branches and select another branch as default.

  4. Rename the remote master to master-nolfs. I used Tower 3 to rename the remote branch.

  5. On GitHub, change default branch to master-nolfs, to allow master-lfs to be renamed.

  6. Rename the remote master-lfs to master.

  7. On GitHub, change default branch to master.

  8. Rename the local master-lfs to master and set it to track the remote master.
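
I did the remote renames in Tower 3, but they can be sketched on the command line too. Plain git has no remote rename, so the hypothetical helper below pushes the old remote branch under the new name and then deletes the old name. It assumes the remote is called origin and that the branch being removed is not the remote's default branch at that moment, which is why the default has to be switched on GitHub first.

```shell
#!/usr/bin/env bash
# Sketch only; rename_remote_branch is an illustrative name.
# Assumes remote "origin" and that <old> is not the remote's default branch.

rename_remote_branch() {
    local old="$1" new="$2"
    # Refresh refs/remotes/origin/* so the old branch tip is known locally.
    git fetch origin &&
    # Create the new remote branch from the remote's copy of the old one.
    git push origin "refs/remotes/origin/${old}:refs/heads/${new}" &&
    # Remove the old name from the remote.
    git push origin --delete "${old}"
}

# Usage, mirroring steps 4, 6 and 8 (change the default branch on GitHub
# between the calls, as described in steps 3, 5 and 7):
#   rename_remote_branch master master-nolfs
#   rename_remote_branch master-lfs master
#   git branch -m master-lfs master
#   git fetch origin
#   git branch --set-upstream-to=origin/master master
```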

If, like me, you think you are done, you will be wrong. lfs migrate has nicely migrated all the tags to the new revisions. Meanwhile, GitHub has these same tags associated with the master-nolfs branch revisions. So trying to push tags to the master branch on GitHub will fail.

As far as I could tell, there is no way to delete tags on GitHub without deleting the local tags and pushing the deleted tags to GitHub one at a time. However, that deletes your local tags, which is not what I wanted.

  1. The solution I meandered into was to dump all refs with their revisions to a file by using:

    git show-ref >ref-list
    

    This gives you something like:

    cd88cf7ceee9f434bbfd0b2b38ddda916c3a5200 refs/heads/master
    cd88cf7ceee9f434bbfd0b2b38ddda916c3a5200 refs/remotes/origin/HEAD
    cd88cf7ceee9f434bbfd0b2b38ddda916c3a5200 refs/remotes/origin/master
    f5c079482766e337d96104533492e2564c70a8b4 refs/remotes/origin/master-nolfs
    450ded352a39cf001ad0933ccac6012f86018455 refs/tags/0.2
    ab16eb5fc4020e3093a57d6cf0c914dca95210de refs/tags/0.3
    dd9036388dad891d177e9d8ab918e04a1e63b299 refs/tags/0.4
    53d4b090a64c3675cbfb93fcbd0842a2ce769c59 refs/tags/0.5
    6b394571a558bf52944081b77076fbfafad1d227 refs/tags/0.5.1
    

    This needs to be transformed into 3 batch files: one to delete the local tags, one to push the tag deletions to GitHub, and one to re-create the tags. All batch files use only the refs/tags/ lines.

    delete-tags.sh:

    #!/usr/bin/env bash
    git tag -d 0.2
    git tag -d 0.3
    git tag -d 0.4
    git tag -d 0.5
    git tag -d 0.5.1
    

    push-tags.sh:

    #!/usr/bin/env bash
    git push origin :refs/tags/0.2
    git push origin :refs/tags/0.3
    git push origin :refs/tags/0.4
    git push origin :refs/tags/0.5
    git push origin :refs/tags/0.5.1
    

    create-tags.sh (the revision in this file is the first 7 characters of the revision from the ref-list file):

    #!/usr/bin/env bash
    git tag -a -m 0.2 0.2 450ded3
    git tag -a -m 0.3 0.3 ab16eb5
    git tag -a -m 0.4 0.4 dd90363
    git tag -a -m 0.5 0.5 53d4b09
    git tag -a -m 0.5.1 0.5.1 6b39457
    

Any way you feel comfortable transforming the input to desired output is fine. I used Sublime Text multi-caret editing but IntelliJ IDE can be used just as well.
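
If you would rather script the transformation than edit by hand, here is a minimal awk sketch. The three function names are made up. Each one reads ref-list style lines on standard input, ignores everything that is not under refs/tags/, and prints the body of the corresponding batch file.

```shell
#!/usr/bin/env bash
# Sketch only: turn `git show-ref` output ("<sha> <ref>" per line) into
# the bodies of the three batch files. Function names are illustrative.

gen_delete_tags() {
    awk '$2 ~ /^refs\/tags\// {
        tag = $2; sub(/^refs\/tags\//, "", tag)
        print "git tag -d " tag
    }'
}

gen_push_tags() {
    awk '$2 ~ /^refs\/tags\// { print "git push origin :" $2 }'
}

gen_create_tags() {
    awk '$2 ~ /^refs\/tags\// {
        tag = $2; sub(/^refs\/tags\//, "", tag)
        # First 7 characters of the revision, as in create-tags.sh.
        print "git tag -a -m " tag " " tag " " substr($1, 1, 7)
    }'
}

# Usage, inside the repository:
#   git show-ref > ref-list
#   { echo '#!/usr/bin/env bash'; gen_delete_tags < ref-list; } > delete-tags.sh
#   { echo '#!/usr/bin/env bash'; gen_push_tags   < ref-list; } > push-tags.sh
#   { echo '#!/usr/bin/env bash'; gen_create_tags < ref-list; } > create-tags.sh
```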

Now run the batch files in order and be patient. Pushing deleted tags to remote takes a while:

  1. delete-tags.sh
  2. push-tags.sh
  3. create-tags.sh

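You can double check that the tags were created correctly by dumping out the refs list again and comparing it to the one used for creating the batch files. Every tag should resolve to the same revision as before (git show-ref --dereference lists the revision behind an annotated tag on a ^{} line).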
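
The comparison can also be scripted. A tag created with git tag -a is a new tag object with its own hash, so the sketch below compares the commit each tag resolves to rather than the raw hashes. verify_tags is a made-up name, and it assumes the ref-list file from before the tags were deleted is still around.

```shell
#!/usr/bin/env bash
# Sketch only; verify_tags is an illustrative name.
# Argument: the ref-list file saved before the tags were deleted.

verify_tags() {
    local ref_list="$1" sha ref want have status=0
    while read -r sha ref; do
        case "$ref" in
            refs/tags/*)
                # Peel both sides down to a commit so that annotated and
                # lightweight tags compare equal when they mark the same commit.
                want=$(git rev-parse --verify --quiet "${sha}^{commit}")
                have=$(git rev-parse --verify --quiet "${ref}^{commit}")
                if [ -z "$have" ] || [ "$want" != "$have" ]; then
                    echo "mismatch: ${ref}"
                    status=1
                fi
                ;;
        esac
    done < "$ref_list"
    return "$status"
}

# Usage, inside the repository:
#   verify_tags ref-list && echo "all tags match"
```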

  1. The local tags can be pushed to the remote using the git command line:

    git push --tags
    
  2. Not done yet, as I discovered when trying to apply changes from the copy of my project directory with the latest changes, using BeyondCompare 4. All the lfs tracked files are a hundred or so bytes because they are pointers to the actual files.

    An lfs checkout needs to be performed. Not being sure how to do it from the command line (read: having tried and failed), I used the SourceTree git client, which detects this condition and offers to perform the initial checkout for you.
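
For what it is worth, a pointer file is easy to recognise: it is a short text file whose first line names the LFS spec version. The sketch below uses a made-up helper to test for that. The two git lfs subcommands in the usage comments are real, but I have not verified that they would have worked in my situation, so treat them as something to try rather than a confirmed fix.

```shell
#!/usr/bin/env bash
# Sketch only; is_lfs_pointer is an illustrative name.

# Succeeds if the file is still an LFS pointer rather than real content.
is_lfs_pointer() {
    head -n 1 "$1" 2>/dev/null |
        grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# Usage, inside the repository:
#   for f in *.psd; do is_lfs_pointer "$f" && echo "still a pointer: $f"; done
#
# Commands to try when pointers are found:
#   git lfs checkout    # fill the working copy from the local LFS cache
#   git lfs pull        # download missing objects, then check them out
```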

After this step the local repository is ready to have the latest changes applied and the usual work procedure restored.

Overall, it was not the best way to spend my Sunday but at least git-lfs migration is out of the way.

I wish you a smooth git-lfs migration. If you have a better way of doing it I would be glad to hear it. I have a few projects to migrate and all this git meat-grinding is not something I am looking forward to.

 



Constructive comments and suggestions are welcome. I can also be reached at vladimir@vladsch.com