
Introduce git LFS ?

Posted: Sat Apr 11, 2020 1:05 pm
by mrmclovin
What's your opinion on introducing Git Large File Storage (https://git-lfs.github.com/)?

It would help minimize repo size and improve performance when working with git, since git would no longer have to track chunks of large media files. I think GitHub supports it as well, which should make it really easy to integrate.

It could also be really beneficial for the documentation manual: images, tutorial media files, etc. could be added easily and kept under version control, cheaply and without cluttering the repo with heavyweight files.

It would probably add a tooling dependency, though, since each developer would need to have the extension installed.
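For context on the tooling side: enabling LFS mostly amounts to installing the extension and tracking file patterns; running `git lfs track "*.png"` writes entries like the following into `.gitattributes` (the patterns below are just illustrative examples, not a concrete proposal):

```text
*.png  filter=lfs diff=lfs merge=lfs -text
*.dds  filter=lfs diff=lfs merge=lfs -text
*.mesh filter=lfs diff=lfs merge=lfs -text
```

Any file matching these patterns is then stored as a small pointer in git, with the actual content uploaded to the LFS server on push.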

WDYT?

Re: Introduce git LFS ?

Posted: Sat Apr 11, 2020 6:36 pm
by dark_sylinc
I have experience with Git LFS, which is why I'd be against it for the Ogre project:
  1. Ogre doesn't have many big files (e.g. above 0.5-1 MB), and those files rarely change or get deleted. LFS makes sense when you're modifying a 25 MB file across multiple commits: cloning the project from scratch then means downloading roughly the file size × number_commits_modifying_that_file, whereas with LFS you only download about 25 MB, because you fetch just the latest version. However, if you push a file in one commit and never change or remove it again, LFS is pointless.
  2. The documentation consists of many small data files rather than big ones, which reduces the effectiveness of LFS.
  3. In the specific case of the documentation (GitHub Pages / gh-pages branch), when we get a big change (e.g. because Doxygen was updated and changed something everywhere), instead of performing another commit we force-push a single commit that replaces everything. That achieves something similar to LFS (only the latest version is downloaded, since we get rid of the older versions).
  4. In the case of ogre-next, it doesn't work with hg-git.
  5. It introduces a few occasional technical issues:
    1. Files not being committed as LFS by a misconfigured client, so they always appear dirty until they're recommitted as LFS. By then the file has already been added to the repo as a regular blob, so completely fixing it would mean rewriting history and force-pushing every commit since the file was introduced as non-LFS. If you don't rewrite history and for some reason you check out an older commit where this glitch happened, abandoning the detached HEAD state is a PITA, because git always thinks the repo is dirty even after reset --hard and refuses to continue because 'your changes will be overwritten' by the checkout. This also makes git bisect a billion times harder to use.
    2. This one affects advanced users and Ogre devs: pulling from multiple repo clones (e.g. build farms, multiple branches) always requires an active internet connection, because LFS will try to redownload the LFS files from the server even though you already have them locally. You can resort to dirty tricks like copying the .git folder, but that's error-prone (e.g. deleting the wrong .git folder) or not viable (when the repos are forks). If the repo is private, then credentials must be set up on all machines.
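A rough back-of-envelope for point 1 above (the numbers are purely illustrative, not measurements from the Ogre repo):

```python
# Illustrative comparison: full-history clone vs. LFS latest-only download
# for a single 25 MB binary file modified across several commits.
# (Binary files delta-compress poorly, so plain git stores roughly a full
# copy per modified version.)
file_size_mb = 25
commits_modifying_file = 8  # hypothetical number of commits touching the file

plain_git_download_mb = file_size_mb * commits_modifying_file  # every version
lfs_download_mb = file_size_mb                                 # latest only

print(plain_git_download_mb, lfs_download_mb)  # 200 25
```

If the file is committed once and never touched again, both columns are equal and LFS buys nothing, which is the point being made.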
In the case of samples (bigger, better-looking projects that need big asset files, e.g. many 4096x4096 textures or multi-megabyte mesh files), using LFS does make sense. But even then it may be better to consider moving those samples into a separate git repo (which would use LFS) and linking it to the main Ogre repo with git submodules.
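The submodule idea above can be sketched as follows. The repo name "ogre-samples-media" is hypothetical, and local paths stand in for real remotes so the commands are self-contained:

```shell
set -e
work=$(mktemp -d)

# Stand-in for the separate (LFS-backed) media repository.
git init -q "$work/ogre-samples-media"
git -C "$work/ogre-samples-media" -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "Initial media commit"

# The main repo pins a specific media commit via a submodule entry.
git init -q "$work/ogre"
git -C "$work/ogre" -c protocol.file.allow=always \
    submodule add "$work/ogre-samples-media" Samples/Media
git -C "$work/ogre" -c user.email=dev@example.com -c user.name=dev \
    commit -q -m "Link samples media as a submodule"
```

Developers who don't need the heavy assets could then simply skip `git submodule update --init`, keeping the main clone light.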

Thus, IMO, LFS would bring little benefit for us (considering our use cases, historically) while adding annoyances that wouldn't be worth it.

Cheers
Matias

Re: Introduce git LFS ?

Posted: Sat Apr 11, 2020 8:45 pm
by mrmclovin
Thank you, those are very good points. I hadn't really considered that the bigger files don't change much.

Given those arguments, I agree; I don't think this is a big issue now :P