▲ | tempay 5 days ago | |
> binary chunking and deduplication

Are there many binaries that people would store in git where this would actually help? I assume most files end up with compression or some other form of randomization between revisions, making deduplication futile.
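A rough way to check that assumption is to compare how many fixed-size blocks two revisions share before and after compression. The sketch below is only an illustration: the sample data, the 64-byte `BLOCK` size, and the helper names `block_ids` and `shared_fraction` are made up for the demo, and zlib stands in for whatever compression a real file format might use.

```python
import hashlib
import zlib

BLOCK = 64  # hypothetical fixed block size, chosen only for this demo


def block_ids(data: bytes) -> set:
    """Hash every fixed-size block of data."""
    return {hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)}


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of old's distinct blocks that also appear in new."""
    a, b = block_ids(old), block_ids(new)
    return len(a & b) / len(a)


# Compressible sample data; the second revision overwrites a single byte.
original = "".join(f"record {i}: value={i * i}\n" for i in range(5000)).encode()
edited = original[:100] + b"#" + original[101:]

# Uncompressed, only the block containing the edit changes.
raw_shared = shared_fraction(original, edited)

# Compressed, the edit can change the encoding of everything after it.
compressed_shared = shared_fraction(zlib.compress(original), zlib.compress(edited))

print(f"raw: {raw_shared:.3f}  compressed: {compressed_shared:.3f}")
```

The uncompressed revisions differ in exactly one block, while the compressed streams share a smaller fraction of theirs. How much smaller depends on the compressor and the data, so this shows the effect rather than measuring it for real repositories.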
▲ | adastra22 5 days ago | parent | next [-] | |
A lot in the game and visual art industries.
▲ | digikata 5 days ago | parent | prev | next [-] | |
I don't know; which optimization strategy works better than another depends on the probabilities in the dataset. git-annex, IIRC, does file-level dedupe. That would take care of most of the problem if you're storing binaries that are compressed or encrypted. Going beyond that is a lot of work, which is probably one reason no one has bothered to do it for git yet. But borg and restic both do chunked dedupe, I think.
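To make the file-level versus chunked distinction concrete, here is a minimal Python sketch of content-defined chunking. It is not how borg, restic, or git-annex implement it. The gear-style rolling hash, the table `GEAR`, the parameters `MASK` and `MIN_SIZE`, and the helpers `chunks` and `chunk_ids` are all assumptions made up for this illustration.

```python
import hashlib
import random

# Hypothetical parameters, chosen only to keep the demo small.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]  # per-byte random values
MASK = (1 << 6) - 1   # cut when the low 6 bits of the hash are zero
MIN_SIZE = 16         # never emit a chunk shorter than this (except the last)


def chunks(data: bytes):
    """Split data at boundaries chosen by a rolling hash of recent bytes."""
    start, h = 0, 0
    for i, b in enumerate(data):
        # Older bytes are shifted out, so the low bits depend only on recent input.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if i + 1 - start >= MIN_SIZE and (h & MASK) == 0:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]


def chunk_ids(data: bytes) -> set:
    """Content hashes of all chunks; a dedupe store would key on these."""
    return {hashlib.sha256(c).hexdigest() for c in chunks(data)}


# Two revisions of an incompressible "binary": the second has 10 bytes prepended.
_data_rng = random.Random(1)
original = bytes(_data_rng.getrandbits(8) for _ in range(20000))
edited = b"ten bytes!" + original

# File-level dedupe compares whole-file hashes, so the edit defeats it.
same_file = hashlib.sha256(original).digest() == hashlib.sha256(edited).digest()

# Chunk-level dedupe only needs to store chunks it has not seen before.
old_ids, new_ids = chunk_ids(original), chunk_ids(edited)
shared = len(old_ids & new_ids) / len(old_ids)

print(f"whole files identical: {same_file}")
print(f"fraction of original chunks reused: {shared:.2f}")
```

Because boundaries depend on the bytes themselves rather than on absolute offsets, the chunking re-synchronizes shortly after the inserted bytes and most chunks keep the same hash. With fixed-offset blocks, the same insertion would shift every block.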
▲ | zigzag312 5 days ago | parent | prev | next [-] | |
2-3x reduction in repository size compared to Git LFS in this test: https://xethub.com/blog/benchmarking-the-modern-development-...
▲ | hinkley 5 days ago | parent | prev [-] | |
It would likely require more tooling.