Dylan16807 16 hours ago:

> such that the files sort differently

But if you change them without making them sort differently, everything is fine. He depends on the order, not the filenames. You could even remove the filenames entirely, as long as you patch the code to account for such a strange environment.

mafuy 5 hours ago:

Not really a good point. If the order of bytes does not matter, then I can compress any file of your liking to O(log n) size :P
|
|
hombre_fatal 16 hours ago:

Not in any significant way. The decompressor could just as well be changed to require you to feed the files to it in the correct order, or to expect some other sorting. What you're saying is like claiming you encoded information in the filenames because decompress.sh expects a file named "compressed.dat" to exist. It doesn't describe any meaningful part of the scheme.
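
For illustration, a minimal sketch of that alternative (not from the article; the split byte '5' and every name here are assumptions taken from this thread): a decompressor whose ordering comes from the caller's argument list rather than from anything stored in the filenames.

    import sys

    SEP = b"5"  # assumed split byte, per the variant discussed downthread

    def decompress(ordered_parts: list[bytes]) -> bytes:
        # The caller supplies the parts already in the right order;
        # one SEP byte is re-inserted at every join.
        return SEP.join(ordered_parts)

    if __name__ == "__main__":
        # Order comes from the argument list, not from what the files are named.
        parts = [open(path, "rb").read() for path in sys.argv[1:]]
        sys.stdout.buffer.write(decompress(parts))

Here the order on the command line, not the filenames themselves, is what carries the information.
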
anamexis 16 hours ago:

The filenames contain information that you need, in some form, for the scheme to work. You are combining different parts and inserting a missing byte every time you join two of them. You need to combine the parts in the correct order, and that order is part of the information that makes this work. If the ordering isn't coming from the filenames, it has to come from somewhere else.
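
Concretely, a toy version of the scheme as this subthread describes it (the split byte '5', the part_ prefix, and the zero-padded counter are all assumptions for the sketch), in which the filenames are the only place the original order survives:

    import glob

    SEP = b"5"  # assumed split byte

    def compress_to_files(data: bytes, prefix: str = "part_") -> None:
        # Every removed SEP byte becomes a file boundary; the zero-padded
        # index in the filename is the only record of the original order.
        for i, chunk in enumerate(data.split(SEP)):
            with open(f"{prefix}{i:06d}", "wb") as f:
                f.write(chunk)

    def decompress_from_files(prefix: str = "part_") -> bytes:
        # A lexicographic sort of the filenames recovers the order, and one
        # SEP byte is re-inserted at every join.
        names = sorted(glob.glob(f"{prefix}*"))
        return SEP.join(open(name, "rb").read() for name in names)

Rename the parts so they sort differently and decompress_from_files reassembles them in the wrong order, which is exactly the point being made here.
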
mhandley 16 hours ago:

You could do the same splitting trick but only split at occurrences of the character '5' that give progressively increasing file lengths. The "compression" would be worse, so you'd need a larger starting file, but you could still satisfy the requirements this way and be independent of the filenames. The decompressor would just sort the files by increasing length before merging.
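
A rough sketch of that variant, under the same assumptions as the snippets above: each part is forced to be strictly longer than the one before it, so sorting by length alone recovers the order and the filenames carry nothing.

    SEP = ord("5")

    def split_by_increasing_length(data: bytes) -> list[bytes]:
        parts, prev_len, start = [], -1, 0
        while True:
            # Only accept a '5' that makes this part strictly longer than the
            # previous one, so the lengths themselves encode the order.
            cut = data.find(bytes([SEP]), start + prev_len + 1)
            if cut == -1:
                break
            part, rest = data[start:cut], data[cut + 1:]
            if len(rest) <= len(part):
                break  # the remainder must end up longest, so stop splitting here
            parts.append(part)
            prev_len, start = len(part), cut + 1
        parts.append(data[start:])  # final remainder, longest by construction
        return parts

    def merge_by_length(parts: list[bytes]) -> bytes:
        # No filenames involved: sorting by length recovers the order,
        # and one '5' is re-inserted at every join.
        return bytes([SEP]).join(sorted(parts, key=len))

A quick round-trip check: merge_by_length(split_by_increasing_length(data)) == data for any bytes input, at the cost of the quadratic blow-up discussed next.
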
gus_massa 5 hours ago:

Nice idea, but doesn't this require a linear increase in the length of the partial files, and hence a quadratic size for the original file? If the length of one file is X, then in the next file you must skip the first X characters and look for a "5", which on average sits around position X+128. So the average length of the Nth file is 128*N, and to remove C bytes the original file would need to be ~128*C^2/2 (instead of the linear 128*C in the article).
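
Spelling out the sum behind that estimate (taking the 128-byte average gap as given): if the Nth part averages 128*N bytes, then shaving off C bytes takes roughly

    \sum_{N=1}^{C} 128\,N \;=\; 128\cdot\frac{C(C+1)}{2} \;\approx\; \frac{128\,C^2}{2}

bytes of input, versus roughly 128*C for the filename-based scheme in the article.
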
mhandley 30 minutes ago:

Yes, I think it is quadratic. I don't claim it's practical (the original isn't practical either though), but just that the dependency on filenames isn't fundamental.

anamexis 15 hours ago:

That's a neat idea.
|
|
|