Remix.run Logo
anamexis a year ago

The filenames contain information that you need in some way for the scheme to work.

You are combining different parts and inserting a missing byte every time you combine the files. You need to combine the parts in the correct order, and the order is part of the information that makes this work.

If the ordering isn't coming from filenames, it needs to come from somewhere else.

mhandley a year ago | parent [-]

You could do the same spitting trick but only split at progressively increasing file lengths at the character '5'. The "compression" would be worse, so you'd need a larger starting file, but you could still satisfy the requirements this way and be independent of the filenames. The decompressor would just sort the files by increasing length before merging.

gus_massa a year ago | parent | next [-]

Nice idea, but doesn't this require a linear increase of the length of the partial files and a quadratic size of the original file?

If the length of a file is X, then in the next file you must skip the first X characters and look for a "5" that in average is in the X+128 position. So the average length of the Nth file is 128*N and if you want to reduce C bytes the size of the original file should be ~128C^2/2 (instead of the linear 128*C in the article).

Dylan16807 a year ago | parent | next [-]

It does, but that's fine. He only needs to save 150-200 bytes and the file is 3MB. 128*200²/2 is 2.5MB Though I think it would be 256* here? Still, it can be made to work.

mhandley a year ago | parent | prev [-]

Yes, I think it is quadratic. I don't claim it's practical (the original isn't practical either though), but just that the dependency on filenames isn't fundamental.

gus_massa a year ago | parent [-]

> I don't claim it's practical

Don"t interpret my comment as a complain, I was just triying to understand.

This is reposted from time to time, but I don't remember someome proposing this trick before. It's a nice idea.

anamexis a year ago | parent | prev [-]

That's a neat idea.