Remix.run Logo
kittoes 2 days ago

What if we didn't even attempt to compress the file? Theoretically, there is a random number generator + seed combination that would output the same bytes as the source file.

changoplatanero 2 days ago | parent [-]

Neat idea but chances are the length of the seed is equal to the length of the original file.

guepe 2 days ago | parent | next [-]

There is a polynomial expression that will match the file. I wonder if expressing it would be compressible to a lower size than original file? [edit] I’m wrong - not all sequences can be expressed with a polynomial.

l33t7332273 2 days ago | parent [-]

This reminds me of a data compression scheme I came up with once:

Treat an n bit file as a polynomial over the finite field with characteristic 2. Now, there are some irreducible polynomials in this field, but many polynomials have factors of x and of (x+1). Factor the polynomial into P(x)x^n (x+1)^m. and just collect these terms, storing only P, n, and m.

crazygringo 20 hours ago | parent | prev [-]

That's not how seeds work. Seeds are tiny.

Actually this would work perfectly if you knew it was generated in a single pass by a known random number generator and you had tons of time to brute force it.

If the file were generated by a natural source of entropy then forget it.

Or even if modified in a trivial way like adding 1 to every byte.

crazygringo 15 hours ago | parent [-]

What is with the many downvotes but no comments? Everything I said is factual.

Seeds are something like 32 bits, though it depends on the exact implementation. But not the length of files.

iforgotpassword 10 hours ago | parent [-]

When implementing a PRNG, you can make its seed as big as you want. There is no mathematical law that dictates or limits the size of a seed.

gus_massa 5 hours ago | parent | next [-]

But I assume the GGP assumes that the author is lazy and used a public available PRNG instead of a custom made. (A long time ago someone broke the login security check in HN using a trick like that. Obviously, it's already fixed.)

crazygringo 5 hours ago | parent | prev [-]

I mean sure you could in theory, but in practice that's not how common built-in random number generators work.

I was responding to:

> chances are the length of the seed is equal to the length of the original file

And why would the chances be that? You'd really have to go out of your way for that. I don't even know if there are libraries that can handle a seed and state length on the scale of megabytes. No, chances are 99.99+% it used a seed of a few bytes, because that's how common random number generators designed for efficiency work.