▲ | SideQuark 15 hours ago | |
Your idea fails pretty easily. Simply do the math, and you’ll find it won’t work. One argument: if your method worked, we could compress nearly any file, which violates the pigeonhole principle. Another: for a 4 GB file you’d need a 4-byte integer for where the first copy starts and another for where the second starts, for 8 bytes. Then you need a length byte. So you need a repeated sequence of 10+ bytes before this compresses anything. There are 256^10 = 2^80 possible 10-byte sequences, and only ~2^32 of them occur in a 4 GB file. So the odds that any given sequence shows up again are around 1 in 2^48, and even counting all ~2^63 pairs of positions, the expected number of repeats in random data is only about 2^-17. Tweak the method and the estimates as you wish. It fails. |
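If you want to tweak the numbers yourself, here is a rough sketch of the estimate in Python. It assumes the file is uniformly random bytes and that a back-reference costs two 4-byte offsets plus one length byte; the function names are made up for illustration, not taken from any library.

```python
from math import log2

FILE_BYTES = 2**32        # 4 GB file
OVERHEAD = 4 + 4 + 1      # two 4-byte offsets plus one length byte
SEQ_LEN = OVERHEAD + 1    # shortest repeat that saves anything: 10 bytes


def log2_per_window_odds(file_bytes: int, seq_len: int) -> float:
    """log2 of the chance that one given window matches some other window."""
    windows = file_bytes - seq_len + 1
    # Each other window matches with probability 256**-seq_len = 2**(-8 * seq_len).
    return log2(windows - 1) - 8 * seq_len


def log2_expected_repeats(file_bytes: int, seq_len: int) -> float:
    """log2 of the expected number of matching window pairs (birthday-style count)."""
    windows = file_bytes - seq_len + 1
    pairs = windows * (windows - 1) // 2
    return log2(pairs) - 8 * seq_len


if __name__ == "__main__":
    print(f"per-window odds:  2^{log2_per_window_odds(FILE_BYTES, SEQ_LEN):.1f}")
    print(f"expected repeats: 2^{log2_expected_repeats(FILE_BYTES, SEQ_LEN):.1f}")
```

With these assumptions the two figures come out at roughly 2^-48 and 2^-17. Changing `FILE_BYTES` or the overhead shifts the exponents, but not by enough to make a useful repeat likely in random data.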