Overfitted a 900KB Transformer to Compress a 100MB CSV into 7MB
8 points by spidy__ 3 days ago | 10 comments
I built an experiment that uses an overfitted transformer and arithmetic coding to compress individual files. Instead of training the model to generalize, I train a 900KB transformer to memorize a single file and predict the next byte. Those predictions are fed into an arithmetic coder to produce the compressed output. On a 100MB NYC taxi CSV, it compresses to about 7MB (~0.5 bits/byte). On a 100MB slice of enwik9, it compresses to about 21MB (~1.68 bits/byte). It's pretty slow right now (roughly 20–30 minutes of training and 45 minutes each for compression and decompression on my AMD 7800XT). Check out the repo: https://github.com/samyak112/pym-particles
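For anyone who wants to sanity-check the bits/byte figures or see why better next-byte predictions shrink the output, here is a rough sketch. It is not the code from the repo: the transformer is swapped for a trivial adaptive byte-frequency model (my assumption, purely for illustration), and the function names are hypothetical. An arithmetic coder spends close to -log2(p) bits on a byte the model predicted with probability p, so summing that quantity approximates the compressed size.

```python
import math


def bits_per_byte(compressed_bytes: int, original_bytes: int) -> float:
    """Compressed size expressed as bits spent per input byte."""
    return 8.0 * compressed_bytes / original_bytes


def ideal_code_length_bits(data: bytes) -> float:
    """Approximate arithmetic-coded size of `data` in bits.

    Stand-in predictor (an assumption, not the post's transformer):
    an adaptive order-0 model that starts with a count of 1 for every
    byte value and updates after each byte. A decoder can rebuild the
    same counts, so the predictions are reproducible on both sides.
    """
    counts = [1] * 256
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total      # model's probability for the actual byte
        bits += -math.log2(p)      # ideal arithmetic-coding cost of that byte
        counts[b] += 1             # adapt after coding, same as the decoder would
        total += 1
    return bits


if __name__ == "__main__":
    print(bits_per_byte(21_000_000, 100_000_000))  # enwik9 slice from the post
    print(bits_per_byte(7_000_000, 100_000_000))   # taxi CSV from the post
    sample = b"a,b,c\n" * 1000
    print(ideal_code_length_bits(sample) / len(sample))
```

Taking the sizes as decimal megabytes, 21MB out of 100MB works out to 1.68 bits/byte, and 7MB out of 100MB to 0.56, which fits "about 7MB" and "~0.5" both being rounded. A stronger predictor pushes p closer to 1 for each byte, which is the whole point of overfitting the model to the file.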
tae0086 a day ago
Neat approach. Since the 900KB model ships with the compressed file, is there a file size below which the model overhead just eats the gains? Curious where the crossover is.
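A back-of-the-envelope way to frame the crossover, as a sketch only. It assumes the 900KB model is stored on top of the coded payload (the post does not say whether the ~7MB figure already includes it), that the bits/byte rate stays the same on smaller files (it probably would not), and that KB means 1000 bytes. The 2.0 bits/byte baseline is a made-up comparison point, not a measurement, and `break_even_size` is a hypothetical helper.

```python
def break_even_size(model_bytes: float, model_bpb: float, baseline_bpb: float) -> float:
    """Input size in bytes above which model + payload beats the baseline.

    With the model:  model_bytes + n * model_bpb / 8
    Baseline only:   n * baseline_bpb / 8
    Setting the two equal and solving for n gives the crossover.
    """
    if model_bpb >= baseline_bpb:
        return float("inf")  # the model never pays for itself
    return 8.0 * model_bytes / (baseline_bpb - model_bpb)


if __name__ == "__main__":
    # Versus storing the file raw (8 bits/byte), at the taxi-CSV rate of 0.56.
    print(break_even_size(900_000, 0.56, 8.0))
    # Versus a hypothetical general-purpose compressor at 2.0 bits/byte.
    print(break_even_size(900_000, 0.56, 2.0))
```

Under those assumptions the crossover is a bit under 1MB against the raw file and around 5MB against the hypothetical 2.0 bits/byte baseline. The closer the baseline gets to the model's rate, the larger the file has to be.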
7373737373 2 days ago
What does it compress the full 1GB file to? http://prize.hutter1.net/
purple-leafy 2 days ago
That’s so awesome! I want to try something similar. I’ve been going crazy with compression work. I reckon I can beat that prize link.