calebkaiser 5 hours ago
Nah, optical compression is a thing. You see it in a lot of different areas in ML. In this case, the "trick" has been known for a while, and belongs to a whole world of compression research. But I think where you're maybe getting mixed up is in where that 60% gain is coming from. It's not a 60% reduction in cost for 100% of the same output. If you have a model and input text A, fix the seed etc., and run text A through the model once as text tokens and once as compressed image tokens, you will not get identical outputs. You're specifically reducing the number of tokens needed to represent your input, which saves you on raw compute, but also by definition gives you less room to represent the information in your input. It's lossy, in other words. Put another way, if you're using a model like Fable because you need the absolute frontier of capability and cheaper models cannot solve your tasks, then there is a very real chance that a compression strategy like this drops Fable's accuracy to the point that it's no longer suitable for your task. Which defeats the point of paying for the most expensive model in the first place. So, it's cool research. Might be useful for some people. Probably isn't something that has incredible utility in real use cases.
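To make the "fewer slots means less room" point concrete, here is a toy sketch. It is only an analogy: mean-pooling a list of numbers stands in for the real visual encoder, which works very differently, and the function names `pool` and `unpool` and the sample values are made up for illustration. It squeezes a sequence into fewer slots and shows that expanding it back does not recover the original.

```python
# Toy analogy only: mean-pooling stands in for the real image-token encoder.
# `pool`, `unpool`, and the sample data are hypothetical, for illustration.

def pool(values, group):
    """Represent each run of up to `group` values by its mean (fewer slots)."""
    chunks = [values[i:i + group] for i in range(0, len(values), group)]
    return [sum(chunk) / len(chunk) for chunk in chunks]

def unpool(pooled, group, length):
    """Best-effort expansion back to the original length."""
    expanded = [v for v in pooled for _ in range(group)]
    return expanded[:length]

original = [1.0, 5.0, 2.0, 8.0, 3.0, 3.0, 9.0, 0.0, 4.0, 6.0]
compressed = pool(original, group=3)
restored = unpool(compressed, group=3, length=len(original))

# Fraction of slots saved, and the worst per-element reconstruction error.
saving = 1 - len(compressed) / len(original)
error = max(abs(a - b) for a, b in zip(original, restored))

print(f"slots: {len(original)} -> {len(compressed)} ({saving:.0%} fewer)")
print(f"max reconstruction error: {error:.2f}")
```

The slot count drops, but the restored sequence differs from the original, which is the same trade the comment describes: cheaper to process, less room for the information.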
rightbyte 5 hours ago
> a compression strategy

To me, compression implies smaller size? However, newline characters seem to be removed in the pic, so I guess it could be expressed in fewer bytes than the original text with further compression ...