If the trick were genuinely useful and had been widely circulated months ago, the resource-starved inference providers would have squeezed it dry already, instead of wasting 60% of their tokens while waiting for users to implement it themselves with 5 minutes of effort.

▲

Klathmon 3 hours ago | parent | next [-]

That's like saying quantization isn't real because the frontier labs aren't using it in their production inference.

This is a lossy process; it produces worse results. It might be worth it in some situations, but applying it to everything would just make your SOTA model worse.

▲

ptx 2 hours ago | parent [-]

Isn't this just quantization with extra steps? Can converting the text to an image really be a better way to lossily compress it? (Not that I have any idea what I'm talking about on this topic.)

▲

Klathmon an hour ago | parent [-]

I also have no idea what I'm talking about, but to me this seems closer to the "caveman mode" that some people use to compress info into fewer tokens. Going through the image tokenizer allows you to leave the source text untouched while still gaining (some of?) the benefits.

▲

solenoid0937 5 hours ago | parent | prev [-]

[flagged]