vineyardmike 6 hours ago

> Some random person discovered a 60% across the board gain in all LLMs, using an extremely simple trick that none of the labs noticed in all these years of multi-trillion dollar growth

DeepSeek published a pretty well-circulated paper on exactly this many months ago. It just hasn’t been attempted and shared publicly as a retrofit, AFAIK.

Also, it’s no free lunch: the readme indicates that this “use images” hack is lossy and reduces success rates alongside the reduced cost. Most labs would prioritize higher success rates regardless of price.

geor9e 6 hours ago

If the trick were genuinely useful and had been well circulated months ago, the resource-starved inference providers would have squeezed it dry already, instead of wasting 60% of their tokens while waiting for users to implement it themselves with 5 minutes of effort.

Klathmon 3 hours ago

That's like saying quantization isn't real because the frontier labs aren't using it in their production inference.

This is a lossy process; it produces worse results. It might be worth it in some situations, but applying it to everything would just make your SOTA model worse.

ptx 2 hours ago

Isn't this just quantization with extra steps? Can converting the text to an image really be a better way to lossily compress it? (Not that I have any idea what I'm talking about on this topic.)

Klathmon an hour ago

I also have no idea what I'm talking about, but to me this seems closer to the "caveman mode" that some people use to compress info into fewer tokens. Going through the image tokenizer lets you leave the source text untouched while still gaining (some of?) the benefits.

solenoid0937 5 hours ago | parent | prev [-]

[flagged]