thethirdone 12 hours ago

> the ratio remains approximately 914x over TurboQuant, with compression improving rather than degrading as context length grows.

This line from the abstract made me really suspicious. Obviously a compression scheme that incorporates the entire sequence shouldn't get worse relative to a per-element one as the length increases.

It is important to note that this paper is PURELY theoretical. I couldn't find much meat on the bone from a quick skim.

The single author, Gregory Magarshak, has published only one paper on arXiv before and appears to be a professor of business / music. I don't plan to give it a closer read in hopes of finding something of value.

stingraycharles 12 hours ago | parent | next [-]

Me neither. There are no actual experiments or data, no peer review, and the claimed innovation rests almost entirely on citations of the author's own prior paper.

The author is not an ML researcher but rather an AI startup CTO / founder. Previously worked on “social operating systems” for the web, blockchain of course. And now an AI innovator. I’m suspicious. This was part of the author’s reply in another thread:

> When TurboQuant came out, I realized we can also go way below the Shannon limit in the same way, and take advantage of PLT. In fact, I'm working on publishing a paper that generalizes this to robotics (which needs to do cheap fast on-board inference "in the field"). I also believe this is how animals actually learn. In other words, over time they learn overall "sequences" of actions and then can check whether they are "good enough" to solve the problem, or whether to switch to a full analysis -- this corresponds to System 1 and 2 of Daniel Kahneman's "Thinking Fast and Slow".

Which doesn’t exactly inspire confidence, and makes me wonder who they think their audience is: ML researchers, or LinkedIn?

gaze 12 hours ago | parent | prev | next [-]

The irritating thing about LLM-generated papers like these is that they're wrong, but they're generated by LLMs capable enough to bury the absurd claim pretty deep in there.

stingraycharles 12 hours ago | parent | next [-]

Analyze it using an LLM. Claude was pretty ruthless about this one.

thethirdone 12 hours ago | parent | next [-]

Yeah, for me Claude flagged the phrase "this holds with probability 1 over random weight matrices since the null space has dimension"

Treating trained weights as random matrices for the purposes of a proof immediately discredits a paper for me.

EGreg 11 hours ago | parent [-]

"This holds for almost all matrices" is actually something you'd want to know if we're talking about probabilities, no?

gaze 12 hours ago | parent | prev [-]

Sure, but it seems spiritually wrong to use an LLM to debug a slop paper. Who knows, maybe Claude generated it in the first place?

Hard_Space 5 hours ago | parent | prev [-]

In fairness, real-world researchers are expert at selective emphasis too.

EGreg 12 hours ago | parent | prev [-]

You're right, I'm not a well-known researcher, simply an entrepreneur who started to publish academic papers.

However, I do have a long history of diving deep into fields and building practical, open-source solutions to major problems I perceive in the fields.

15 years ago I started with social networks and PHP: https://github.com/Qbix http://laweekly.com/restoring-healthy-communities/

8 years ago I got into smart contracts on EVM, which was the SOTA at the time: https://github.com/Intercoin https://intercoin.org/applications

About a year and a half ago I started teaching a course on AI at a university not far from NYU where I studied... and that's what got me into this: https://vimeo.com/1063008765/c7ef3abcc5

I try to document everything on GitHub and in popular articles, but only recently started publishing academic papers on arXiv, and I plan to start submitting them to real publications. As I build, I've realized I should publish any novel theoretical results that underpin my work.

I plan to publish actual code in a few weeks. To be fair, TurboQuant is also a purely theoretical paper. I just wanted to get this out and share.

thethirdone 11 hours ago | parent | next [-]

> To be fair, TurboQuant is also a purely theoretical paper. I just wanted to get this out and share.

TurboQuant is not a purely theoretical paper. Section 4 "Experiments" (page 15) [0] has a bunch of figures based on actual GPU computations.

[0]: https://arxiv.org/abs/2504.19874

stingraycharles 11 hours ago | parent | prev [-]

TurboQuant went through ICLR review, has multiple Google Research co-authors, open-source implementations, CUDA kernels, and LongBench benchmarks.

Contrast that with your paper: no experiments, no implementation, no empirical validation of any kind.

Did you try engaging with LLM researchers and get their feedback on your paper?