▲ thethirdone | 12 hours ago
> the ratio remains approximately 914x over TurboQuant, with compression improving rather than degrading as context length grows.

This line from the abstract made me really suspicious. Obviously a compression scheme that incorporates the entire sequence shouldn't get worse, relative to a per-element one, as the length increases.

It is important to note that this paper is PURELY theoretical. I couldn't find much meat on the bone from a quick skim.

The single author, Gregory Magarshak, has only published one paper on arXiv before and appears to be a professor of business / music. I don't plan to give it more of a read in the hope of finding something of value.
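To make that point concrete with a toy sketch (the per-token budget and the O(log n) sequence-level cost below are invented for illustration and are not taken from either paper): any scheme whose total cost grows sublinearly in the sequence length will, almost by construction, show a compression ratio over a fixed per-element quantizer that improves as the context grows.

```python
import math

def per_element_bits(n, bits_per_token=4):
    # A per-token quantizer (TurboQuant-style) spends a fixed budget
    # per element, so its total cost is linear in sequence length n.
    return n * bits_per_token

def whole_sequence_bits(n, c=64):
    # Hypothetical whole-sequence scheme with an assumed O(log n)
    # total cost -- a toy model, not the paper's actual construction.
    return c * math.log2(n)

for n in (1_000, 100_000, 10_000_000):
    ratio = per_element_bits(n) / whole_sequence_bits(n)
    print(f"n={n:>10,}: ratio ~ {ratio:,.0f}x")
```

So a ratio that "improves with context length" is exactly what you'd expect from comparing a sublinear-cost scheme against a linear-cost one; it isn't evidence of anything on its own.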
▲ stingraycharles | 12 hours ago | parent | next [-]
Me neither. There are no actual experiments or data, no peer review, and the "innovation" relies almost entirely on citations of the author's other paper.

The author is not an ML researcher but rather an AI startup CTO / founder. Previously worked on "social operating systems" for the web, then blockchain of course, and now an AI innovator. I'm suspicious.

This was part of the author's reply in another thread:

> When TurboQuant came out, I realized we can also go way below the Shannon limit in the same way, and take advantage of PLT. In fact, I'm working on publishing a paper that generalizes this to robotics (which needs to do cheap fast on-board inference "in the field"). I also believe this is how animals actually learn. In other words, over time they learn overall "sequences" of actions and then can check whether they are "good enough" to solve the problem, or whether to switch to a full analysis -- this corresponds to System 1 and 2 of Daniel Kahneman's "Thinking, Fast and Slow".

Which doesn't exactly inspire confidence, and makes me wonder who they think their audience is: ML researchers or LinkedIn?
▲ gaze | 12 hours ago | parent | prev | next [-]
The irritating thing about LLM-generated papers like these is that they're wrong, but they're generated by LLMs capable enough to bury the absurd claim pretty deep in there.
▲ EGreg | 12 hours ago | parent | prev [-]
You're right, I'm not a well-known researcher, simply an entrepreneur who has started to publish academic papers. However, I do have a long history of diving deep into fields and building practical, open-source solutions to major problems I perceive in them.

15 years ago I started with social networks and PHP:
https://github.com/Qbix
http://laweekly.com/restoring-healthy-communities/

8 years ago I got into smart contracts on the EVM, which was the SOTA at the time:
https://github.com/Intercoin
https://intercoin.org/applications

About a year and a half ago I started teaching a course on AI at a university not far from NYU, where I studied... and that's what got me into this:
https://vimeo.com/1063008765/c7ef3abcc5

I try to document everything on GitHub and in popular articles, but only recently started publishing academic papers on arXiv, and I plan to actually start submitting them for real publication. While I build, I realized that I should publish any novel theoretical results that underpin my work. I plan to publish actual code in a few weeks.

To be fair, TurboQuant is also a purely theoretical paper. I just wanted to get this out and share.