100721 7 hours ago

Does anyone know why they are using language models instead of a more purpose-built statistical model? My intuition is that a language model would either be overfit, or its training data would have a lot of noise unrelated to the application and significantly drive up costs.

LeoWattenberg 7 hours ago | parent | next [-]

It's not an LLM, it's a purpose-built model. https://arxiv.org/html/2411.19506v1

5 years ago we would've called it a Machine Learning algorithm. 5 years before that, a Big Data algorithm.

IanCal 6 hours ago | parent | next [-]

We’ve been calling neural nets AI for decades.

> 5 years before that, a Big Data algorithm.

The DNN part? Absolutely not.

I don’t know why people feel the need for such revisionism but AI has been a field encompassing things far more basic than this for longer than most commenters have been alive.

magicalhippo 6 hours ago | parent [-]

> AI has been a field encompassing things far more basic than this for longer than most commenters have been alive.

When I was 13, having just started programming, I picked up a book from a "junk bin" at a book store on Artificial Intelligence. It must have been from the mid-80s if not older.

It had an entire chapter on syllogisms[1] and how to implement a program to spit them out based on user input. As I recall, it basically amounted to some string extraction, assuming the user followed a template, plus string concatenation to generate the result. I distinctly recall not being impressed that such a trivial thing was part of a book on AI.

[1]: https://en.wikipedia.org/wiki/Syllogism
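
For illustration, the template-plus-concatenation approach described above might look roughly like the sketch below. The "All X are Y" template, the regex, and the function names `parse_premise` and `conclude` are assumptions made up for this example; nothing here is taken from the book.

```python
import re

# Assumed (hypothetical) input template: "All <subject> are <predicate>."
PREMISE = re.compile(r"^\s*all\s+(.+?)\s+are\s+(.+?)\s*\.?\s*$", re.IGNORECASE)


def parse_premise(text):
    """Extract (subject, predicate) from a sentence that follows the template."""
    match = PREMISE.match(text)
    if match is None:
        raise ValueError("premise does not follow the template: " + repr(text))
    return match.group(1), match.group(2)


def conclude(major, minor):
    """Combine 'All B are C' (major) and 'All A are B' (minor) into 'All A are C'.

    Returns None when the two premises do not share a middle term.
    """
    middle, predicate = parse_premise(major)
    subject, middle_again = parse_premise(minor)
    if middle.lower() != middle_again.lower():
        return None
    # Plain string concatenation builds the conclusion.
    return "All " + subject + " are " + predicate + "."


print(conclude("All men are mortal.", "All Greeks are men."))
```

There is no reasoning beyond matching the middle term as a string, which is the sense in which the technique is trivial.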

rjh29 6 hours ago | parent [-]

Eliza was 1960s.

In the 1990s I remember taking my friend's IRC chat history and running it through a Markov model to generate drivel, which was really entertaining.
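
A minimal sketch of that kind of word-level Markov text generator, assuming the chat log is available as a list of message strings. The names `build_chain` and `generate` and the sample lines are hypothetical, chosen only for this illustration.

```python
import random
from collections import defaultdict


def build_chain(lines, order=1):
    """Map each run of `order` consecutive words to the words seen right after it."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, emitting up to `length` words."""
    rng = random.Random(seed)
    if not chain:
        return ""
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was only seen at the end of a line
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Made-up sample messages standing in for a chat history.
log = [
    "anyone here know how to fix this",
    "how to fix this is anyone's guess",
    "this is how we fix things here",
]
print(generate(build_chain(log), length=12, seed=1))
```

Because each next word depends only on the previous `order` words, the output is locally plausible and globally incoherent, which is what makes it read as drivel.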

t0lo 6 hours ago | parent | prev [-]

I hate that we're in this linguistic soup when it comes to algorithmic intelligence now.

kevmo314 7 hours ago | parent | prev | next [-]

This might be some journalistic confusion. If you go to the CERN documentation at https://twiki.cern.ch/twiki/bin/view/CMSPublic/AXOL1TL2025 it states

> The AXOL1TL V5 architecture comprises a VICReg-trained feature extractor stacked on top of a VAE.

dmd 6 hours ago | parent | prev [-]

… they’re not? Who said they are? The article even explicitly says they’re not?

progval 5 hours ago | parent [-]

For 40 minutes, the article claimed they used LLMs. They changed the wording twice: https://theopenreader.org/index.php?title=Journalism:CERN_Us... and https://theopenreader.org/index.php?title=Journalism%3ACERN_...