| ▲ | 100721 7 hours ago |
| Does anyone know why they are using language models instead of a more purpose-built statistical model? My intuition is that a language model would either be overfit, or its training data would have a lot of noise unrelated to the application and significantly drive up costs. |
|
| ▲ | LeoWattenberg 7 hours ago | parent | next [-] |
It's not an LLM; it's a purpose-built model: https://arxiv.org/html/2411.19506v1 5 years ago we would've called it a Machine Learning algorithm. 5 years before that, a Big Data algorithm.
| |
| ▲ | IanCal 6 hours ago | parent | next [-] | | We’ve been calling neural nets AI for decades. > 5 years before that, a Big Data algorithm. The DNN part? Absolutely not. I don’t know why people feel the need for such revisionism but AI has been a field encompassing things far more basic than this for longer than most commenters have been alive. | | |
| ▲ | magicalhippo 6 hours ago | parent [-] | | > AI has been a field encompassing things far more basic than this for longer than most commenters have been alive. When I was 13, having just started programming, I picked up a book on Artificial Intelligence from a "junk bin" at a book store. It must have been from the mid-80s, if not older. It had an entire chapter on syllogisms[1] and how to implement a program to spit them out based on user input. As I recall, it basically amounted to some string extraction (assuming the user followed a template) and string concatenation to generate the result. I distinctly recall not being impressed that such a trivial thing was part of a book on AI. [1]: https://en.wikipedia.org/wiki/Syllogism | | |
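For flavor, the template-filling program described above probably looked something like this sketch (the function name and template are my guesses at the classic "Barbara" form, not the book's actual code):

```python
def syllogism(major_term, minor_term, middle_term):
    """Fill the classic Barbara syllogism template:
    All M are P; all S are M; therefore all S are P."""
    return (
        f"All {middle_term} are {major_term}. "
        f"All {minor_term} are {middle_term}. "
        f"Therefore, all {minor_term} are {major_term}."
    )

# e.g. syllogism("mortal", "Greeks", "men")
# -> "All men are mortal. All Greeks are men. Therefore, all Greeks are mortal."
```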
| ▲ | rjh29 6 hours ago | parent [-] | | Eliza was 1960s. In the 1990s I remember taking my friend's IRC chat history and running it through a Markov model to generate drivel, which was really entertaining. |
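The Markov trick mentioned above takes only a few lines; a minimal word-level sketch (not the original 1990s script, obviously):

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each tuple of `order` consecutive words to the words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=20, seed=None):
    """Random-walk the chain to produce plausible-looking drivel."""
    rng = random.Random(seed)
    out = list(rng.choice(list(chain)))       # start at a random key
    order = len(out)
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if not followers:                     # dead end: no observed successor
            break
        out.append(rng.choice(followers))
    return " ".join(out)
```

Feed it a chat log and a higher `order` makes the output locally coherent but more prone to quoting the source verbatim.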
|
| |
| ▲ | t0lo 6 hours ago | parent | prev [-] | | i hate that we're in this linguistic soup when it comes to algorithmic intelligence now. |
|
|
| ▲ | kevmo314 7 hours ago | parent | prev | next [-] |
This might be some journalistic confusion. The CERN documentation at https://twiki.cern.ch/twiki/bin/view/CMSPublic/AXOL1TL2025 states: > The AXOL1TL V5 architecture comprises a VICReg-trained feature extractor stacked on top of a VAE.
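For anyone wondering how a VAE ends up as an anomaly detector at all: one common approach (a toy sketch with stand-in encoder/decoder functions, not the actual AXOL1TL firmware or its scoring scheme) is to score each event by how badly the model reconstructs it:

```python
def anomaly_score(event, encode, decode):
    """Squared reconstruction error: large when the model,
    trained only on typical events, can't reproduce this one."""
    recon = decode(encode(event))
    return sum((x - y) ** 2 for x, y in zip(event, recon))

# Stand-in "model": keeps only the first latent_dim components, so any
# energy outside that learned subspace looks anomalous.
latent_dim = 2
encode = lambda e: e[:latent_dim]
decode = lambda z: list(z) + [0.0] * (4 - latent_dim)

typical = [1.0, 2.0, 0.0, 0.0]  # reconstructs perfectly -> score 0.0
weird   = [1.0, 2.0, 5.0, 5.0]  # lives outside the subspace -> score 50.0
```

Events whose score exceeds a threshold fire the trigger; no labels for "new physics" are ever needed.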
|
| ▲ | dmd 6 hours ago | parent | prev [-] |
| … they’re not? Who said they are? The article even explicitly says they’re not? |
| |