CERN uses tiny AI models burned into silicon for real-time LHC data filtering

intoXbox 3 hours ago | parent | next [-]

They used a custom neural net with autoencoders, which contain convolutional layers. They trained it on previous experiment data.

https://arxiv.org/html/2411.19506v1

Why is it so hard to elaborate what AI algorithm / technique they integrate? Would have made this article much better
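For anyone unfamiliar with the autoencoder idea, here is a rough sketch of how one can act as a filter: train it to reconstruct "normal" data, then flag inputs it reconstructs badly. This is not the model from the paper. It is a toy linear autoencoder with a one-number bottleneck, and the data, the names (`train`, `anomaly_score`) and the hyperparameters are all made up for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reconstruct(w, x):
    """Encode x to a single number z = w.x, then decode it back as z * w."""
    z = dot(w, x)
    return [z * wi for wi in w]

def anomaly_score(w, x):
    """Squared reconstruction error: large when x is unlike the training data."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(w, x)))

def train(data, epochs=20, lr=0.01, seed=0):
    rng = random.Random(seed)
    data = list(data)  # copy so shuffling does not reorder the caller's list
    w = [0.5, 0.5]     # arbitrary starting weights (shared by encoder and decoder)
    for _ in range(epochs):
        rng.shuffle(data)
        for x in data:
            z = dot(w, x)
            r = [xi - z * wi for xi, wi in zip(x, w)]  # residual: x minus its reconstruction
            rw = dot(r, w)
            # One gradient-descent step on the squared reconstruction error.
            w = [wi + 2 * lr * (rw * xi + z * ri) for wi, xi, ri in zip(w, x, r)]
    return w

# Hypothetical "normal" events: points scattered tightly around the line y = 2x.
data_rng = random.Random(42)
normal = []
for _ in range(200):
    t = data_rng.uniform(-1, 1)
    normal.append([t + data_rng.gauss(0, 0.05), 2 * t + data_rng.gauss(0, 0.05)])

w = train(normal)
typical = anomaly_score(w, [0.5, 1.0])   # lies on the learned line
unusual = anomaly_score(w, [1.0, -0.5])  # lies far off it
```

The point is only that `unusual` comes out much larger than `typical`, so a threshold on the score separates the two. A real trigger would use a far richer network and fixed-point arithmetic.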

▲

dcanelhas 3 hours ago | parent | next [-]

I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

▲

plasino a minute ago | parent | next [-]

Having worked with people who do that, I can guarantee that’s not the case. See https://ssummers.web.cern.ch/conifer/ and hls4ml; these run BDTs and CNNs.

▲

ninjagoo 2 hours ago | parent | prev | next [-]

> I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

Already the case with consulting companies, have seen it myself

▲

blitzar 2 hours ago | parent | prev | next [-]

I'm half expecting to see "AI model" appearing as stand-in for "if > 0" at this point in the cycle.

	▲	Foobar8568 20 minutes ago | parent [-]
		This is why I now program in OCaml: the files themselves are AI (.ml).

▲

phire 3 hours ago | parent | prev | next [-]

I'm sure I've seen basic hill climbing (and other optimisation algorithms) described as AI, and then used as evidence of AI solving real-world science/engineering problems.

▲

LiamPowell 2 hours ago | parent [-]

Historically this was very much in the field of AI, which is such a massive field that saying something uses AI is about as useful as saying it uses mathematics. Since the term was first coined it's been constantly misused to refer to much more specific things.

From around when the term was first coined: "artificial intelligence research is concerned with constructing machines (usually programs for general-purpose computers) which exhibit behavior such that, if it were observed in human activity, we would deign to label the behavior 'intelligent.'" [1]

[1]: https://doi.org/10.1109/TIT.1963.1057864

▲

zingar 2 hours ago | parent [-]

That definition moves the goalposts almost by definition: people only stopped thinking that chess demonstrated intelligence when computers started doing it.

	▲	Eufrat 2 hours ago | parent [-]
		The term artificial intelligence has always been just a buzzword designed to sell whatever it needed to. IMHO, it has no meaningful value outside of a good marketing term. John McCarthy is usually the person who is given credit for coming up with the name and he has admitted in interviews that it was just to get eyeballs for funding.

▲

yread an hour ago | parent | prev [-]

And why not, when linear regression works, it works so well it's basically magic, better than intelligence, artificial or otherwise

▲

etrautmann 2 hours ago | parent | prev | next [-]

It seems like most of the implementation is FPGA, which I wouldn’t call “physically burned into silicon.” That’s quite a stretch of language

▲

vultour 2 hours ago | parent | prev [-]

Because if it’s not an LLM it’s not good for the current hype cycle. Calling everything AI makes the line go up.

	▲	danielbln a few seconds ago | parent [-]
		LLMs also make the cynicism go up among the HN crowd.

▲

serendipty01 4 hours ago | parent | prev | next [-]

Might be related: https://www.youtube.com/watch?v=T8HT_XBGQUI (Big Data and AI at the CERN LHC by Dr. Thea Klaeboe Aarrestad)

https://www.youtube.com/watch?v=8IZwhbsjhvE (From Zettabytes to a Few Precious Events: Nanosecond AI at the Large Hadron Collider by Thea Aarrestad)

Page: https://www.scylladb.com/tech-talk/from-zettabytes-to-a-few-...

▲

konradha an hour ago | parent | prev | next [-]

How are FPGAs "burned into silicon"? Would be news to me that there are ASICs being taped out at CERN.

	▲	eqvinox 42 minutes ago | parent | next [-]
		CERN in fact does design custom ASICs for other things: https://indico.cern.ch/event/1115079/contributions/4693643/a... (Probably not for this here though.)
	▲	danparsonson 43 minutes ago | parent | prev [-]
		Could they... have someone else do it for them?

▲

quantum_state 42 minutes ago | parent | prev | next [-]

CERN has been doing HEP experiments for decades. What did it use before the current incarnation of AI? The AI label seems to be more marketing and superficial than substantial. It’s a bit sad that a place like CERN feels the need to make it public that it is on the bandwagon.

▲

eqvinox 41 minutes ago | parent [-]

It doesn't say LLM anywhere.

	▲	quantum_state 37 minutes ago | parent [-]
		Good catch. Corrected. Thanks!

▲

TORcicada 23 minutes ago | parent | prev | next [-]

Thanks for the thoughtful comments and links; we really appreciate the high-signal feedback. We've updated the article to better reflect the actual VAE-based AXOL1TL architecture (variational autoencoder for anomaly detection), and added the arXiv paper and Thea Aarrestad's talks to the Primary Sources.

▲

quijoteuniv 4 hours ago | parent | prev | next [-]

A bit of hype in the AI wording here. This could be called a chip with hardcoded logic obtained with machine learning

▲

FartyMcFarter 3 hours ago | parent | next [-]

AI is not a new thing, and machine learned logic definitely counts as AI.

	▲	monkeydust 3 hours ago | parent | next [-]
		For those that have experience with ML, yes. Those who have only recently become acquainted with it (more on the business side) seem to really struggle with this, in my experience.
	▲	volemo 2 hours ago | parent | prev [-]
		Yeah, and don’t forget Eliza!

▲

killingtime74 4 hours ago | parent | prev [-]

Is a LLM logic in weights derived from machine learning?

▲

shlewis 3 hours ago | parent | next [-]

Well, yes. That's literally what it is.

▲

dmd 3 hours ago | parent [-]

What what is? The article has nothing to do with LLMs. It even explicitly says they don’t use LLMs.

	▲	shlewis an hour ago | parent [-]
		> Is a LLM logic in weights derived from machine learning?
		I was just answering this question. LLM logic in weights is fundamentally from machine learning, so yes. Wasn't really saying anything about the article.

▲

quijoteuniv 3 hours ago | parent | prev [-]

Good one… but is a DB query filter AI? I forgot to say, though, that it sounds like a really cool thing to do.

▲

stingraycharles 3 hours ago | parent [-]

Strictly speaking, expert systems are AI as well, as in, an expert comes up with a bunch of if/else rules. So yes technically speaking even if they didn’t acquire the weights using ML and hand-coded them, it could still be called AI.

▲

phire 3 hours ago | parent [-]

It is 100% valid to label an algorithm that plays tic-tac-toe as "AI"

Much of the early AI research was spent on developing various algorithms that could play board games.

Didn't even need computers, one early AI was MENACE [1], a set of 304 matchboxes which could learn how to play noughts and crosses.

[1] https://en.wikipedia.org/wiki/Matchbox_Educable_Noughts_and_...

	▲	stingraycharles 2 hours ago | parent [-]
		Yup this is exactly my point, in the 80s there were plenty of “AI” companies and “fuzzy logic” was the buzzword of the day.

▲

armcat an hour ago | parent | prev | next [-]

Not on the same extreme level, but I know that some coffee machines use a tiny CNN based model locally/embedded. There is a small super cheap camera integrated in the coffee machine, and the model does three things: (1) classifies the container type in order to select type of coffee, (2) image segmentation - to determine where the cup/hole is placed, (3) regression - to determine the volume and regulate how much coffee to pour.

▲

Surac 2 hours ago | parent | prev | next [-]

Very important! This is not an LLM like the ones so often called AI these days. It's a neural network in an FPGA.

	▲	duskdozer an hour ago | parent | next [-]
		I guess that shows the LLM companies' marketing worked very well, because that's what I immediately thought of.
	▲	IshKebab 2 hours ago | parent | prev [-]
		> FPGA
		So they aren't "burned into silicon" then? The article mentions FPGAs and ASICs but it's a bit vague. I would be surprised if ASICs actually made sense here.

▲

WhyNotHugo 3 hours ago | parent | prev | next [-]

Intuitively, I’ve always had the impression that using an analogue circuit would be feasible for neural networks (they're just matrix multiplication!). These should provide near-instantaneous output.

Isn’t this kind of approach feasible for something so purpose-built?
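To make the "just matrix multiplication" remark concrete, here is a minimal sketch of one dense layer in plain Python. The weights, bias and input are invented for illustration. Note that a layer is a matrix-vector product plus a bias, followed by a nonlinearity (ReLU here), so any hardware implementation has to provide that nonlinearity as well as the multiply-accumulate step.

```python
def dense_relu(weights, bias, x):
    """One layer: y = relu(W x + b), with W given as a list of rows."""
    outputs = []
    for row, b in zip(weights, bias):
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + b
        outputs.append(max(0.0, pre_activation))  # ReLU clips negatives to zero
    return outputs

# Hypothetical 2x2 layer.
W = [[1.0, -2.0],
     [0.5, 1.0]]
b = [0.0, -1.0]
y = dense_relu(W, b, [2.0, 1.0])  # -> [0.0, 1.0]
```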

	▲	incognito124 2 hours ago | parent [-]
		You might wanna look at https://taalas.com/

▲

rakel_rakel 4 hours ago | parent | prev | next [-]

Hey Siri, show me an example of an oxymoron!

> CERN is using extremely small, custom large language models physically burned into silicon chips to perform real-time filtering of the enormous data generated by the Large Hadron Collider (LHC).

▲

sh3rl0ck 4 hours ago | parent | next [-]

There's no mention of SLMs or LLMs, though.

> This work represents a compelling real-world demonstration of “tiny AI” — highly specialised, minimal-footprint neural networks

FPGAs for neural networks have been a thing since before the LLM era.

▲

100721 4 hours ago | parent [-]

Huh? The first paragraph literally says they are using LLMs

> [ GENEVA, SWITZERLAND — March 28, 2026 ] — CERN is using extremely small, custom large language models physically burned into silicon chips to perform real-time filtering of the enormous data generated by the Large Hadron Collider (LHC).

▲

SiempreViernes 4 hours ago | parent [-]

The site might have fixed it; to me it says "artificial intelligence" instead of LLM. Still bad, but not "steaming pile of poo on your bank statement" bad.

	▲	progval 2 hours ago | parent [-]
		They changed it from AI to LLM then back to AI: https://theopenreader.org/index.php?title=Journalism:CERN_Us... and https://theopenreader.org/index.php?title=Journalism%3ACERN_...

▲

msla 4 hours ago | parent | prev [-]

Are they some ancient small-scale integration VLSI design? Do they broadcast on a low-frequency VHF band? Face it: Oxymorons like those are part of the technical world. "VLSI" was a current term back when whole CPUs were made out of fewer transistors than we use for register files now, and "VHF" is low frequency even by commercial broadcasting standards.

	▲	rakel_rakel 4 hours ago | parent [-]
		haha, yea they are part of it for sure, and I'm not dunking on the use of them, but I rather smile a bit when I stumble upon them. Like (~9K) Jumbo Frames!

▲

Janicc 3 hours ago | parent | prev | next [-]

I think chips having a single LLM directly on them will be very common once LLMs have matured/reached a ceiling.

▲

v9v 3 hours ago | parent | prev | next [-]

Do they actually have ASICs or just FPGAs? The article seems a bit unclear.

▲

mentalgear 3 hours ago | parent | prev | next [-]

That's what Groq did as well: burning the Transformer right onto a chip (I have to say I was impressed by the simplicity, but afterwards less so by their controversial Kushner/Saudi investment).

	▲	NitpickLawyer 3 hours ago | parent [-]
		> That's what Groq did as well: burning the Transformer right onto a chip
		Are you perhaps confusing Groq with the Etched approach? IIUC Etched is the company that "burned the transformer onto a chip". Groq uses LPUs that are more generalist (they can run many transformers and some other architectures) and their speed comes from using SRAM.

▲

nerolawa 3 hours ago | parent | prev | next [-]

the fact that 99% of LHC data is just gone forever is insane

▲

seydor 3 hours ago | parent | prev | next [-]

cern has been using neural networks for decades

▲

randomNumber7 4 hours ago | parent | prev | next [-]

Does string theory finally make sense when we add AI hallucinations?

	▲	quantum_state 33 minutes ago | parent [-]
		This is a good one

▲

amelius 2 hours ago | parent | prev | next [-]

When is the price of fabbing silicon coming down, so every SMB can do it?

	▲	IshKebab 2 hours ago | parent [-]
		My guess would be never. The closest you can get is "multi project wafers" where you get bundled with a load of other projects. As I understand it they're on the order of $100k which is cheap, but if you actually want to design and verify a chip you're looking at at least several million in salaries and software costs. Probably more like $10m, especially if you're paying US salaries. And of course that would be for a low performance design. I think a better question would be "when are FPGAs going to stop being so ridiculously overpriced". That feels more possible to me (but still unlikely).

▲

100721 4 hours ago | parent | prev [-]

Does anyone know why they are using language models instead of a more purpose-built statistical model? My intuition is that a language model would either be overfit, or its training data would have a lot of noise unrelated to the application and significantly drive up costs.

▲

LeoWattenberg 4 hours ago | parent | next [-]

It's not an LLM, it is a purpose built model. https://arxiv.org/html/2411.19506v1

5 years ago we would've called it a Machine Learning algorithm. 5 years before that, a Big Data algorithm.

▲

IanCal 3 hours ago | parent | next [-]

We’ve been calling neural nets AI for decades.

> 5 years before that, a Big Data algorithm.

The DNN part? Absolutely not.

I don’t know why people feel the need for such revisionism but AI has been a field encompassing things far more basic than this for longer than most commenters have been alive.

▲

magicalhippo 3 hours ago | parent [-]

> AI has been a field encompassing things far more basic than this for longer than most commenters have been alive.

When I was 13, having just started programming, I picked up a book from a "junk bin" at a book store on Artificial Intelligence. It must have been from the mid-80s if not older.

It had an entire chapter on syllogisms[1] and how to implement a program to spit them out based on user input. As I recall, it basically amounted to some string extraction, assuming the user followed a template, and string concatenation to generate the result. I distinctly recall not being impressed about such a trivial thing being part of a book on AI.

[1]: https://en.wikipedia.org/wiki/Syllogism
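For flavour, a minimal sketch of that kind of program: template matching on two premises and string concatenation for the conclusion. This is not the book's code. The templates ("Every X is Y", "Z is a X") and the function name `conclude` are assumptions made for illustration.

```python
import re

# Hypothetical input templates; anything else is rejected.
MAJOR = re.compile(r"every (\w+) is (.+)", re.IGNORECASE)
MINOR = re.compile(r"(\w+) is an? (\w+)", re.IGNORECASE)

def conclude(major, minor):
    """Turn 'Every X is Y' plus 'Z is a X' into 'Therefore, Z is Y.'"""
    major_match = MAJOR.fullmatch(major.strip().rstrip("."))
    minor_match = MINOR.fullmatch(minor.strip().rstrip("."))
    if not major_match or not minor_match:
        return None  # input did not follow the template
    category, predicate = major_match.groups()
    subject, subject_category = minor_match.groups()
    if category.lower() != subject_category.lower():
        return None  # the premises do not share a middle term
    return f"Therefore, {subject} is {predicate}."

result = conclude("Every man is mortal.", "Socrates is a man.")
```

There is no reasoning beyond checking that the middle term matches, which is presumably why it felt underwhelming as "AI".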

	▲	rjh29 3 hours ago | parent [-]
		Eliza was 1960s. In the 1990s I remember taking my friend's IRC chat history and running it through a Markov model to generate drivel, which was really entertaining.
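A word-level Markov text generator of the kind described fits in a few lines. This is a sketch under assumptions: the "chat log" below is a made-up sentence, and `build_chain` and `generate` are illustrative names.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=None):
    """Random walk: repeatedly pick one of the words seen after the current word."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: nothing ever followed this word
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

log = "the cat sat on the mat and the dog sat on the cat"
chain = build_chain(log)
sentence = generate(chain, "the", 8, seed=1)
```

Every adjacent pair of words in `sentence` occurred somewhere in the source text, which is why the output is locally plausible and globally drivel.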

▲

t0lo 4 hours ago | parent | prev [-]

I hate that we're in this linguistic soup when it comes to algorithmic intelligence now.

▲

kevmo314 4 hours ago | parent | prev | next [-]

This might be some journalistic confusion. If you go to the CERN documentation at https://twiki.cern.ch/twiki/bin/view/CMSPublic/AXOL1TL2025 it states

> The AXOL1TL V5 architecture comprises a VICReg-trained feature extractor stacked on top of a VAE.

▲

dmd 3 hours ago | parent | prev [-]

… they’re not? Who said they are? The article even explicitly says they’re not?

	▲	progval 2 hours ago | parent [-]
		For 40 minutes, the article claimed they used LLMs. They changed the wording twice: https://theopenreader.org/index.php?title=Journalism:CERN_Us... and https://theopenreader.org/index.php?title=Journalism%3ACERN_...