intoXbox 2 days ago

They used a custom neural net with autoencoders, which contain convolutional layers. They trained it on previous experiment data.

https://arxiv.org/html/2411.19506v1
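Autoencoders in this kind of setup are typically used for reconstruction-error anomaly detection: train on ordinary events, then flag events the model reconstructs poorly. A minimal sketch of that general idea (a linear autoencoder, which is equivalent to PCA; this is not their model, and all data here is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# "previous experiment data": ordinary events lie near a 1-D subspace of 2-D
t = rng.normal(size=500)
X = np.column_stack([t, 2 * t]) + 0.05 * rng.normal(size=(500, 2))
mean = X.mean(axis=0)

# a linear autoencoder with a 1-unit bottleneck learns the same subspace as
# PCA, so use the top right-singular vector as a stand-in for training one
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:1]                      # encoder and decoder share this 1-D basis

def anomaly_score(event):
    """Reconstruction error: large for events unlike the training data."""
    x = np.asarray(event, float) - mean
    recon = (x @ basis.T) @ basis   # encode to 1-D, then decode back to 2-D
    return float(((x - recon) ** 2).sum())
```

An event on the learned manifold (e.g. `[1, 2]`) reconstructs almost perfectly, while one far off it (e.g. `[2, -4]`) gets a large score.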

Why is it so hard to elaborate on which AI algorithm / technique they integrated? It would have made this article much better.

dcanelhas 2 days ago | parent | next [-]

I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

ninjagoo 2 days ago | parent | next [-]

> I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

Already the case with consulting companies, have seen it myself

idiotsecant 2 days ago | parent [-]

Some career do-nothing-but-make-noise in my organization hired a firm to 'do AI' on some shitty data, and the outcome was basically linear regression. It turns out that you can impress executives with linear regression if you deliver it enthusiastically enough.

tasuki 2 days ago | parent | next [-]

Tbh, often enough, linear regression is exactly what is needed.

idiotsecant 2 days ago | parent [-]

Yes, and we do it every day and call it 'linear regression', and we don't need a data center full of expensive toys to do it.

ozim 2 days ago | parent | prev [-]

Not everyone knows everything so knowledge is the new oil.

I do know about linear regression; I even had quite a bit of it at university.

But I still wouldn't be able to just apply it to some data without a good couple of days to weeks of figuring things out, including which tools to use so I don't have to implement it from scratch.

idiotsecant a day ago | parent [-]

Implement it... from scratch? It's literally least-squares regression. It's a few lines of code. What are you trying to say here?
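To illustrate, here's an ordinary least-squares fit in a few lines of NumPy (toy data, made-up numbers):

```python
import numpy as np

# toy data: y ≈ 3x + 1 plus a little noise
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=50)
y = 3 * x + 1 + rng.normal(scale=0.5, size=50)

# design matrix with an intercept column, then the least-squares solution
X = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(X, y, rcond=None)[0]
# slope and intercept land close to the true 3 and 1
```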

ozim a day ago | parent [-]

You have to get the data first and build all the data-processing pipelines to get your parameters for the linear regression.

blitzar 2 days ago | parent | prev | next [-]

I'm half expecting to see "AI model" appearing as stand-in for "if > 0" at this point in the cycle.

Foobar8568 2 days ago | parent | next [-]

This is why I am programming in OCaml now; the files themselves are AI (.ml).

srean 2 days ago | parent [-]

I am sure you did not forget that pattern matching.

Vetch 2 days ago | parent | prev [-]

This is essentially what any ReLU-based neural network approximately looks like (smoother variants have since replaced the original ramp function). AI, even LLMs, essentially reduces to a bunch of code like:

    let v0 = 0
    let v1 = 0.40978399*(0.616*u + 0.291*v)
    let v2 = if 0 > v1 then 0 else v1

    let v3 = 0
    let v4 = 0.377928*(0.261*u + 0.468*v)
    let v5 = if 0 > v4 then 0 else v4...
samrus 2 days ago | parent [-]

That's a bit far. ReLU does check x > 0, but that's just the one non-linearity in the linear/non-linear sandwich behind the universal approximation theorem. It's more complex than just x > 0.

Vetch 2 days ago | parent | next [-]

The ReLU/if-then-else is in fact centrally important, since it enables computations with complex control flow (more precisely, conditional signal flow or gating), particularly as you add more layers.

greenavocado 2 days ago | parent | prev [-]

Multiply-accumulate, then clamp negative values to zero. Every even-numbered variable is a weighted sum plus a bias (an affine transformation), and every odd-numbered variable is the ReLU gate (max(0, x)). Layer 2 feeds on the ReLU outputs of layer 1, and the final output is a plain linear combination of the last ReLU outputs.

    // inputs: u, v
    // --- hidden layer 1 (3 neurons) ---
    let v0  = 0.616*u + 0.291*v - 0.135
    let v1  = if 0 > v0 then 0 else v0
    let v2  = -0.482*u + 0.735*v + 0.044
    let v3  = if 0 > v2 then 0 else v2
    let v4  = 0.261*u - 0.553*v + 0.310
    let v5  = if 0 > v4 then 0 else v4
    // --- hidden layer 2 (2 neurons) ---
    let v6  = 0.410*v1 - 0.378*v3 + 0.528*v5 + 0.091
    let v7  = if 0 > v6 then 0 else v6
    let v8  = -0.194*v1 + 0.617*v3 - 0.291*v5 - 0.058
    let v9  = if 0 > v8 then 0 else v8
    // --- output layer (binary classification) ---
    let v10 = 0.739*v7 - 0.415*v9 + 0.022
    // sigmoid squashing v10 into the range (0, 1)
    let out = 1 / (1 + exp(-v10))
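Transcribed into runnable Python (same weights as the pseudocode above; the weights themselves are of course illustrative):

```python
import math

def tiny_mlp(u, v):
    """Direct transcription of the pseudocode above."""
    relu = lambda x: max(0.0, x)
    # hidden layer 1
    v1 = relu(0.616 * u + 0.291 * v - 0.135)
    v3 = relu(-0.482 * u + 0.735 * v + 0.044)
    v5 = relu(0.261 * u - 0.553 * v + 0.310)
    # hidden layer 2
    v7 = relu(0.410 * v1 - 0.378 * v3 + 0.528 * v5 + 0.091)
    v9 = relu(-0.194 * v1 + 0.617 * v3 - 0.291 * v5 - 0.058)
    # output layer, squashed into (0, 1) by the sigmoid
    v10 = 0.739 * v7 - 0.415 * v9 + 0.022
    return 1 / (1 + math.exp(-v10))
```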
GeorgeTirebiter 2 days ago | parent [-]

    let v0 = 0.616*u + 0.291*v - 0.135
    let v1 = if 0 > v0 then 0 else v0

is there something 'less good' about:

    let v1  = if v0 < 0 then 0 else v0 
Am I the only one who stutter-parses "0 > value" vs my counterexample?

Are Yoda conditions somehow better?

Shouldn't we write:

    let v1 = max 0 v0
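For what it's worth, all three spellings compute the same thing; a quick Python sanity check (function names are mine):

```python
def relu_yoda(x):
    return 0 if 0 > x else x       # the "0 > v0" spelling

def relu_plain(x):
    return 0 if x < 0 else x       # the "v0 < 0" spelling

def relu_max(x):
    return max(0, x)               # the "max 0 v0" spelling

# all agree on negative, zero, and positive inputs
for x in (-1.5, 0.0, 2.3):
    assert relu_yoda(x) == relu_plain(x) == relu_max(x)
```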

phire 2 days ago | parent | prev | next [-]

I'm sure I've seen basic hill climbing (and other optimisation algorithms) described as AI, and then used as evidence of AI solving real-world science/engineering problems.

LiamPowell 2 days ago | parent | next [-]

Historically this was very much in the field of AI, which is such a massive field that saying something uses AI is about as useful as saying it uses mathematics. Since the term was first coined it's been constantly misused to refer to much more specific things.

From around when the term was first coined: "artificial intelligence research is concerned with constructing machines (usually programs for general-purpose computers) which exhibit behavior such that, if it were observed in human activity, we would deign to label the behavior 'intelligent.'" [1]

[1]: https://doi.org/10.1109/TIT.1963.1057864

zingar 2 days ago | parent [-]

That definition moves the goalposts almost by definition: people only stopped thinking that chess demonstrated intelligence when computers started doing it.

Eufrat 2 days ago | parent [-]

The term artificial intelligence has always been just a buzzword designed to sell whatever it needed to sell. IMHO, it has no meaningful value outside of being a good marketing term. John McCarthy is usually credited with coming up with the name, and he admitted in interviews that it was just to get eyeballs for funding.

coherentpony 2 days ago | parent | prev [-]

I am somewhat cynically waiting for the AI community to rediscover the last half a century of linear algebra and optimisation techniques.

At some point someone will realise that backpropagation and adjoint solves are the same thing.
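As a toy illustration of that equivalence (my example, not a general proof): for f(x) = ||Ax - b||², the adjoint expression 2 Aᵀ(Ax - b) and a manual reverse-mode (backprop) pass give the same gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
x = rng.normal(size=3)

# loss f(x) = ||A x - b||^2, with residual r = A x - b
r = A @ x - b

# adjoint form of the gradient: apply A^T to the residual
grad_adjoint = 2 * A.T @ r

# reverse-mode (backprop): propagate sensitivities backwards
dloss_dr = 2 * r                 # through loss = r . r
grad_backprop = A.T @ dloss_dr   # through r = A x - b
```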

bonoboTP 2 days ago | parent | next [-]

There are plenty of smart people in the "AI community" already who know it. Smugly commenting does not replace actual work. If you have real insight and can make something perform better, I guarantee you that many people will listen (I don't mean twitter influencers but the actual field). If you don't know any serious researcher in AI, I have my doubts that you have any insight to offer.

whattheheckheck 2 days ago | parent | prev [-]

I am sure they are aware...

thesz 2 days ago | parent | prev | next [-]

There is the HIGGS dataset [1]. As the name suggests, it is designed for applying machine learning to recognize the Higgs boson.

[1] https://archive.ics.uci.edu/ml/datasets/HIGGS

In my experiments, linear regression with extended attributes (adding squared values) is very much competitive, in accuracy terms, with the reported MLP accuracy.
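The extended-attributes trick, sketched on synthetic data (not HIGGS; the features and label here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))              # stand-in features
y = (X[:, 0] ** 2 > 1).astype(float)        # label quadratic in feature 0

# extend the attributes with their squared values, then a plain
# least-squares fit thresholded at 0.5
X_ext = np.hstack([X, X ** 2, np.ones((len(X), 1))])
w_ext, *_ = np.linalg.lstsq(X_ext, y, rcond=None)
acc_ext = ((X_ext @ w_ext > 0.5) == (y > 0.5)).mean()

# the same fit without the squared terms cannot see the quadratic
# dependence and does much worse
X_lin = np.hstack([X, np.ones((len(X), 1))])
w_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
acc_lin = ((X_lin @ w_lin > 0.5) == (y > 0.5)).mean()
```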

dguest 2 days ago | parent [-]

The LHC has moved on a bit since then. Here's an open dataset that one collaboration used to train a transformer:

https://opendata-qa.cern.ch/record/93940

If you can beat it with linear regression, we'd be happy to know.

thesz a day ago | parent [-]

Thanks.

The paper [1] referenced in your link follows the legacy of the paper on the HIGGS dataset, and does not report quantities like accuracy and/or perplexity. The HIGGS dataset paper provided area under the ROC curve, from which one had to approximate accuracy. I used the accuracy from the ADMM paper [2] to compare my results with. As I checked later, the area under ROC in [1] mostly agrees with the SGD training results on HIGGS in [2].

  [1] https://arxiv.org/pdf/2505.19689
  [2] https://proceedings.mlr.press/v48/taylor16.pdf

I think that a perplexity measure is appropriate there in [1], because we need to discern between three outcomes. This calls for softmax, and for perplexity as a standard measure.

So, my questions are: 1) what perplexity should I target when dealing with the "mc-flavtag-ttbar-small" dataset? And 2) what is the train/validate/test split ratio there?

dguest 15 hours ago | parent [-]

For better or worse the people working on this don't really use perplexity or accuracy to evaluate models. The target is whatever you'd get for those metrics if you used the discriminants that were provided in the dataset (i.e. the GN2v01 values).

As for why accuracy and perplexity aren't reported: the experiments generally choose a threshold to consider something a "b-hadron" (basically picking a point along the ROC curve) and quantify the TPR and FPR at that point. There are reasons for this, mostly that picking a standard point lets them verify that the simulation actually reflects data. See, for example, the FPR [1] and TPR [2] "calibrations".
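The working-point procedure described above, as a toy sketch (the scores and numbers are invented; real analyses use the experiment's discriminants):

```python
import numpy as np

rng = np.random.default_rng(0)
# made-up discriminant scores for signal ("b-hadron") and background
sig = rng.normal(1.0, 1.0, size=10_000)
bkg = rng.normal(-1.0, 1.0, size=10_000)

# pick a point on the ROC curve: the threshold giving a fixed signal
# efficiency (TPR), then quantify the background rate (FPR) there
target_tpr = 0.70
thresh = np.quantile(sig, 1 - target_tpr)
tpr = float((sig > thresh).mean())   # ~= target by construction
fpr = float((bkg > thresh).mean())   # the number a calibration would check
```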

It's a good point, though; the physicists should probably try harder to report the standard metrics that the rest of the ML community uses.

[1]: https://arxiv.org/pdf/2301.06319

[2]: https://arxiv.org/abs/1907.05120

yread 2 days ago | parent | prev | next [-]

And why not? When linear regression works, it works so well it's basically magic; better than intelligence, artificial or otherwise.

plasino 2 days ago | parent | prev | next [-]

Having worked with people who do that, I can guarantee that's not the case. See https://ssummers.web.cern.ch/conifer/ and hls4ml; these run BDTs and CNNs.

Staross 2 days ago | parent | prev [-]

That works well to get around patents btw :)

etrautmann 2 days ago | parent | prev | next [-]

It seems like most of the implementation is on an FPGA, which I wouldn't call "physically burned into silicon." That's quite a stretch of language.

vultour 2 days ago | parent | prev | next [-]

Because if it’s not an LLM it’s not good for the current hype cycle. Calling everything AI makes the line go up.

danielbln 2 days ago | parent [-]

LLMs also make the cynicism go up among the HN crowd.

okamiueru 2 days ago | parent | next [-]

Hm. Is HN starting to become more skeptical of LLMs? For the past couple of years, HN has seemed worryingly enthusiastic about LLMs.

andersonpico 2 days ago | parent | prev [-]

How so? Half the people here have LLM delusion in every thread posted here; more than half of the things reaching the frontpage are AI. Just look at the hours when Americans are awake.

irishcoffee 2 days ago | parent [-]

Fucking Americans. Only 4% of the world population, yet somehow managing to disproportionately dominate the global news headlines which make their way here.

It’s impressive, honestly.

fnord77 2 days ago | parent | prev | next [-]

Thanks for tracking this down. I too am annoyed when so-called technical articles omit the actual techniques.

moffkalast a day ago | parent | prev | next [-]

Ah anomaly detection, that makes a lot more sense.

jgalt212 2 days ago | parent | prev [-]

Because it does not align with LLM Uber Alles.