| ▲ | DoctorOetker 6 hours ago |
| EDIT: please change the article link to the most recent version (as of now still v2); it is currently pointing to the v1 version, which is missing the figures. I'm still reading this, but if this checks out, this is one of the most significant discoveries in years. Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (via gradient descent) your data or wave functions to the proper computational EML tree? Got a multidimensional, multivariate function to model (with random samples or a full map)? Just do gradient descent and convert it to approximant EML trees. Perform gradient descent on the EML function tree "phi" so that the derivatives in the Schroedinger equation match. But as I said, still reading; this sounds too good to be true, but I have witnessed such things before :) |
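A minimal sketch of the fitting step being described, under stated assumptions: the parametric form a*exp(b*x) below is a stand-in for illustration only, not the paper's actual EML tree construction, and the gradients are written out by hand.

```python
import math

# Gradient-descent curve fitting, as the comment describes, but on a
# stand-in parametric form f(x) = a * exp(b * x) rather than an actual
# EML expression tree (which the paper constructs differently).
xs = [i / 20.0 for i in range(21)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # synthetic target data

a, b, lr = 1.0, 0.0, 0.05
for _ in range(20000):
    preds = [a * math.exp(b * x) for x in xs]
    # mean-squared-error gradients w.r.t. a and b
    da = 2.0 * sum((p - y) * math.exp(b * x)
                   for p, y, x in zip(preds, ys, xs)) / len(xs)
    db = 2.0 * sum((p - y) * a * x * math.exp(b * x)
                   for p, y, x in zip(preds, ys, xs)) / len(xs)
    a -= lr * da
    b -= lr * db

loss = sum((a * math.exp(b * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Because the target is realizable by the model, plain gradient descent recovers the generating parameters; swapping in a richer expression tree changes only the forward pass and gradients.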
|
| ▲ | ikrima 5 hours ago | parent | next [-] |
From my experience working in this problem domain for the last year, I'd say it is pretty powerful, but the "too good to be true" part comes from the fact that EML buys elegance through exponential expression blow-up. Multiplication alone requires depth-8 trees with 41+ leaves, i.e. a minimal operator vocabulary trades off against expression length. There's likely an information-theoretic sweet spot between these extremes. It's interesting to see his EML approach, whereas mine was more about generating a context-sensitive homoiconic grammar. I've had lots of success combining spectral neural nets (GNNs, FNOs, Neural Tangent Kernels) with symbolic regression, using Operad Theory and Category Theory as my guiding mathematical machinery. |
| |
| ▲ | DoctorOetker 3 hours ago | parent | next [-] | | In my experience this exponential expression blow-up is less a result of decomposing into a minimum of primitives, and more a result of repetition in expression trees. To make an analogy with Bertrand Russell's Principia Mathematica: he derived fully expanded expressions, i.e. trees whose leaves may only refer to the same literals, and everyone claimed this madness underscored how formal verification of natural mathematics was a fool's errand. Nevertheless we see successful projects like metamath (us.metamath.org) where this exponential blow-up does not occur. It is easy to see why: instead of representing proofs as full trees, the proofs are represented as DAGs. The same optimization would be required for EML to prevent exponential blow-up. Put differently: we can allow extra buttons besides {1, EML}; for example, to capture unary functions the authors mentally add an 'x' button, so now the RPN calculator has {1, EML, x}; and if you want multivariate functions it becomes an RPN calculator with extra buttons, e.g. {1, EML, x, y, z}. But why stop there? In metamath, proofs are compressed: if an expression or wff was proven before in the same proof, its first subproof is given a number, and any subsequent invocation of this Nth subproof refers to that number. Why only recall input parameters x, y, z but not recall earlier computed values/functions? In fact, every proof in metamath's set.mm that uses this DAG compressibility could be split into a main proof plus the repeatedly used substatements converted automatically into explicitly separate lemma proofs, in which case metamath could dispose of the single-proof DAG compression (but it would force proofs to split up into lemmas + main proof). None of the proven theorems in metamath's set.mm displays the feared exponential blowup. | |
| ▲ | gopalv 5 hours ago | parent | prev | next [-] | | > Multiplication alone requires depth-8 trees with 41+ leaves i.e. minimal operator vocabulary trades off against expression length. That is sort of comparable to how NAND gates simplify scaling. Division is hell on gates. The single component was the reason scaling went the way it did. There was only one gate structure which had to improve to make chips smaller - if a chip used 3 different kinds, then scaling would've required more than one parallel innovation (sort of like how LED lighting had to wait for blue). If you need two or more components, then you have to keep switching tools instead of hammer, hammer, hammer. | | |
| ▲ | tripletao 4 hours ago | parent | next [-] | | I'm not sure what you mean by this. It's true that any Boolean operation can be expressed in terms of two-input NAND gates, but that's almost never how real IC designers work. A typical standard cell library has lots of primitives, including all common gates and up to entire flip-flops and RAMs, each individually optimized at the transistor level. Realization with NAND2 and nothing else would be possible, but much less efficient. Efficient numerical libraries likewise contain lots of redundancy. For example, sqrt(x) is mathematically equivalent to pow(x, 0.5), but sqrt(x) is still typically provided separately and is faster. Anyone who thinks the eml() function is supposed to lead directly to more efficient computation has missed the point of this (interesting) work. | |
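The sqrt/pow redundancy is easy to see from Python. A hedged sketch: the equivalence check below is exact mathematics, while the performance claim in the comments is a typical-platform generalization, not asserted.

```python
import math

x = 2.0
# Mathematically the same function...
assert math.isclose(math.sqrt(x), x ** 0.5, rel_tol=1e-12)

# ...but libraries still ship sqrt separately: on common hardware it
# maps to a dedicated instruction (e.g. sqrtsd on x86-64), while a
# general pow goes through an exp/log-style routine. Same spirit as a
# standard cell library keeping an optimized gate rather than
# synthesizing everything from NAND2.
```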
| ▲ | meindnoch 2 hours ago | parent | prev [-] | | Are you under the impression that CPUs are made exclusively from NAND gates? You can't be serious. | | |
| ▲ | peyton 2 hours ago | parent [-] | | Might’ve gotten mixed up with CMOS dominance, or I’m ignorant. | | |
| ▲ | giik an hour ago | parent [-] | | I believe you're not ignorant. But many folks probably lack the process knowledge (CMOS) required to understand why :-) |
|
|
| |
| ▲ | canjobear 5 hours ago | parent | prev [-] | | Link to your work? |
|
|
| ▲ | gilgoomesh 6 hours ago | parent | prev | next [-] |
| > Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree? Same reason all boolean logic isn't performed with combinations of NAND – it's computationally inefficient. Polynomials are (for their expressivity) very quick to compute. |
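As a concrete instance of "quick to compute": a degree-n polynomial evaluates with one multiply and one add per coefficient via Horner's rule, sketched here.

```python
def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... + coeffs[n]*x**n
    using Horner's rule: one multiply and one add per coefficient."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

# horner([1, 2, 3], 2.0) -> 3*4 + 2*2 + 1 = 17.0
```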
| |
| ▲ | brucehoult 4 hours ago | parent | next [-] | | The Cray 1 was built 100% from NOR gates and SRAM. | |
| ▲ | Nevermark 5 hours ago | parent | prev [-] | | They are all built from transistors though. Transistors form an efficient, single-element, universal digital basis. And they are a much less arbitrary choice than NAND vs. NOR vs. XOR, etc. Using transistors as conceptual digital logic primitives, where power dissipation isn't a thing, Pass Logic is "The Way". | | |
| ▲ | samus 3 hours ago | parent | next [-] | | Single transistors aren't yet logic gates by themselves; they are amplifiers with a very specific gain function that makes it possible to use them as switches. Logic gates usually consist of at least two transistors. See https://en.wikipedia.org/wiki/CMOS for an example of how it is done in CMOS technology. | | |
| ▲ | adrian_b 2 hours ago | parent | next [-] | | That is correct. Moreover, the amplifying function must exist at least in some gates, to restore the logic levels, but there is no need for it to exist in all gates. For instance, any logic circuit can be implemented using AND gates and/or OR gates made with diodes, which have no gain, i.e. no amplifying function, together with 1-transistor inverters, which provide both the NOT function and the gain needed to restore the logic levels. The logic functions such as NAND can be expressed in several ways using simpler components, which correspond directly with the possible hardware implementations. Nowadays, the most frequent method of logic implementation is by using parallel and series connections of switches, for which the MOS transistors are well suited. Another way to express the logic functions is by using the minimum and maximum functions, which correspond directly with diode-based circuits. All the logic functions can also be expressed using the two operations of the binary finite field, addition (a.k.a. XOR) and multiplication (a.k.a. AND). This does not lead to simpler hardware implementations, but it can simplify some theoretical work, by using algebraic results. Actually, this way of expressing logic is the one that should properly have been named Boolean logic, as the contribution of George Boole was precisely replacing "false" and "true" with "0" and "1" and reinterpreting the classic logic operations as arithmetic operations. It is very weird to see in some programming languages a data type named Boolean, whose possible values are "false" and "true", instead of the historically correct Boolean values, "0" and "1", which can also be much more useful in programming than "false" and "true", by simplifying expressions, especially in array operations (which is why array-oriented languages like APL use "0" and "1", not "false" and "true"). | |
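The GF(2) formulation is quick to sketch: with AND as multiplication and XOR as addition, the other gates fall out as small polynomials. A toy illustration, not tied to any hardware; the g_* names are made up here.

```python
def g_and(a, b):  return a & b             # multiplication in GF(2)
def g_xor(a, b):  return a ^ b             # addition in GF(2)
def g_not(a):     return g_xor(a, 1)       # NOT a  =  a + 1
def g_or(a, b):   return g_xor(g_xor(a, b), g_and(a, b))  # a + b + a*b
def g_nand(a, b): return g_not(g_and(a, b))               # 1 + a*b

# Verify the derived gates against their truth tables.
bits = [(a, b) for a in (0, 1) for b in (0, 1)]
assert [g_nand(a, b) for a, b in bits] == [1, 1, 1, 0]
assert [g_or(a, b) for a, b in bits] == [0, 1, 1, 1]
```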
| ▲ | Nevermark an hour ago | parent | prev [-] | | "Pass transistors" are transistors being operated in pass/impedance switch mode. Pass logic. Digital. [0] This is extremely basic digital circuit design. You can create digital circuits as compositions of gates. But you can often implement the same logic, with fewer transistors, using pass logic. Pass logic is also great for asynchronous digital design. [0] https://en.wikipedia.org/wiki/Pass_transistor_logic |
| |
| ▲ | throw-qqqqq an hour ago | parent | prev [-] | | > Transistors form an efficient, single element, universal digital basis But transistors can be N- or P-channel, so it's not a single logical primitive like, e.g., NAND gates. |
|
|
|
| ▲ | canjobear 5 hours ago | parent | prev | next [-] |
| > Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree? Because the EML basis makes simple functions (like +) hard to express. Not to diminish this very cool discovery! |
| |
| ▲ | DoctorOetker 2 hours ago | parent [-] | | Consider how a wavefunction might be stored for numerically searching the ground state of a molecule or a crystal lattice. Humans appreciate 2D imagery that is about 1 kilopixel x 1 kilopixel; expanding this to 3 dimensions means the wavefunction would have to store 10^9 complex numbers (easily 4 GB at 16-bit precision for the real and imaginary components, so 4 bytes per complex number). Do we really believe that a DAG variant of the EML construction would consume more storage to represent the analytically correct solution? Do we really believe that a 4 GB DAG variant of EML would produce a less accurate representation (i.e. less fidelity with the Schroedinger equation)? If the ground-state hydrogen atom is any indication, my money is on EML-style constructions, not naive 3D arrays modelling the wavefunction by brute force. This also re-opens a lot of "party pooper" results in mathematics: the impossibility of representing solutions to the general quintic (fine print: if we restrict ourselves to arithmetic and roots/radicals). In mathematics and physics there have been a lot of "party pooper" results which later found more profound and interesting positive results by properly rephrasing the question. A negative result for a myopic question isn't very informative on its own. |
|
|
| ▲ | eru 6 hours ago | parent | prev | next [-] |
| > I'm still reading this, but if this checks out, this is one of the most significant discoveries in years. It seems like a neat parlour trick, indeed. But significant discovery? |
|
| ▲ | 5 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | dang 4 hours ago | parent | prev [-] |
| What URL should we change it to? |
| |
| ▲ | lioeters 4 hours ago | parent | next [-] | | The URL is already pointing at v2, which is apparently the newest one, as requested by the comment above. > Submitted on 23 Mar 2026 (v1), last revised 4 Apr 2026 (this version, v2) | |
| ▲ | dang 4 hours ago | parent [-] | | Thanks! That's what it looked like to me too. | |
| ▲ | DoctorOetker 2 hours ago | parent [-] | | I agree: I don't understand what happened, but the first "View PDF" resulted in a PDF where the hyperlinks to the figures didn't work. Upon closer inspection it wasn't v1 at all; that's a PNAS article. I am unable to remove the EDIT:... line in my original comment at this time... |
|
| |
| ▲ | fc417fc802 2 hours ago | parent | prev [-] | | IMO arxiv links should pretty much always be to the abstract (ie .../abs/...) as opposed to particular versions of the html or pdf. |
|