These are technical details of computations that are performed as part of LLMs.

Completely pointless to anyone who is not writing the lowest level ML libraries (so basically everyone). This does not help anyone understand how LLMs actually work.

This is as if you started explaining how an ICE car works by diving into chemical properties of petrol. Yeah that really is the basis of it all, but no it is not where you start explaining how a car works.

▲

jasode 5 days ago | parent | next [-]

>This is as if you started explaining how an ICE car works by diving into chemical properties of petrol.

But wouldn't explaining the chemistry actually be acceptable if the title was "The chemistry you need to start understanding Internal Combustion Engines"?

That's analogous to what the author did. The title was "The maths ..." -- and then the body of the article fulfills the title by explaining the math relevant to LLMs.

It seems like you wish the author had written a different article that doesn't match the title.

▲

InCom-0 5 days ago | parent [-]

'The maths you need to start understanding LLMs'.

You don't need that math to start understanding LLMs. In fact, I'd argue it's harmful to start there unless your goal is to 'take me on an epic journey of all the things mankind needed to figure out to make LLMs work from the absolute basics'.

▲

5 days ago | parent [-]

[deleted]

▲

bryanrasmussen 5 days ago | parent | prev | next [-]

>Completely pointless to anyone who is not writing the lowest level ML libraries (so basically everyone). This does not help anyone understand how LLMs actually work.

Maybe this is the target group of people who would need particular "maths" to start understanding LLMs.

▲

antegamisou 5 days ago | parent | prev | next [-]

Find someone on HN that doesn't trivialize fundamental math yet encourages everyone to become a PyTorch monkey that ends up having no idea why their models are shite: impossible.

▲

49pctber 5 days ago | parent | prev | next [-]

Anyone who would like to run an LLM would need to perform their computations on hardware. So picking hardware that is good at matrix multiplication is important for them, even if they didn't develop their LLM from scratch. Knowing the basic math also explains some of the rush to purchase GPUs and TPUs in recent years.

All that is kind of missing the point though. I think people being curious and sharpening their mental models of technology is generally a good thing. If you didn't know an LLM was a bunch of linear algebra, you might have some distorted views of what it can or can't accomplish.
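To make the "bunch of linear algebra" point concrete, here is a minimal sketch of scaled dot-product attention (the core operation from 'Attention Is All You Need') in plain Python. The helper names and the toy inputs are made up for illustration; real libraries run the same matrix products on GPUs with batching, multiple heads, and learned projection weights, none of which is shown here.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])                             # key/query dimension
    k_t = [list(col) for col in zip(*k)]      # transpose K
    scores = matmul(q, k_t)                   # one score per (query, key) pair
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                 # weighted average of value rows

# Toy inputs (hypothetical): 2 tokens, dimension 2.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
```

Everything except the softmax is a matrix multiplication, which is roughly why hardware that is fast at matmul matters so much here.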

▲

InCom-0 5 days ago | parent [-]

Being curious is good ... nothing wrong with that. What I took issue with above is (what I see as) an attempt to derail people into low level math when that is not the crux of the question at all.

Also: nobody who wants to run LLMs will write their own matrix multiplications. Nobody doing ML / AI comes close to that stuff ... it's all abstracted and not something anyone actually thinks about (except the few people who actually write the underlying libraries, i.e. at Nvidia).

▲

antegamisou 5 days ago | parent [-]

> attempt to derail people into low level math when that is not the crux of the question at all.

Is the barrier to entry to the ML/AI field really that low? I think no one seasoned would consider fundamental linear algebra 'low level' math.

▲

InCom-0 5 days ago | parent [-]

What do you mean 'low'? :-)

The barrier to entry is probably epically high because to be actually useful you need to understand how to actually train a model in practice, how it is actually designed, how existing practices (i.e. at OpenAI or wherever) can be built upon further ... and you need to be cutting edge at all of those things. This is not taught anywhere, you can't read about it in some book. This has absolutely nothing to do with linear algebra ... or more accurately you don't get better at those things by understanding linear algebra (or any math) better than the next guy. It is not as if 'If I were better at math, I would have been a better AI researcher or programmer or whatever' :-). This is just not what these people do or how that process works. Even the foundational research that sparked rapid LLM development (the 'Attention Is All You Need' paper) is not some math heavy stuff. The whole thing is a conceptual idea that was tested and turned out to be spectacular.

▲

antegamisou 5 days ago | parent [-]

> 'If I were better at math, I would have been better AI researcher

This is the first time I've seen someone claim this. I don't know if it's a display of anti-intellectualism or plain ignorance. Otoh, most AI/ML papers' quality has deteriorated so much over the years that publications in different venues are essentially beautified PyTorch notebooks by people who just play around randomly with different parameters.

▲

InCom-0 4 days ago | parent [-]

Math is math. There is a very important place for it. But it is a very specific corner. LLMs don't come from math or from people who are 'mathematicians'. They use some parts of math (just like everyone everywhere). It appears to me some people have this special kind of naïveté about how foundational knowledge such as math actually gets used in practical applications. In practice, it just gets used (in software usually through some library) and never gets thought about again. They are not trying to invent new ways to do math, they are trying to invent AI :-).

▲

saagarjha 5 days ago | parent | prev | next [-]

If you're just piecing together a bunch of libraries, sure. But anyone who is adjacent to ML research should know how these work.

▲

InCom-0 5 days ago | parent [-]

Anyone actually physically doing ML research knows it ... but doesn't write the actual code for this stuff (or god forbid write some byzantine math notations somewhere), doesn't even think about this stuff except through X levels of higher level abstractions. Also, those people understand LLMs already :-).

▲

ivape 5 days ago | parent | prev [-]

Also, people need to accept that they’ve been doing regular ass programming for many years and can’t just jump into whatever they want. The idea that developers are well-rounded general engineers is a myth mostly propagated from within the bubble.

Most people’s educations right here probably didn’t even involve Linear Algebra (this is a bold claim, because the assumption is that everyone here is highly educated, no cap).