HarHarVeryFunny 5 days ago
It depends on the level of understanding, and on who you are talking about.

For the 99% of people outside of software development or machine learning, it is totally irrelevant, as are any details of the Transformer architecture or the mechanism by which a trained Transformer operates. For the man in the street, inclined to view "AI" as some kind of artificial brain or sentient thing, the best explanation is that it's basically just matching inputs to training samples and regurgitating continuations. Not totally accurate of course, but for that audience it is something they can understand, and it gives them some insight into what it is, how it works/fails, and that it is NOT some scary sentient computer thingy.

For anyone in the remaining 1% (or much less - people who actually understand ANNs and machine learning), learning about the Transformer architecture and how a trained Transformer works (induction heads etc) is what they need in order to understand what a (Transformer-based, vs LSTM-based) LLM is and how it works.

Knowing about the "math" of Transformers/ANNs is only relevant to people who are actually implementing them from the ground up, not even to those who might just want to build one using PyTorch or some other framework/library where the math has already been done for you.

Finally, embeddings aren't about math - they are about representation, which is certainly important to understanding how Transformers and other ANNs work, but still a different topic.

* US population of ~300M has ~1M software developers, a large fraction of whom are going to be doing things like web development and have only a marginal advantage over someone smart outside of development in terms of learning how ANNs/etc work.
gpjt 5 days ago
Post author here. I agree 100%! The post is the basic maths for people digging into how LLMs work under the hood -- I wrote a separate one for non-techies who just want to know what they are, at https://www.gilesthomas.com/2025/08/what-ai-chatbots-are-doi...
ants_everywhere 5 days ago
I agree that most people don't need to understand the mathematics or design of the transformer architecture, but "matching inputs to training samples and regurgitating continuations" isn't a good description of what LLMs do from a technical perspective. Someone with that mental model would be worse off than someone who had no mental model at all and just used an LLM as a black box.