lolinder 10 days ago
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data. Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields: if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!

The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, sending hundreds of thousands of emails on public mailing lists, and so on. Nearly every other industry (with the possible exception of law) produces publicly visible output at a tiny fraction of the rate that we do.

Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as it does with software, so you can't extrapolate from Copilot to other domains.
steveBK123 10 days ago
Yes, this is EXACTLY it, and I was discussing this a bit at work (financial services). In software, we've all self-taught, improved, and posted Q&A all over the web, plus all the open source code out there. Just mountains and mountains of free training data.

However, software is unique in being both well paying and something with freely available, complete information online. A lot of the rest of the world remains far more closed, almost an apprenticeship system. In my domain that means things like company fundamental analysis, algo/quant trading, etc. There are lots of books you can buy from the likes of Dalio, but no real (good) step-by-step research and investment process information online. Likewise, I'd imagine heavily patented/regulated/IP-driven industries like chip design, drug design, etc. are similarly closed.

Maybe companies using an LLM on their own data internally could make something of it, but it's also quite likely there is no 'data' so much as tacit knowledge handed down over time.
rm445 9 days ago
Many other industries haven't yet been fully eaten by software. All kinds of data are locked away in proprietary formats, generated by humans without much automation. I don't think we know where exactly the frontiers are; once someone puts in the work to build large datasets and automate the creation of synthetic training data, whole industries could suddenly flip from 'impossible' to 'easy' for AI.
mountainriver 10 days ago
Yep, this is also the reason LLMs could probably work well for a lot more things if we did have the data.
unoti 10 days ago
> The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...

> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.

You are correct! There's lots of information available publicly about certain things, like code and writing SQL queries. Other specialized domains don't have the same kind of information trained into the heart of the model.

But importantly, this doesn't mean the LLM can't provide significant value in those more niche domains. It still can, and I deliver that value every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the specialized domain knowledge. The basic process is this:

1. Learn how the subject matter experts do the work.

2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory...)

3. Evaluate, iterate, improve.

4. Scale up to production.

In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do the work effectively, I can't guide the LLM through the steps. Consider a question like "what are the top 5 ways to improve my business": the subject matter experts often have difficulty teaching me how to answer that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely.
I know this is possible, and I expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.

But in many domains, I get access to subject matter experts who can tell me pretty specifically how to succeed in an area: these are the top 5 situations you will see, here's how to identify which situation type you're in, and here's what you should do once you know. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.

There's this thing about knowing a domain well enough to do the job, but not having enough mastery to teach others how to do it. You need domain experts who understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens. When we get AGI we can move past this limitation of needing to know how to do the job ourselves. Until then, this is how we provide impact using agents.

This is why I say that even if LLM technology does not improve at all beyond where it was a year ago, we still have many years' worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work, principally because they're too busy saying today's technology can't do it rather than trying to learn how.
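To make the "top 5 situations" recipe concrete, here's a minimal sketch of what that shape can look like. Everything in it is invented for illustration: the playbook entries and the keyword rules standing in for an LLM classification prompt are hypothetical. The point is just the structure the experts hand you: an enumeration of situation types, a way to identify which one you're in, and a procedure to follow for each.

```python
# Hypothetical sketch of an expert-authored playbook. In a real system the
# situation types and procedures come from subject matter experts, and
# classify() would be an LLM prompt rather than keyword rules.

PLAYBOOK = {
    "churn_risk": "Summarize recent complaints, then draft a retention offer.",
    "upsell": "List products the customer lacks, then rank them by fit.",
    "unknown": "Escalate to a human expert and log the case for review.",
}

def classify(case_notes: str) -> str:
    """Stand-in for an LLM classification step: map a case to one of the
    expert-defined situation types."""
    text = case_notes.lower()
    if "cancel" in text or "unhappy" in text:
        return "churn_risk"
    if "interested in" in text:
        return "upsell"
    return "unknown"

def run_recipe(case_notes: str) -> dict:
    """Identify the situation, then follow the expert-authored procedure
    for it (in practice, each procedure step is its own prompt)."""
    situation = classify(case_notes)
    return {"situation": situation, "procedure": PLAYBOOK[situation]}

result = run_recipe("Customer is unhappy and mentioned cancelling next month")
print(result["situation"])  # churn_risk
```

The value isn't in the code, which is trivial; it's in getting the experts to articulate the situation types and procedures in the first place, which is the step 1 bottleneck described above.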