| ▲ | cflewis 4 hours ago | |
Yes, it can do. At Big Tech Company I Work At the LLM is quite happy to make raw API calls. If it thinks the data is big, then it'll write a Python tool to do it. The reason crafted backing CLIs are useful is you can guide the LLM towards stuff that is immediately useful rather than hoping the nondetermism can separate the wheat from the chaff. Take CI: is it interesting to know which tests passed? Maybe, but probably not. What is really interesting is what failed. Instead of having the LLM go out and talk directly to the CI system, write an intermediate CLI that filters out less actionable stuff by default, and have a flag that'll deliver the full dump if necessary. It's a skill to do this stuff, and it's a lot of hard won experience than something I think is easily teachable. You kind of have to feel out your model and how it "thinks" about solving problems. And then a new model version comes out and you have to learn it all again! | ||