moffkalast 6 hours ago
That's what Meta thought initially too, training Code Llama and chat Llama separately, and then they realized they're idiots and that adding the other half of the data vastly improves both models. As long as it's quality data, more of it doesn't do harm. Besides, programming is far from just knowing how to autocomplete syntax: you need a model that's proficient in the field the automation is deployed in, otherwise it'll be no help in actually automating it.
theshrike79 3 hours ago
But as far as I know, that was way before tool calling was a thing. I'm more bullish on small and medium-sized models plus efficient tool calling than I am on LLMs too large to run at home without $20k of hardware. The model doesn't need full knowledge of everything built into it when it has the toolset to fetch, cache and read any information available.
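
For illustration, here is a rough sketch of the "fetch, cache and read" idea. Everything in it is an assumption made for the example: `CachedFetchTool`, `dispatch`, and the JSON call shape are hypothetical names, not any particular framework's API, and the fetcher is a stand-in so the code runs offline.

```python
import json
from typing import Callable, Dict


class CachedFetchTool:
    """Hypothetical 'fetch' tool: retrieves a document once, then serves it from cache."""

    def __init__(self, fetcher: Callable[[str], str]) -> None:
        self._fetcher = fetcher          # would be an HTTP GET in a real setup (assumption)
        self._cache: Dict[str, str] = {}
        self.misses = 0                  # how many times the fetcher was actually called

    def __call__(self, url: str) -> str:
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._fetcher(url)
        return self._cache[url]


def dispatch(tool_call_json: str, tools: Dict[str, Callable[..., str]]) -> str:
    """Run one tool call emitted by a model as JSON: {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = tools.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']!r}"
    return tool(**call["arguments"])


def fake_fetcher(url: str) -> str:
    # Stand-in for the network so the sketch runs offline (assumption).
    return f"contents of {url}"


fetch = CachedFetchTool(fake_fetcher)
tools = {"fetch": fetch}
request = '{"name": "fetch", "arguments": {"url": "https://example.com/docs"}}'
first = dispatch(request, tools)
second = dispatch(request, tools)   # same URL again: answered from the cache
```

The point of the sketch is only that the knowledge lives behind the tool rather than in the model's weights: a repeated request for the same URL is answered from the cache without calling the fetcher again.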