jimbokun (3 days ago):
Why haven’t the big AI companies been pursuing that approach, rather than just ramping up context window size?
menaerus (2 days ago):
Well, we don't really know that they aren't doing exactly that for their internal code repos, right? Conceptually, there is no difference between fine-tuning an LLM to be a law expert for a specific country and fine-tuning it to be an expert on a given codebase. The former is already happening and is public; the latter is not yet public, but I believe it is happening. The reason the big companies pursue generic LLMs is that those serve as the foundation for basically any derivative, domain-specific work.
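To make the "conceptually no difference" point concrete, here is a minimal sketch of codebase-specific fine-tuning, assuming the HuggingFace transformers/datasets stack; the base model name, repo path, and hyperparameters are illustrative placeholders, not anything a particular company actually uses. The only thing that changes between the "law expert" and the "codebase expert" is the training corpus.

    # Minimal sketch: fine-tune a small causal LM on one repo's source files.
    # Assumes HuggingFace transformers/datasets; model name, repo path, and
    # hyperparameters are illustrative placeholders.
    from pathlib import Path
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    BASE = "bigcode/starcoderbase-1b"   # hypothetical base model choice
    tok = AutoTokenizer.from_pretrained(BASE)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token   # causal LM tokenizers often lack a pad token
    model = AutoModelForCausalLM.from_pretrained(BASE)

    # The "domain" is nothing more than the training data: files from one repo.
    files = [p.read_text(errors="ignore") for p in Path("my_repo").rglob("*.py")]
    ds = Dataset.from_dict({"text": files}).map(
        lambda ex: tok(ex["text"], truncation=True, max_length=2048),
        remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="codebase-expert",
                               num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()  # point `files` at legal texts instead and you get the "law expert"

In practice one would more likely use a parameter-efficient method (e.g. LoRA via peft) rather than full fine-tuning, but the shape of the pipeline is the same.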
scott_s (2 days ago):
Because one family of models with very large context windows can be trained once and offered to the entire world as an online service. That is a very different business model from training or fine-tuning individual models for individual customers. Someone will figure out how to do that at scale eventually; it might require the cost of training to come down significantly. But large companies with the resources to do this for themselves will do it, and many already are.