▲ | orderone_ai 10 hours ago |
Thank you for the question! I would say that ease of use and deployment is actually a good reason to have a single model. We don't train 20 LLMs for different purposes - we train one (or, I guess, 3-4 in practice, each with its own broad specialization) and then prompt it for different tasks. This simplifies deployment, integration, upgrading, etc. This model works the same way: instead of being restricted to single-task classification, it lets a user handle a new task with a new prompt, not a new model.
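Roughly, the "one model, many prompts" idea looks like the sketch below - a minimal illustration, where names like run_task, EchoModel, and generate are hypothetical stand-ins, not our actual API:

    # Sketch of "one model, many prompts" vs. one model per task.
    # The model interface here is a hypothetical stand-in, not this project's API.

    TASK_PROMPTS = {
        "sentiment": "Classify the sentiment of the following text as positive or negative:\n{text}",
        "language":  "Identify the language of the following text:\n{text}",
        "topic":     "Name the main topic of the following text:\n{text}",
    }

    class EchoModel:
        # Stand-in for a real general-purpose model; replace with an actual client.
        def generate(self, prompt: str) -> str:
            return f"[model output for: {prompt[:40]}...]"

    def run_task(model, task: str, text: str) -> str:
        # Adding a new task only needs a new entry in TASK_PROMPTS, not new weights,
        # so deployment ships a single model for every task.
        prompt = TASK_PROMPTS[task].format(text=text)
        return model.generate(prompt)

    if __name__ == "__main__":
        model = EchoModel()
        print(run_task(model, "sentiment", "I really enjoyed this."))

The point of the sketch is just the deployment story: upgrading means swapping one model, and new tasks are a prompt change rather than a new training run.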
▲ | throwawayffffas 10 hours ago | parent |
While I agree with the general reasoning, isn't it harder for the user to prompt the model correctly than to select a specialized model for the task? That's the feeling I get when I try to use LLMs for more general language processing. Have you run into cases where the model "forgets" the task at hand and switches to another mid-stream? Regardless of all of the above, it looks to me like your choice of reasoning and problem solving in the latent space is a great one, and where we should be collectively focusing our efforts. Keep up the good work.