nateb2022 a day ago
> So if you get your target to record (say) 1 hour of audio, that's a one-shot.

No, that would still be zero-shot. Providing inference-time context (in this case, audio) is no different than giving a prompt to an LLM. Think of it as analogous to an AGENTS.md included in a prompt. You're not retraining the model, you're simply putting the rest of the prompt into context. If you actually stopped and fine-tuned the model weights on that single clip, that would be one-shot learning.
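A minimal sketch of that distinction, using a made-up VoiceModel stand-in (the class and its methods are hypothetical, not any real library's API):

    # Toy stand-in; names and methods are hypothetical, not a real API.
    class VoiceModel:
        def __init__(self):
            self.weights = {"speaker_prior": 0.0}  # pretend parameters

        def generate(self, text, reference_audio=None):
            # Inference-time conditioning: the reference clip is just extra
            # input, like context in an LLM prompt; weights stay untouched.
            return f"speech for {text!r} conditioned on {reference_audio!r}"

        def fine_tune(self, reference_audio):
            # Fine-tuning: the clip actually updates the model's weights.
            self.weights["speaker_prior"] += 1.0  # stand-in for a gradient step

    model = VoiceModel()

    # In-context use (zero-shot in the sense above): no weight update.
    print(model.generate("hello", reference_audio="target_1h.wav"))

    # One-shot learning in the sense above: weights change on one clip.
    model.fine_tune("target_1h.wav")
    print(model.weights)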
ImPostingOnHN 14 hours ago
> Providing inference-time context (in this case, audio) is no different than giving a prompt to an LLM.

Right... and you have 0-shot prompts ("give me a list of animals"), 1-shot prompts ("give me a list of animals, for example: a cat"), 2-shot prompts ("give me a list of animals, for example: a cat; a dog"), etc. The "shot" count refers to how many examples are provided to the LLM in the prompt, and has nothing to do with training or tuning, in every context I've ever seen.
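A tiny sketch of that shot-counting, using the same example prompts (the build_prompt helper is purely illustrative):

    def build_prompt(task, examples):
        # N-shot prompt: the task plus N examples inside the prompt itself;
        # no training or tuning happens anywhere.
        if not examples:
            return task  # zero-shot: no examples at all
        return task + ", for example: " + "; ".join(examples)

    print(build_prompt("give me a list of animals", []))                  # 0-shot
    print(build_prompt("give me a list of animals", ["a cat"]))           # 1-shot
    print(build_prompt("give me a list of animals", ["a cat", "a dog"]))  # 2-shot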