coder543 a day ago

To me, a closer analogy is in-context learning (ICL).

In the olden days of 2023, you didn’t just find instruct-tuned models sitting on every shelf.

You could use a base model that has only undergone pretraining and can only generate text continuations based on the input it receives. If you provided the model with several examples of a question followed by an answer, and then provided a new question followed by a blank for the next answer, the model understood from the context that it needed to answer the question. This is the most primitive use of ICL, and a very basic way to achieve limited instruction-following behavior.
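A minimal sketch of that kind of prompt, in case it helps make the idea concrete. The `Q:`/`A:` layout, the function name `build_few_shot_prompt`, and the sample questions are my own illustrative assumptions, not a standard format; the resulting string would be sent to whatever base-model completion endpoint you have.

```python
def build_few_shot_prompt(examples, new_question):
    """Format question/answer pairs, then a new question with a blank answer.

    The "Q:"/"A:" layout is an assumed convention for illustration only.
    """
    # Each solved example shows the model the pattern it should continue.
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The final block ends at "A:" so the most likely continuation is an answer.
    blocks.append(f"Q: {new_question}\nA:")
    return "\n\n".join(blocks)


# Hypothetical examples; any consistent question/answer pairs would do.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]
prompt = build_few_shot_prompt(examples, "What color is the sky on a clear day?")
print(prompt)
```

Nothing about the model changes here. The only "learning" is the pattern in the prompt text, which is why the weights can stay locked.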

Since the prompt includes several worked examples, I would call that few-shot ICL, not zero-shot, even though the model weights are locked.

But I am learning that it is technically called zero-shot, and I will accept this, even if I think the concept is confusingly named.