blueblisters 5 days ago |
Also, the pretrained LLM (the one trained to predict the next token of raw text) is not the one most people use. A lot of clever LLM post-training seems to steer the model toward becoming an excellent improv artist, which can lead to "surprise" if prompted well.