sowbug 4 days ago

Waaay off topic, but does anyone know why LLMs don't have poor grammar, given that they were trained on the average (often poor) grammar of the internet? Why don't they mix up then/than or it's/its, or use hypercorrections like "from you and I"?

(Update: of course I had to ask my friendly neighborhood LLM, and the answer is that correct usages still dominate incorrect ones, so the statistics favor correctness. Training pipelines also down-weight low-quality sources (comments like mine) and up-weight high-quality ones (published books, reputable news sites). Then reinforcement learning from human feedback adds further polish.)
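
A quick toy illustration of that first point: if correct forms simply outnumber incorrect ones in the corpus, a model that learns next-token frequencies will prefer them. (Purely illustrative; the counts are made up and real tokenization is far messier.)

```python
from collections import Counter

# Tiny stand-in for a web corpus: mostly correct usage, some errors.
corpus = (
    ["better than nothing"] * 90 + ["better then nothing"] * 10 +
    ["its own merits"] * 85 + ["it's own merits"] * 15
)

counts = Counter(corpus)
for phrase, n in counts.most_common():
    print(f"{phrase!r}: {n}")

# A frequency-driven model picking the most likely continuation lands
# on the correct form in both pairs, because correct usage dominates.
```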

omneity 4 days ago | parent | next [-]

Two phenomena are at play: correct spellings tend to be the most common in aggregate in a large enough dataset, so there's a bias toward them, and the finetuning step (instruct SFT) helps the model home in on which of all the possible formulations it saw in pretraining it should actually use.

This is why LLMs can still channel typos or non-standard writing when you ask them to write in such a style, for example.
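
A hand-wavy way to picture that: pretraining leaves the model with a distribution over every variant it has seen, and SFT sharpens that distribution toward the polished forms without deleting the rest, which is why a style prompt can still surface them. (A toy sketch with made-up numbers, not how real logits work.)

```python
# Variant probabilities a model might absorb from pretraining (made up).
pretrained = {"it's better than": 0.85, "its better then": 0.15}

def normalize(d):
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

# Instruct SFT sharpens the distribution toward the polished variant...
sft = normalize({k: v ** 2 for k, v in pretrained.items()})
print(sft)  # correct form now ~97%

# ...but the sloppy variant is still representable, so conditioning on a
# "write with typos" instruction can steer mass back toward it.
sloppy = normalize({k: 1 - v for k, v in pretrained.items()})
print(sloppy)  # typo form now ~85%
```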

tuetuopay 4 days ago | parent | prev [-]

I would also expect a grammar phase to be part of training: an RL pass where the model's output is fed to a grammar-checking engine and scored on it.
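
If someone wired that up, the reward could be as simple as penalizing grammar-checker hits. A rough sketch, assuming the language_tool_python wrapper around LanguageTool (the RL loop itself, PPO or otherwise, is omitted):

```python
import language_tool_python

# Needs a local LanguageTool; the wrapper downloads one on first use.
tool = language_tool_python.LanguageTool("en-US")

def grammar_reward(text: str) -> float:
    """Negative count of grammar/style issues flagged by LanguageTool.

    An RL pass could add this to the usual preference-model reward,
    nudging the policy toward grammatically clean output.
    """
    return -float(len(tool.check(text)))

print(grammar_reward("Its better then nothing."))   # negative, e.g. -2
print(grammar_reward("It's better than nothing."))  # 0
```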