| ▲ | dragonwriter an hour ago | |
> This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material. No, its even more fundamental than that: the entire goal of broad reasoning over input data makes it impossible to have a sharp instruction/data division. The structured input that every modern chat-focussed model expects makes it very clear that they can be trained to distinguish different kinds of input, and some of those patterns now include different priority levels of instruction. | ||