| ▲ | kevin_thibedeau 4 days ago | |
The current usage model comingles commands and data. That doesn't have to be the case. Use an input format that explicitly presents them as separate components parsed into a data structure with non-LLM tooling. Or stick with natural language input but parse into an intermediate format that can be verified to some standard of correctness. | ||
| ▲ | jcgl 2 hours ago | parent [-] | |
I’m no expert, but as long as they’re represented by tokens in the end, they’re just tokens. Even if you train the transformer to treat them specially, a token is a token, and there’s no free lunch. At best, you’re going to be trading off between paying attention to this would-be security boundary and delivering high-quality results; the more you focus on one, the more you lose on the other. | ||