munchler a day ago

> A model that aces benchmarks but doesn't understand human intent is just less capable. Virtually every task we give an LLM is steeped in human values, culture, and assumptions. Miss those, and you're not maximally useful. And if it's not maximally useful, it's by definition not AGI.

This ignores the risk of an unaligned model. Such a model is perhaps less useful to humans, but could still be extremely capable. Imagine an alien super-intelligence that doesn’t care about human preferences.

tomalbrc a day ago | parent

Except that it is not remotely alien but completely and utterly human, having been trained on human data.

munchler a day ago | parent | next

Fine, then imagine a super-intelligence trained on human data that doesn’t care about human preferences. Very capable of destroying us.

pixl97 21 hours ago | parent | prev

> but completely and utterly human, being trained on human data.

For now. As AIs become more agentic and capable of generating their own data, we can quickly end up with drift from human values. If models that drift from human values produce profits for their creators, you can expect the drift to continue.