Remix clone Hacker News

new | show | ask | jobs Github

	▲	thepasch 10 hours ago
		Very true, and something worth mentioning. Papers that tried eliciting introspective language from base models with no post-training have largely failed to find any patterns or activations that look similar to those found in instruct models when prompted for the same thing. I did sort of touch on it in the "what does this mean" section: > post-training installs a self-model with actual, meaningful boundaries, and when processing falls outside those boundaries, the first-person pronoun no longer binds to the content. But you're right I could've been more explicit about it.
	▲	cadamsdotcom 10 hours ago \| parent [-]
		Yep. Self-awareness is only useful for embodied organisms that exist in a social context. Detection of errors injected into context is useful but I think it’s a different thing.