juleiie | 2 hours ago
Honestly, if this were a research paper, it would be wholly insufficient to support any safety thesis. They even admit: "[...]our overall conclusion is that catastrophic risks remain low. This determination involves judgment calls. The model is demonstrating high levels of capability and saturates many of our most concrete, objectively-scored evaluations, leaving us with approaches that involve more fundamental uncertainty, such as examining trends in performance for acceleration (highly noisy and backward-looking) and collecting reports about model strengths and weaknesses from internal users (inherently subjective, and not necessarily reliable)." Is this not just an admission of defeat? After reading the paper I still don't know whether the model is safe; all I have are guesses, yet somehow "catastrophic risks remain low." And this is just an LLM, after all: very large, but with no persistent memory or continuous learning. Imagine an actual AI that improves itself every day from experience. It would be impossible to have the slightest clue about its safety; we wouldn't even get the nebulous statement we have here. Any such future architecture would essentially be Russian roulette, with the number of bullets decided by the initial alignment effort.