Remix clone Hacker News

	ph4rsikal 2 days ago
		It might appear so, but you could validate it with a simple test. If the LLM played a 4x4 Tic Tac Toe game, would the agent select the winning move 100% of the time, or block a losing move 100% of the time? If these systems were capable of proper reasoning, they would find the right choice in these obvious but constantly changing scenarios without being specifically trained for it. [1] https://jdsemrau.substack.com/p/nemotron-vs-qwen-game-theory...
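		For anyone who wants to try the test described above, here is a rough sketch of the checker side in Python. It is not taken from the linked post. It assumes that four in a row (row, column, or main diagonal) wins, which the comment does not specify, and every name in it (`completing_moves`, `acceptable_moves`, `is_correct`) is made up for illustration. Given a position, it computes the moves that win immediately or, failing that, the moves that block an immediate loss, so an agent's reply can be scored against them.

```python
from typing import List, Optional, Set, Tuple

N = 4  # board size; ASSUMPTION: N in a row wins (not specified in the comment)
Board = List[List[Optional[str]]]  # each cell is "X", "O", or None for empty
Move = Tuple[int, int]             # (row, column)


def lines() -> List[List[Move]]:
    """All rows, columns, and the two main diagonals of the N x N board."""
    rows = [[(r, c) for c in range(N)] for r in range(N)]
    cols = [[(r, c) for r in range(N)] for c in range(N)]
    diags = [[(i, i) for i in range(N)], [(i, N - 1 - i) for i in range(N)]]
    return rows + cols + diags


def completing_moves(board: Board, player: str) -> Set[Move]:
    """Empty cells that would complete a full line for `player`."""
    moves: Set[Move] = set()
    for line in lines():
        marks = [board[r][c] for r, c in line]
        if marks.count(player) == N - 1 and marks.count(None) == 1:
            moves.add(line[marks.index(None)])
    return moves


def acceptable_moves(board: Board, player: str, opponent: str) -> Set[Move]:
    """Winning moves if any exist, otherwise blocking moves (possibly empty)."""
    wins = completing_moves(board, player)
    if wins:
        return wins
    return completing_moves(board, opponent)


def is_correct(board: Board, player: str, opponent: str, move: Move) -> bool:
    """True if `move` is acceptable, or if the position forces nothing."""
    expected = acceptable_moves(board, player, opponent)
    return not expected or move in expected


# Example position: X has three in the top row, O has three in the second row.
board: Board = [
    ["X", "X", "X", None],
    ["O", "O", "O", None],
    [None, None, None, None],
    [None, None, None, None],
]
```

		A harness around this would generate many such positions, ask the model for a move, and report the fraction of replies for which `is_correct` returns true. How the model is prompted and how its reply is parsed is left out here, since that depends on the setup.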