Remix clone Hacker News

	mentalgear 2 hours ago
		I think therein lies another fun benchmark to show that LLMs don't generalize: ask the LLM to solve the same logic riddle, only in different languages. If it can solve it in some languages but not in others, that's a strong argument for straightforward memorization and next-token prediction rather than true generalization.
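
		A rough sketch of how such a check could be wired up, in Python. Everything here is an assumption for illustration: `RIDDLE`, `fake_model`, and both function names are hypothetical, the translations are hand-written, and `fake_model` is a stand-in you would swap for a real LLM call. Exact string matching is also a simplification; real answers would need looser grading.

```python
from typing import Callable, Dict, Tuple

# Hypothetical test item: the same syllogism in three languages,
# each paired with the expected one-word answer in that language.
RIDDLE: Dict[str, Tuple[str, str]] = {
    "en": ("If all bloops are razzies and all razzies are lazzies, "
           "are all bloops lazzies? Answer yes or no.", "yes"),
    "de": ("Wenn alle Bloops Razzies sind und alle Razzies Lazzies sind, "
           "sind dann alle Bloops Lazzies? Antworte mit ja oder nein.", "ja"),
    "fr": ("Si tous les bloops sont des razzies et tous les razzies sont "
           "des lazzies, tous les bloops sont-ils des lazzies ? "
           "Réponds par oui ou non.", "oui"),
}


def cross_language_consistency(
    riddle: Dict[str, Tuple[str, str]],
    ask_model: Callable[[str], str],
) -> Dict[str, bool]:
    """Return, per language, whether the model's answer matched the expected one."""
    results = {}
    for lang, (prompt, expected) in riddle.items():
        answer = ask_model(prompt)
        # Naive grading: case-insensitive exact match after trimming whitespace.
        results[lang] = answer.strip().lower() == expected.strip().lower()
    return results


def is_language_dependent(results: Dict[str, bool]) -> bool:
    """True if the riddle was solved in some languages but not in others."""
    return len(set(results.values())) > 1


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): only 'understands' English prompts."""
    return "yes" if prompt.startswith("If ") else "I don't know"


if __name__ == "__main__":
    results = cross_language_consistency(RIDDLE, fake_model)
    print(results)
    print("language-dependent:", is_language_dependent(results))
```

		With the stub above, the riddle passes in English and fails in German and French, so it gets flagged as language-dependent. A single riddle proves little either way; you would want many items and several runs per language before reading anything into the split.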