starchild3001 2 days ago

The distinction Karpathy draws between "growing animals" and "summoning ghosts" via RLVR is the mental model I didn't know I needed to explain the current state of jagged intelligence. It perfectly articulates why trust in benchmarks is collapsing; we aren't creating generally adaptive survivors, but rather over-optimizing specific pockets of the embedding space against verifiable rewards.

I’m also sold on his take on "vibe coding" leading to ephemeral software; the idea of spinning up a custom, one-off tokenizer or app just to debug a single issue, and then deleting it, feels like a real shift.
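(For concreteness, a minimal sketch of the kind of throwaway tool meant here, assuming OpenAI's tiktoken library; the encoding name and input string are just placeholders.)

    # throwaway_tokenize.py -- disposable script to see how a string
    # actually tokenizes; delete it once the bug is understood.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding
    text = "some user input that tokenizes strangely"
    for tok_id in enc.encode(text):
        # decode each token individually to spot odd splits
        print(tok_id, repr(enc.decode([tok_id])))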

HarHarVeryFunny 2 days ago | parent | next [-]

> The distinction Karpathy draws between "growing animals" and "summoning ghosts" via RLVR

I don't see these descriptions as very insightful.

The difference between general/animal intelligence and jagged/LLM intelligence is simply that humans/animals really ARE intelligent (the word was created to describe this human capability), while LLMs are just echoing narrow portions of the intelligent output of humans (those portions that are amenable to RLVR capture).

For an artificial intelligence to be intelligent in its own right, and therefore be generally intelligent, it would need, like an animal, to be embodied (even if only virtually), autonomous, predicting the outcomes of its own actions (not auto-regressively trained), learning incrementally and continually, built with innate traits like curiosity and boredom to put and keep itself in learning situations, etc.

Of course not all animals are generally intelligent: many (insects, fish, reptiles, many birds) just have narrow "hard-coded" instinctual behaviors, but others, like humans, are generalists whom evolution has therefore honed for adaptive lifetime learning and general intelligence.

naasking a day ago | parent [-]

> while LLMs are just echoing narrow portions of the intelligent output of humans

But they aren't just echoing, that's the point. You really need to stop ignoring the extrapolation abilities in these domains. The point of the jagged analogy is that they match or exceed human intelligence in specific areas in a way that is not just parroting.

HarHarVeryFunny 15 hours ago | parent [-]

It's tiresome in 2025 to keep having to use elaborate, long-winded descriptions of how LLMs work just to prove that one does understand them, rather than being able to assume that people generally understand and to use shorter descriptions.

Would "riffing" upset you less than "echoing"? Or an explicit "echoing statistics" rather than "echoing training samples"? Does "Mashups of statistical patterns" do it for you?

The jagged frontier of LLM capability is just a way of noting the fact that they act more like a collection of narrow intelligences than like a general intelligence, whose performance might be expected to be more even.

Of course LLMs are built and trained to generate based on language statistics, not to parrot individual samples, but given your objection it's amusing to note that some of the areas where LLMs do best, such as math and programming, are the ones where they have been RL-trained to override these more general language patterns and instead more closely follow the training data.

graemefawcett 2 days ago | parent | prev | next [-]

I've been doing it for months; it's lovely.

https://tech.lgbt/@graeme/115749759729642908

It's a stack based on finishing the job Jupyter started. Fences as functions, callable and composable.

Same shape as an MCP. No training required, just walk them through the patterns.

Literally, it's spatially organized. Turns out a woman named Mrs Curwen and I share some thoughts on pedagogy.

There does in fact exist a functor that maps 18th-century piano instruction to context engineering. We play with it.
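(Purely as an illustration of the "fences as functions" idea above, not graeme's actual stack: a sketch that pulls named python fences out of a markdown file and exposes them as composable callables. The file name, fence-naming scheme, and fence names are all assumed.)

    # fence_runner.py -- treat named ```python fences in a doc as functions
    import re

    FENCE = re.compile(r"```python name=(\w+)\n(.*?)```", re.DOTALL)

    def load_fences(markdown_text):
        # Compile each named fence into a callable that runs the fence
        # body against an env dict and returns the (mutated) env.
        fences = {}
        for name, body in FENCE.findall(markdown_text):
            code = compile(body, f"<fence:{name}>", "exec")
            fences[name] = lambda env, code=code: exec(code, env) or env
        return fences

    doc = open("notebook.md").read()  # hypothetical document
    fns = load_fences(doc)
    env = fns["setup"]({})            # assumed fence names; compose
    env = fns["transform"](env)       # setup -> transform, env threaded through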

fourside 2 days ago | parent | prev [-]

> I’m also sold on his take on "vibe coding" leading to ephemeral software; the idea of spinning up a custom, one-off tokenizer or app just to debug a single issue, and then deleting it, feels like a real shift.

We should keep in mind that our LLM use is currently subsidized. When the money dries up and we have to pay the real prices, I'll be interested to see whether we can still treat whipping up one-time apps as basically free.
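(Back-of-envelope only; every number below is an assumed placeholder, not a real price.)

    # cost_of_one_off.py -- rough cost of a disposable vibe-coded app
    price_per_mtok_out = 15.00   # assumed $ per 1M output tokens
    tokens_per_line    = 12      # rough average for generated code
    lines_generated    = 2_000   # app plus a few regeneration rounds

    cost = lines_generated * tokens_per_line / 1_000_000 * price_per_mtok_out
    print(f"~${cost:.2f} per throwaway app")  # ~$0.36 at these numbers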