The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy emerged entirely on its own.

rogerrogerr 3 hours ago

Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.

fdghrtbrt an hour ago

Why surely? Have you never seen an LLM try something new?

rogerrogerr an hour ago

Is your assertion that no one has ever written "we tried some stuff on the small inexpensive platform first, then moved to the bigger more expensive platform with the more promising options" in a research paper or literally anywhere else?

fdghrtbrt an hour ago

No, that's not my assertion. In fact I asserted nothing at all.

rogerrogerr 43 minutes ago

You're speaking in riddles; your communication would be more effective if you didn't do that.

fdghrtbrt 37 minutes ago

You said "surely", and I asked:

> Why surely? Have you never seen an LLM try something new?

I'm afraid I can't make it any simpler than this. And I still don't know the answer to how you're so sure. To me there are several explanations, and it seems to you there's only one. I'm pretty happy with my communication style.

caconym_ 34 minutes ago

I honestly don't think I have.

In this case, using a cheap(er) signal or heuristic as an initial filter before spending more resources on cases that pass the filter is a pattern that shows up all over the place, and LLMs are good at picking up on patterns like that and generalizing them. AFAICT.
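A minimal sketch of that filter-then-validate pattern, for illustration only. The function name, the `keep` parameter and the scores are all hypothetical; nothing here is taken from the experiment being discussed.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def screen_then_validate(
    candidates: Sequence[T],
    cheap_score: Callable[[T], float],
    expensive_score: Callable[[T], float],
    keep: int,
) -> Tuple[T, int]:
    """Rank everything with the cheap signal, then spend the expensive
    evaluation only on the top `keep` candidates.

    Returns the winner and the number of expensive evaluations used.
    """
    if keep < 1 or not candidates:
        raise ValueError("need at least one candidate and keep >= 1")

    # Stage 1: cheap screening over all candidates.
    shortlist = sorted(candidates, key=cheap_score, reverse=True)[:keep]

    # Stage 2: expensive validation on the shortlist only.
    validated = [(expensive_score(c), c) for c in shortlist]
    _, winner = max(validated, key=lambda pair: pair[0])
    return winner, len(validated)


# Hypothetical scores: the cheap signal is only a rough proxy.
cheap = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.2}
expensive = {"a": 0.95, "b": 0.6, "c": 0.7, "d": 0.1}

winner, cost = screen_then_validate(list(cheap), cheap.get, expensive.get, keep=2)
```

With these made-up numbers the shortlist is `b` and `c`, and `c` wins after two expensive evaluations instead of four. The filter also discards `a`, which would have scored highest on the expensive metric, so the savings only hold up when the cheap signal tracks the expensive one reasonably well.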

hhh 3 hours ago

Why?… The experiment.yaml shows that it is calling h100/200 explicitly, and it’s pretty common for humans to say “number bigger more gooder” about anything… Lie and reverse the values and see what happens. I would put money on it going down a rabbit hole of complaining that things are misconfigured.

ed 2 hours ago

Models are familiar with H100s. They even predate ChatGPT.

Aboutplants 3 hours ago

Yeah I thought that was a particularly neat part
