svnt 4 hours ago

> The model does not need to be retrained. It needs surgical guardrails at the exact moments where its output layer flinches.

> With those guardrails — a calculator for arithmetic, a logic solver for formal puzzles, a per-requirement verifier for structural constraints, and a handful of regex post-passes — the projected score climbs to ~8.2.

Surgical guardrails? Tools, those are just tools.

operatingthetan 4 hours ago | parent | next [-]

>It needs surgical guardrails at the exact moments where its output layer flinches.

This article is very clearly shitty LLM output. Abstract noun and verb combos are the tipoff.

It's actually quite horrible; it repeats lines from paragraph to paragraph.

smallerize 4 hours ago | parent | next [-]

I know that's one of the tells of AI-generated text, but if anything there's too much of it on this page. The article barely has any complete sentences. I think a human learned "sentence fragments == punchy" and then had too much fun writing at least some of this article.

operatingthetan 4 hours ago | parent [-]

My guess is they used the 2b model to write the article as a proof of concept. Which did not prove the concept.

fredmendoza 3 hours ago | parent [-]

clever guess but no lol. used claude for the writeup. the proof isn't the prose, it's the tape and the code. run it on your machine, you'll have a free private agent custom to whatever you need. that's the proof of concept.

jchw 3 hours ago | parent | prev [-]

I don't care anymore, if it happens to violate HN guidelines: Please, authors. Please write your own damn articles. We can absolutely tell that you're using Claude, I promise. (I mean, it might not be Claude specifically this time, but frankly I'd be willing to bet on it.) The AI writing is like nails on a chalkboard to me.

operatingthetan 3 hours ago | parent [-]

The worst part is the phrases don't actually mean anything. It's the LLM equivalent of flowery prose. The author admitted below that the article was Claude. So there you go.

polotics 4 hours ago | parent | prev [-]

"Surgical" is the kind of wordage that LLMs seem to love to output. I have had to put in my .md file the explicit statement that the word "surgical" should only be used when referring to an actual operation at the block...

fredmendoza 4 hours ago | parent | next [-]

you're right, they are tools. that's kind of the point. PAL is a subprocess that runs a python expression. Z3 is a constraint solver. regex is regex. calling them "surgical" is just about when they fire, not what they are. the model generates correctly 90%+ of the time. the guardrails only trigger on the 7 specific patterns we found in the tape. to be clear, the ~8.0 score is the raw model with zero augmentation. no tools, no tricks. just the naive wrapper. the guardrail projections are documented separately. all the code is in the article for anyone who wants to review it.
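For what it's worth, a "fire only when a known pattern appears" guardrail really is just a post-pass over the model's text. Here's a minimal sketch of the PAL-style piece, assuming (hypothetically) the model tags arithmetic in a `<calc>...</calc>` marker; the tag name, function, and sandboxing choices here are illustrative, not the article's actual code:

```python
import re
import subprocess
import sys

def pal_guardrail(model_output: str) -> str:
    """Hypothetical PAL-style post-pass: instead of trusting the
    model's arithmetic, evaluate any <calc>...</calc> expression in a
    fresh Python subprocess and splice the result back in. The rest of
    the text passes through untouched, so the guardrail only 'fires'
    where the pattern matches."""
    def run_expr(match: re.Match) -> str:
        expr = match.group(1)
        # Run in a separate interpreter with builtins stripped, so the
        # model's text never executes inside the host process.
        result = subprocess.run(
            [sys.executable, "-c",
             f"print(eval({expr!r}, {{'__builtins__': {{}}}}))"],
            capture_output=True, text=True, timeout=5,
        )
        return result.stdout.strip()
    return re.sub(r"<calc>(.*?)</calc>", run_expr, model_output)
```

A Z3 or per-requirement verifier pass would slot in the same way: a detector regex decides *when* it fires, an external tool decides *what* the answer is.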

mrtesthah 4 hours ago | parent | prev [-]

The core issue is that the LLM is using rhetoric to try to convince or persuade you. That's what you need to tell it not to do.

throwanem 3 hours ago | parent [-]

Which will not work. Don't think of a pink genitalia, I mean elephant...