fcatalan 4 hours ago

I let Gemini, Claude Code and Codex hallucinate the language they wanted for a few days. I prompted with "design the language you'd like to program in" and kept prompting "go ahead". I only rescued them from a couple of too-deep rabbit holes or asked for some particular examples to stress it a bit.

It's a weird-ass Forth-like but with a strong type system, contracts, native testing, fuzz testing, and a constraint solver for integer math backed by Z3. The interpreter is implemented in Elixir.

In about 150 commits, everything it has done has worked without runtime errors, both the Elixir interpreter and the examples in the hallucinated language, some of them non-trivial for a week-old language (JSON parser, DB-backed TODO web app).

It's a deranged experiment, but on the other hand it seems to confirm that "compile"-time analysis plus extensive testing facilities do help LLM agents a lot, even for a weird language that they have to write just from in-context reference.

Don't click if you value your sanity. The only human-generated thing there is the About blurb:

https://github.com/cairnlang/Cairn
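
For anyone wondering what "compile"-time analysis can look like in a stack language, here is a minimal sketch in Python. It is not Cairn's actual checker; the word names and type signatures below are made up for illustration. The idea is to run the program over a stack of types instead of values, so underflow and type mismatches show up before anything executes.

```python
# Hypothetical word table: name -> (input types popped, output types pushed).
# These signatures are illustrative and are not taken from Cairn.
SIGNATURES = {
    "dup_int": (["int"], ["int", "int"]),
    "add": (["int", "int"], ["int"]),
    "to_str": (["int"], ["str"]),
    "concat": (["str", "str"], ["str"]),
}


def check(program):
    """Simulate the stack with types only; return the final type stack.

    Raises TypeError on underflow or mismatch, before anything runs.
    """
    stack = []
    for pos, token in enumerate(program):
        # Integer literals push an int type.
        if isinstance(token, int):
            stack.append("int")
            continue
        # Tokens starting with a double quote are treated as string literals.
        if isinstance(token, str) and token.startswith('"'):
            stack.append("str")
            continue
        if token not in SIGNATURES:
            raise TypeError(f"word {pos}: unknown word {token!r}")
        inputs, outputs = SIGNATURES[token]
        if len(stack) < len(inputs):
            raise TypeError(f"word {pos}: {token!r} underflows the stack")
        start = len(stack) - len(inputs)
        actual = stack[start:]
        if actual != inputs:
            raise TypeError(f"word {pos}: {token!r} expects {inputs}, got {actual}")
        # Replace the consumed input types with the produced output types.
        del stack[start:]
        stack.extend(outputs)
    return stack
```

With this table, `check([1, 2, "add", "to_str"])` returns `["str"]`, while `check([1, "add"])` raises a `TypeError` for stack underflow. An agent gets that kind of feedback without having to run the program.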

gf000 2 hours ago | parent | next [-]

Interesting project, but I believe the base assumption is already slightly wrong. Why do we assume that LLMs know what kind of language would benefit them? This information is not knowable without doing proper research, and even if there is some research like that, it would have to be a part of the training data. Otherwise it's just hallucination.

fcatalan 2 hours ago | parent [-]

I agree, it's mostly a silly whim taken too far. Too much time on my hands.

In particular, the whole stack-based thing looks questionable.

In fact the very first answer by Gemini proposed an APL-like encoding of the primitives for token saving, but when I started the implementation Claude Code pushed back on that, saying it would need to keep some sane semantics around the keywords to be able to understand the programs.

The very strict verification story seems more plausible and tracks with the rest of the comments here.

What has surprised me is that the language works at all. Adding todo items to a web app written in a week-old language felt a bit eerie.

ntonozzi 3 hours ago | parent | prev | next [-]

Wow that is wild, that is exactly along the lines of my fantasy language. It'd be so easy to go into the deep end building tooling and improving a language like this.

fcatalan 3 hours ago | parent [-]

I have had to check myself a bit; it's too easy to fall too deep into what is essentially a practical joke.

zozbot234 2 hours ago | parent | prev | next [-]

This is actually quite impressive, especially as AI vibe-coded slop. How easy is the language to learn for novice coders, compared to other FORTH lookalikes?

fcatalan 2 hours ago | parent [-]

There's a lot of language for such a short time, but if you have programmed any Forth it should be easy to pick up. Have a look at some of the top-level examples.

I have programmed about 3 Forth implementations by hand throughout the years for fun, but I have never been able to really program in it, because the stack wrangling confuses me enormously.

So for me anything vaguely complex is unreadable, but apparently not for the LLMs, which I find surprising. When I have interrogated them they say they like the lack of syntax more than the stack ops hamper them, but it might be just a hallucinated impression.

When they write Cairn I sometimes see stack-related error messages scroll by, but they always correct them quickly before they stop.

adregan 4 hours ago | parent | prev [-]

Have you asked them to compile it to BEAM bytecode directly?

fcatalan 4 hours ago | parent [-]

It has been on the roadmap since they invented the thing. I fear it won't work but then they probably will do it in 10 minutes...