protocolture 5 hours ago

I have always had concerns about physical robots making my work less safe in the real world.

But I had never considered that a programming language might be created that's less human-readable/auditable in order to enable LLMs.

Scares me a bit.

make3 3 hours ago | parent [-]

LLMs in their current form rely heavily on the vast amount of human data that's available; learning from it is the first step (the second step is RL).

We're not building a language for LLMs just yet.

jaggederest 2 hours ago | parent | next [-]

> We're not building a language for LLMs just yet.

Working on it, actually! I think it's a really interesting problem space - being efficient on tokens, readable by humans for review, strongly typed and static for reasoning purposes, and having extremely regular syntax. One of the biggest issues with symbols is that, to a human, matching parentheses is relatively easy, but the models struggle with it.
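The paren-matching point can be made concrete. A minimal sketch (my own illustration, not the commenter's language): a generator emitting `)` token by token must know the current nesting depth, which is global state it has to carry implicitly, whereas a depth counter makes that burden visible.

```python
def max_paren_depth(src: str) -> int:
    """Maximum nesting depth a token-by-token generator must track
    in order to emit the right number of closing parens."""
    depth = max_depth = 0
    for ch in src:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth

# A small Lisp-style definition already reaches depth 5; every ")" is
# ambiguous on its own, which is what models reportedly struggle with.
lisp_style = "(define (f x) (if (> x 0) (* x (f (- x 1))) 1))"
print(max_paren_depth(lisp_style))  # → 5
```

A syntax with distinct, keyword-style closers (e.g. `end`) makes each close locally unambiguous, which is one plausible reason regular syntax helps models.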

I expect a language like the one I'm playing with to mature enough over the next couple of years that models with a knowledge cutoff around 1/2027 will know how to program in it well enough for it to become viable.

One of the things I plan to do is build evals so that I can validate the performance of various models on my as-yet only partially baked language. I'm also using only LLMs to build out the entire infrastructure, mostly to see if it's possible.
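The eval idea above can be sketched with a tiny harness. Everything here is hypothetical: `generate` stands in for a model call, `interpret` for the language's interpreter, and the cases are toy data; the thread specifies none of these.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str    # task description given to the model
    expected: str  # expected output from running the generated program

def run_eval(cases, generate, interpret):
    """Score a model: generate(prompt) -> source, interpret(source) -> output.
    Returns the fraction of cases whose interpreted output matches."""
    passed = 0
    for case in cases:
        source = generate(case.prompt)
        try:
            result = interpret(source)
        except Exception:
            result = None  # crashes count as failures
        passed += (result == case.expected)
    return passed / len(cases)

# Toy demo with stub lambdas standing in for a real model and interpreter:
cases = [EvalCase("add 2 and 3", "5")]
score = run_eval(cases, generate=lambda p: "(+ 2 3)",
                 interpret=lambda s: "5")
print(score)  # → 1.0
```

The same harness, pointed at different models, gives the per-model comparison the commenter is after.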

quinnjh 2 hours ago | parent [-]

Do you expect the model to train on synthetic data, or do you expect to grow a userbase that will generate organic training data?

> One of the biggest issues with symbols is that, to a human, matching parentheses is relatively easy, but the models struggle with it.

Great point. I find it nearly trivial to close parens, but LLMs seem to struggle with the Lisps I've played with because of this counting issue, to the point where I've not been working with them as much. TypeScript and functional JS, as other commenters note, are usually smooth sailing.

jaggederest 2 hours ago | parent [-]

> do you expect the model to train on synthetic data or do you expect to grow a userbase that will generate organic training data?

Both, essentially. I expect the code examples to grow organically, but I expect most of them to come from LLMs; after all, that's the point of the language. I basically expect a step function in effectiveness once the language has been ingested by the models, but they're already reasonably decent at it right now.

The most fascinating thing to me about generating the whole thing has been that the LLMs are really, really good at iterating in a tight loop: updating the interpreter with new syntax, updating the stdlib to use that syntax, building some small extension to try it out, and then surfacing the need for a new builtin or primitive that starts the cycle over.
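That loop can be sketched abstractly. This is purely illustrative: `propose_builtin` is a hypothetical stand-in for the whole "update interpreter, rewrite stdlib, build an extension" round, and the toy primitives are made up.

```python
def evolve(spec, propose_builtin, max_rounds=10):
    """Grow a language spec round by round: each round, exercising the
    current spec either surfaces a missing primitive or reports None,
    at which point the loop stops."""
    for _ in range(max_rounds):
        gap = propose_builtin(spec)  # stand-in for one full iteration
        if gap is None:
            break
        spec = spec + [gap]          # add the primitive, cycle again
    return spec

# Toy demo: the "model" asks for map and fold, then is satisfied.
wants = iter(["map", "fold", None])
final = evolve(["lambda", "if"], lambda spec: next(wants))
print(final)  # → ['lambda', 'if', 'map', 'fold']
```

The interesting part in practice is that the LLM plays both roles: it implements each round and decides what the next gap is.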

I'm also leaning heavily on ChatGPT-5.2's insanely good math skills, and the language I'm building is very math-heavy: it's essentially a distant cousin of Idris or any of the other dependently typed theorem-proving languages.

energy123 2 hours ago | parent | prev [-]

It's worth asking why we haven't had the AlphaZero moment for general learning yet, where no human data is needed.