andsoitis a day ago

> While working on Cutlet, though, I allowed Claude to generate every single line of code. I didn’t even read any of the code. Instead, I built guardrails to make sure it worked correctly (more on that later).

Impressive. As a practical matter, one wonders what the point would be in creating a new programming language if the programmer no longer has to write or read code.

Programming languages are, after all, the interface that a human uses to give instructions to a computer. If you’re not writing or reading it, the language, by definition, doesn’t matter.

marssaxman a day ago | parent | next [-]

The constraints enforced in the language still matter. A language which offers certain correctness guarantees may still be the most efficient way to build a particular piece of software even when it's a machine writing the code.

There may actually be more value in creating specialized languages now, not less. Most new languages historically go nowhere because convincing human programmers to spend the time it would take to learn them is difficult, but every AI coding bot will learn your new language as a matter of course after its next update includes the contents of your website.

raincole a day ago | parent | next [-]

> every AI coding bot will learn your new language

If there are millions of lines on github in your language.

Otherwise the 'teaching the AI to write your language' part will occupy so much context that it becomes far less efficient than just using TypeScript.

Maxatar 21 hours ago | parent | next [-]

I have not found this to be the case. My company has some proprietary DSLs we use and we can provide the spec of the language with examples and it manages to pick up on it and use it in a very idiomatic manner. The total context needed is 41k tokens. That's not trivial but it's also not that much, especially with ChatGPT Codex and Gemini now providing context lengths of 1 million tokens. Claude Code is very likely to soon offer 1 million tokens as well and by this time next year I wouldn't be surprised if we reach context windows 2-4x that amount.

The vast majority of tokens are not used for documentation or reference material but rather are for reasoning/thinking. Unless you somehow design a programming language that is just so drastically different than anything that currently exists, you can safely bet that LLMs will pick them up with relative ease.

joshstrange 20 hours ago | parent [-]

> Claude Code is very likely to soon offer 1 million tokens as well

You can do it today if you are willing to pay (API or on top of your subscription) [0]

> The 1M context window is currently in beta. Features, pricing, and availability may change.

> Extended context is available for:

> API and pay-as-you-go users: full access to 1M context

> Pro, Max, Teams, and Enterprise subscribers: available with extra usage enabled

> Selecting a 1M model does not immediately change billing. Your session uses standard rates until it exceeds 200K tokens of context. Beyond 200K tokens, requests are charged at long-context pricing with dedicated rate limits. For subscribers, tokens beyond 200K are billed as extra usage rather than through the subscription.

[0] https://code.claude.com/docs/en/model-config#extended-contex...

rebolek 20 hours ago | parent | prev | next [-]

That’s not true. I’m working on a language and LLMs have no problem writing code in it, even though only ~200 lines of code exist in the language, all of them in my repo.

calvinmorrison a day ago | parent | prev [-]

Uh, not really. I am already having Claude read and then one-shot proprietary ERP code written in a vintage, closed-source, OOP-oriented BASIC with sparse documentation... I just needed to feed in the millions of lines of code I have, and it works.

jonfw 21 hours ago | parent | next [-]

I'm sure Claude does great at that, but it would be objectively better, for a large variety of reasons, if Claude didn't have to keep syntax examples in its context.

calvinmorrison 19 hours ago | parent [-]

For sure. About 6 months ago it absolutely couldn't do it and kept getting confused even when I tried to do RAG against the provided manuals (only downloadable from a shady .ru site, LOL), but now... like butter. The context now seems to mostly be spent on it reading and writing related stuff?

vrighter a day ago | parent | prev [-]

"i haven't been able to find much" != "there isn't much on the entire internet fed into them"

andsoitis 3 hours ago | parent | prev | next [-]

> The constraints enforced in the language still matter. A language which offers certain correctness guarantees may still be the most efficient way to build a particular piece of software even when it's a machine writing the code.

I think this is right. Strategically, do you have a mental model of the key elements such a new programming language should exhibit? I'm curious which existing programming languages might be best suited, or where the opportunity is to design something new that could throw away all the optimizations we've done for humans and instead optimize for AI programmers.

UncleOxidant a day ago | parent | prev | next [-]

> but every AI coding bot will learn your new language as a matter of course after its next update includes the contents of your website.

That's assuming that your new, very unknown language gets slurped up in the next training run, which seems unlikely. Couldn't you use RAG, or have an LLM read the docs for your language?

clickety_clack a day ago | parent | next [-]

Agreed - unpopular languages and packages have pretty shaky outcomes with code generation, even ones that have been around since before 2023.

almog a day ago | parent | prev [-]

Neither RAG nor loading the docs into the context window would produce any effective results. Not even including the grammar files and just a few examples in the training set would help. To get any usable results you still need many, many usage examples.

fcatalan 21 hours ago | parent | next [-]

My own 100% hallucinated language experiment is very, very weird, and it still has thousands of lines of generated examples that work fine. When doing complex stuff you could see the agent bounce against the tests here and there, but it never produced non-working code in the end. The only examples available were those it had generated itself as it made up the language. It was capable of making things like a JSON parser/encoder, a TODO webapp, or a command-line kanban tracker for itself in one shot.

marssaxman 21 hours ago | parent | prev [-]

And yet it works well enough, regardless. I have a little project which defines a new DSL. The only documentation or examples which exist for this little language, anywhere in the world, are on my laptop. There is certainly nothing in any AI's training data about it. And yet: codex has no trouble reading my repo, understanding how my DSL works, and generating code written in this novel language.

danielvaughn a day ago | parent | prev | next [-]

In addition, I think token efficiency will continue to be a problem. So you could imagine very terse programming languages that are roughly readable for a human, but optimized to be read by LLMs.

Insanity a day ago | parent | next [-]

That's an interesting idea. But IMO the real 'token saver' isn't in the language keywords but in the naming of things like variables, classes, etc.

There are languages that are already pretty sparse with keywords. E.g., in Go you can write 'func greet() string' with no need to declare that it's public, static, etc. So combining a less verbose language with 'codegolfing' the variables might be enough.

danielvaughn 21 hours ago | parent | next [-]

I'm not an expert in LLMs, but I don't think character length matters. Text is deterministically tokenized into byte sequences before being fed as context to the LLM, so in theory `mySuperLongVariableName` uses the same number of tokens as `a`. Happy to be corrected here.

fragmede 9 hours ago | parent [-]

Running it through https://platform.openai.com/tokenizer, "mySuperLongVariableName" takes 5 tokens and "a" takes 1. "mediumvarname" is 3, though. ("though" is 1.)

coderenegade 18 hours ago | parent | prev | next [-]

You're more likely to save tokens in the architecture than the language. A clean, extensible architecture will communicate intent more clearly, require fewer searches through the codebase, and take up less of the context window.

gf000 a day ago | parent | prev [-]

Go is one of the most verbose mainstream programming languages, so that's a pretty terrible example.

Insanity 21 hours ago | parent | next [-]

Maybe not a perfect example but it’s more lightweight than Java at least haha

gf000 21 hours ago | parent [-]

If by lightweight you mean less verbose, then absolutely not.

In Go, every third line is a noisy `if err != nil` check.

giancarlostoro 21 hours ago | parent | prev | next [-]

To you maybe, but Go is running a large amount of internet infrastructure today.

gf000 21 hours ago | parent [-]

How does that relate to Go being a verbose language?

giancarlostoro 21 hours ago | parent [-]

It's not verbose to some of us. It is explicit in what it does, meaning I don't have to wonder if there's syntactic sugar hiding intent. It's drastically more minimal than equivalent code in other languages.

gf000 20 hours ago | parent [-]

Verbosity is an objective metric.

Code readability is another, correlated one, but it is more subjective. To me Go scores pretty low here - code flow would be readable were it not for the huge amount of noise you get from error "handling" (it is mostly just syntactic ceremony, often failing to properly handle the error case, and people are so desensitized to these blocks that code reviews are more likely to miss them).

For function signatures, they made them terser - in my subjective opinion - at the expense of readability. There were two very mainstream schools of thought on type signature syntax, `type ident` and `ident : type`. Go opted for a third one that is unfamiliar to both camps, while not even having the benefits of the second syntax (e.g. easier type syntax; subjectively, that `:` helps the eye "pattern match" these expressions).
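For concreteness, here is the same trivial function in the three syntax families being discussed (the Rust code is real; the C-style and Go-style signatures are shown in comments for comparison):

```rust
// `ident : type` school (Pascal, ML, and here Rust): the colon separates
// name from type, and the return type follows the arrow.
fn scale(factor: f64, value: f64) -> f64 {
    factor * value
}

// `type ident` school (C, C++, Java), for comparison:
//     double scale(double factor, double value);
//
// Go's third option - type after the identifier, but no colon:
//     func scale(factor float64, value float64) float64
```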

giancarlostoro 20 hours ago | parent [-]

Every time I hear complaints about error handling, I wonder if people have next to no try/catch blocks, or if they just do magic to hide that detail away in other languages. Because I still have to do error handling in other languages roughly the same way. Am I missing something?

gf000 11 hours ago | parent | next [-]

Exceptions travel up the stack on their own. Given that most error cases can't be handled immediately and locally (otherwise they would be handled there and no error returned) but only higher up (e.g. a web server deciding to return an error code), exceptions save you a lot of boilerplate: you only have the throw at the source and the catch at the handler.

Meanwhile Go will have some boilerplate at every single level.

Errors as values can be made ergonomic: there is the FP-heavy monadic solution with `do` notation, or syntax like Rust's `?` operator (originally the `try!` macro). Go has none of these.
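As a minimal sketch of that difference (a made-up `read_port` helper, not anything from the thread): the first version spells out the propagation at every level, roughly the shape of Go's `if err != nil` blocks, while the second collapses it with `?`:

```rust
use std::fs;

// Manual propagation at every level - the errors-as-values equivalent
// of Go's repeated `if err != nil { return ..., err }` blocks.
fn read_port_verbose(path: &str) -> Result<u16, String> {
    let text = match fs::read_to_string(path) {
        Ok(t) => t,
        Err(e) => return Err(e.to_string()),
    };
    match text.trim().parse::<u16>() {
        Ok(p) => Ok(p),
        Err(e) => Err(e.to_string()),
    }
}

// Same logic with `?`: the error still travels as a value,
// but the per-level propagation boilerplate disappears.
fn read_port(path: &str) -> Result<u16, String> {
    let text = fs::read_to_string(path).map_err(|e| e.to_string())?;
    text.trim().parse::<u16>().map_err(|e| e.to_string())
}
```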

thunky 18 hours ago | parent | prev | next [-]

There's lots of non-Go code out there on the Internet if you ever decide you want to take a look.

politician 18 hours ago | parent | prev [-]

You’re not missing anything. I’ve worked with many developers who are clueless about error handling and treat it as a mostly optional side quest. It’s not surprising that such folks see the explicit error handling in Go as a grotesque interruption of the happy path.

jurgenburgen 9 hours ago | parent [-]

That’s a pretty defensive take.

You don’t have to hate Go to agree that Rust’s `?` operator is much nicer when all you want to do is propagate the error.

LtWorf a day ago | parent | prev [-]

Well, LLMs are made to be extremely verbose, so it's a good match!

nineteen999 21 hours ago | parent [-]

I think there's a huge range here - ChatGPT to me seems extra verbose on the web version, but when running with Codex it seems extra terse.

Claude seems more consistently _concise_ to me, both in web and cli versions. But who knows, after 12 months of stuff it could be me who is hallucinating...

idiotsecant a day ago | parent | prev [-]

I think I remember seeing research right here on HN that terse languages don't actually help all that much

thomasmg a day ago | parent [-]

I would be very interested in this research... I'm trying to write a language that is simple and concise like Python, but fast and statically typed. My gut feeling is that anything more concise than Python (J, K, or some code-golfing language) is bad for readability, but so is the verbosity of Rust, Zig, or Java.

rurban 5 hours ago | parent [-]

Then check out https://github.com/google/rune

imiric a day ago | parent | prev | next [-]

> every AI coding bot will learn your new language as a matter of course after its next update includes the contents of your website.

How will it "learn" anything if the only available training data is on a single website?

LLMs struggle to follow instructions even when their training set is massive. The idea that they will be able to produce working software from just a language spec and a few examples is delusional. It's a fundamental misunderstanding of how these tools work. They don't understand anything; they generate patterns based on probabilities and fine-tuning. Without massive amounts of data to skew the output towards a potentially correct result, they're not much more useful than a lookup table.

Zak a day ago | parent | next [-]

They don't understand anything, but they sure can repeat a pattern.

I'm using Claude Code to work on something involving a declarative UI DSL that wraps a very imperative API. Its first pass at adding a new component required imperative management of that component's state. Without that implementation in context, I told Claude the imperative pattern "sucks" and asked for an improvement just to see how far that would get me.

A human developer familiar with the codebase would easily understand the problem and add some basic state management to the DSL's support for that component. I won't pretend Claude understood, but it matched the pattern and generated the result I wanted.

This does suggest to me that a language spec and a handful of samples is enough to get it to produce useful results.

dmd a day ago | parent | prev [-]

It's wild to me the disconnect between people who actually use these tools every day and people who don't.

I have done exactly the above with great success. I work with a weird proprietary esolang sometimes that I like, and the only documentation - or code - that exists for it is on my computer. I load that documentation in, and it works just fine and writes pretty decent code in my esolang.

"But that can't possibly work [based on my misunderstanding of how LLMs work]!" you say.

Well, it does, so clearly you misunderstand how they work.

ModernMech 21 hours ago | parent | next [-]

The reason it works so well is that everyone’s “personal unique language” really isn’t all that different from what’s been proposed before, and any semantic differences are probably not novel. If you make your language C + transactional memory, the LLM probably has enough information about both to reason about your code without having to be trained on a billion lines.

Probably if you’re trying to be esoteric and arcane then yeah, you might have trouble, but that’s not normally how languages evolve.

dmd 21 hours ago | parent [-]

No, mine's an esoteric declarative data description/transform language. It's pretty damn weird.

wizzwizz4 20 hours ago | parent [-]

You may underestimate the weirdness of existing declarative data transformation languages. On a scale of 1 to 10, XSLT is about a 2 or 3.

dmd 20 hours ago | parent [-]

Mine's a weird, bad copy of Ab Initio's DML. https://www.google.com/search?q=ab+initio+dml+language

ModernMech 14 hours ago | parent [-]

When you say "weird" you mean "different from mainstream languages", but the exact way in which your language is weird (declarative data description/transformation) is probably exactly where languages will be going in the future because of how well-suited they are for LLM reading and writing. Those languages expose the structure of the computation directly such as data shapes and the relationships that transform them, rather than burying intent inside control flow.

With more explicit types and dataflow information, the model doesn't need to simulate execution (something LLMs are particularly bad at) as much as recognize and extend a transformation graph (something LLMs are particularly good at). So it's probably just that your particularly weird language is particularly well-adapted to LLM technology.
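A rough illustration of that point (the `Reading` record below is invented, not the parent's DSL): the declarative pipeline states the data shapes and the relationships between them directly, while the imperative version buries the same intent in control flow:

```rust
struct Reading {
    sensor: String,
    celsius: f64,
}

// Declarative: the transformation graph (filter -> map -> collect) is explicit,
// so a reader - or an LLM - can extend it without simulating execution.
fn hot_sensors(readings: &[Reading]) -> Vec<String> {
    readings
        .iter()
        .filter(|r| r.celsius > 30.0)   // which records flow through
        .map(|r| r.sensor.clone())      // what shape comes out
        .collect()
}

// Imperative equivalent: same result, but the intent is spread
// across mutation and control flow.
fn hot_sensors_imperative(readings: &[Reading]) -> Vec<String> {
    let mut out = Vec::new();
    for r in readings {
        if r.celsius > 30.0 {
            out.push(r.sensor.clone());
        }
    }
    out
}
```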

imiric 20 hours ago | parent | prev [-]

My comment is based precisely on using these tools frequently, if not daily, so what's wild is you assuming I don't.

The impact that lack of training data has on the quality of the results is easily observable. Try getting them to maintain a Python codebase vs. e.g. an Elixir one. Not just generate short snippets of code, but actually assist in maintaining it. You'll constantly run into basic issues like invalid syntax, missing references, use of nonexistent APIs, etc., not to mention more functional problems like dead, useless, or unnecessarily complicated code. I run into these things with mainstream languages (Go, Python, Clojure), so I don't see how an esolang could possibly fare any better.

But then again, the definitions of "just fine" and "decent" are subjective, and these tools are inherently unreliable, which is where I suspect the large disconnect in our experiences comes from.

quotemstr a day ago | parent | prev [-]

Those constraints can be enforced by a library too. Even humans sometimes make a whole new language for something that could have been a function library. If you want strong correctness guarantees, check the structure of the library calls.

Programming languages function in large part as inductive biases for humans. They expose certain domain symmetries and guide the programmer towards certain patterns. They do the same for LLMs, but with current AI tech, unless you're standing up your own RL pipeline, you're not going to get it to grok your new language as well as an existing one. Your chances are better asking it to understand a library.
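A sketch of what "constraints enforced by a library" can look like, using the typestate pattern (the `Disconnected`/`Connected` API here is invented for illustration): the library only exposes `query` on a value you can't obtain without a successful `connect`, so the ordering rule is checked by the host language's compiler rather than by a new language:

```rust
struct Disconnected;
struct Connected {
    session_id: u64,
}

impl Disconnected {
    // The only way to obtain a Connected value is through connect().
    fn connect(self, addr: &str) -> Result<Connected, String> {
        if addr.is_empty() {
            return Err("empty address".to_string());
        }
        Ok(Connected { session_id: 42 }) // placeholder session handshake
    }
}

impl Connected {
    // query() exists only on Connected, so calling it before
    // connecting is a type error, not a runtime check.
    fn query(&self, sql: &str) -> Vec<String> {
        vec![format!("result of '{sql}' on session {}", self.session_id)]
    }
}

fn main() {
    let conn = Disconnected.connect("db.example.com").expect("connect failed");
    println!("{:?}", conn.query("SELECT 1"));
    // Disconnected.query("SELECT 1"); // does not compile: no such method
}
```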

michaelbrave 7 hours ago | parent | prev | next [-]

A few months back I had a similar thought and started working on a language that was really verbose and human-readable - think COBOL with influences from Swift. The core idea was that this would be a business language that business people would/could read if they needed to, so it could be used for financial and similar use cases, with built-in logic engines similar to Prolog or Mercury. My idea was that once a language starts being coded by AI, there are two directions to go: either we max out efficiency and speed (basically let the AI code in assembly), or we lean the other way and optimize for human error-checking and clear outputs on how a process flows; my theory headed in that second direction. But of course I failed. I'd never made a programming language before (I've coded a long time, but that's not the same thing), and the AIs at the time, combined with my lack of knowledge, caused a spectacular failure. I still think my theory is correct, though, especially for financial or business logic: having the code be human-readable enough that even a non-technical person can check it for problems. I still see a future where that is useful.

voxleone a day ago | parent | prev | next [-]

In the 90s people hoped Unified Modeling Language diagrams would generate software automatically. That mostly didn’t happen. But large language models might actually be the realization of that old dream. Instead of formal diagrams, we describe the system in natural language and the model produces the code. It reminds me of the old debates around visual web tools vs hand-written HTML. There seems to be a recurring pattern: every step up the abstraction ladder creates tension between people who prefer the new layer and those who want to stay closer to the underlying mechanics.

Roughly: machine code --> assembly --> C --> high-level languages --> frameworks --> visual tools --> LLM-assisted coding. Most of those transitions were controversial at the time, but in retrospect they mostly expanded the toolbox rather than replacing the lower layers.

One workflow I’ve found useful with LLMs is to treat them more like a code generator after the design phase. I first define the constraints, objects, actors, and flows of the system, then use structured prompts to generate or refine pieces of the implementation.

abraxas a day ago | parent [-]

I agree with the sentiment but want to point out that the biggest driver behind UML was the enrichment of Rational Software and its founders. I doubt anyone ever succeeded in implementing anything useful with Rational Rose. But the Rational guys did have a phenomenal exit, and that's probably the biggest success story of UML.

I'm being slightly facetious of course, I still use sequence diagrams and find them useful. The rest of its legacy though, not so much.

spelunker a day ago | parent | prev | next [-]

Like everything generated by LLMs though, it is built on the shoulders of giants - what will happen to software if no one is creating new programming languages anymore? Does that matter?

Fnoord 18 hours ago | parent | next [-]

Without proper attribution, it seems fairer to say that copyright infringement occurred - on a massive scale, if I may add. The burden of proof lies with the owners of the LLM. Which is why, if you do not want a black box, you want the training data to be properly specified. That ain't happening now because of the skeletons in the closet.

idiotsecant a day ago | parent | prev [-]

I think the only hope is that AGI arises and picks up where humanity left off. Otherwise I think this is the long dark teatime of human engineering of all sorts.

tartoran 21 hours ago | parent [-]

So you’re hoping for a blackbox uninspectable by humans? That to me sounds like a nightmare, a nightmare worse than all the cruft and stupid rules humanity accrued over time. Let’s hope the future tech is inspectable and understandable by humans.

lelanthran 5 hours ago | parent | next [-]

> So you’re hoping for a blackbox uninspectable by humans?

We already have that. He's hoping that the blackbox gets smart enough to understand itself.

idiotsecant 19 hours ago | parent | prev [-]

I think if we assume that AGI will be a thing, the odds of future tech remaining inspectable by humans are pretty slim. Would you build a car so that your dog could maintain it?

tartoran 12 hours ago | parent [-]

Fully understandable end to end by any normal human and inspectable enough for human governance are different things. In any sane world, AGI would be built inside a human institutional environment: laws, audits, liability, safety engineering, access controls, operational constraints, etc. We do not build planes so passengers can reconstruct the turbine from scratch, but we still require them to be inspectable by the people responsible for certifying/repairing them. The right standard is not whether an average person can rebuild or fully understand the whole machine, but whether human institutions can reliably inspect, verify, and govern it. If they can’t, then the technology is not mature enough to trust.

_aavaa_ a day ago | parent | prev | next [-]

I don’t agree with the idea that programming languages don’t have an impact on an LLM’s ability to write code. If anything, I imagine that, all else being equal, a language where the compiler enforces multiple levels of correctness would help the AI get to a goal faster.

phn a day ago | parent | next [-]

A good example of this is Rust. Rust is memory-safe by default compared to, say, C, at the expense of you having to be deliberate about managing memory. With LLMs this equation changes significantly, because that harder/more verbose code is being written by the LLM, so it won't slow you down nearly as much. Even better, the LLM can interact with the compiler if something is not exactly as it should be.

On a different but related note, it's almost the same as pairing Django or Rails with an LLM. The framework allows you to trust that things like authentication and a passable code organization are being correctly handled.

munksbeer 15 hours ago | parent [-]

I was under the impression from Rust developers that it was one of the languages LLMs struggled with a bit more than others? My view could be (probably is) very outdated.

hombre_fatal 5 hours ago | parent [-]

If it were ever true, it's not anymore.

Rust is a nice choice even just for its amazing sum types and the ability to make impossible states unrepresentable at the type level.
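A small sketch of that idea (the `Payment` type below is invented for illustration): instead of a bag of optional fields that allows contradictory combinations, a sum type enumerates exactly the legal states:

```rust
// Anti-pattern: nothing prevents both fields being set, or neither.
#[allow(dead_code)]
struct PaymentLoose {
    card_number: Option<String>,
    iban: Option<String>,
}

// Sum type: each variant carries only the data valid for it,
// so the contradictory states above cannot even be constructed.
enum Payment {
    Card { number: String },
    BankTransfer { iban: String },
}

fn describe(p: &Payment) -> String {
    // The compiler forces every variant to be handled.
    match p {
        Payment::Card { number } => format!("card payment ({} digits)", number.len()),
        Payment::BankTransfer { iban } => format!("bank transfer from {iban}"),
    }
}

fn main() {
    let p = Payment::BankTransfer { iban: "DE89370400440532013000".to_string() };
    println!("{}", describe(&p));
}
```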

jetbalsa a day ago | parent | prev [-]

That is why TypeScript is the main one used by most people vibe coding. The LLMs do like to work around the type system sometimes, but strong typing and linting can help a ton.

onlyrealcuzzo a day ago | parent | prev | next [-]

> Impressive. As a practical matter, one wonders what the point would be in creating a new programming language if the programmer no longer has to write or read code.

I'm working on a language as well (hoping to debut by end of month), but the premise of the language is that it's designed like so:

1) It maximizes local reasoning and minimizes global complexity

2) It makes the vast majority of bugs / illegal states impossible to represent

3) It makes writing correct, concurrent code as maximally expressive as possible (where LLMs excel)

4) It maximizes optionality for performance increases (it's always just flipping option switches - mostly at the class and function input level, occasionally at the instruction level)

The idea is that it should be as easy as possible for an LLM to write (and especially to convert other languages to), and as easy as possible for you to understand, while being almost as fast as absolutely perfect C code - and, by virtue of the design of the language, at the human review phase you have minimal concern about hidden gotcha bugs.

idiotsecant a day ago | parent [-]

How does a programming language prevent the vast majority of bugs? I feel like we would all be using that language!

onlyrealcuzzo 21 hours ago | parent | next [-]

See Rust with use-after-free, fearless concurrency, etc.

My language is a step ahead of Rust, but not as strict as Ada, while being easier to read than Swift (especially where concurrency is involved).

rurban 5 hours ago | parent [-]

And whenever someone tells you to look up "fearless concurrency", replace it internally with "locking concurrency". Thanks to the Rust marketing department.

onlyrealcuzzo 3 hours ago | parent [-]

When you combine that with guaranteed lock elision when you're doing things as expected, it's not a problem.

Chaosvex 10 hours ago | parent | prev | next [-]

How? That's easy. You just need a huge dollop of hubris.

gf000 a day ago | parent | prev [-]

I agree with your questioning of whether it can prevent bugs, but your second point ("we would all be using that language") is quite likely false -- we developed a bunch of very useful abstractions in "research" languages 50 years ago, only to rediscover them today (no null, algebraic data types, pattern matching, etc.).

johnfn a day ago | parent | prev | next [-]

> If you’re not writing or reading it, the language, by definition, doesn’t matter.

By what definition? It still matters whether I write my app in Rust or, say, Python, because the Rust version still has better performance characteristics.

johnbender a day ago | parent | prev | next [-]

In principle (and, we hope, in practice) the person is still responsible for the consequences of running the code, so it remains important that they can read and understand what has been generated.

koolala a day ago | parent | prev | next [-]

It saves tokens. The main reason, though, is to manage performance by steering which techniques get used for specific use cases. In their case it seems to be about expressiveness in Bash.

andyfilms1 a day ago | parent | prev | next [-]

I've been wondering if a diffusion model could just generate software as a binary that could be loaded directly into memory.

entropie a day ago | parent [-]

Yeah, what could go wrong.

eatsyourtacos 21 hours ago | parent | prev [-]

I have been building a game via a separate game-logic library and Unity (which includes that independent library)... let's just say that over the last couple of weeks I have 100% lost the need to do the coding myself. I keep iterating and having it improve, and there are hundreds of unit tests. I have a Unity MCP and it does 95% of the Unity work for me. Of course the real game will need custom design and all that, but in terms of getting a complete prototype set up... I am literally no longer the coder. I just did in a week what would have taken me months and months to do. Granted, Unity is still somewhat new to me, but still... even if you are an expert, it can immediately look at all your game objects and detect issues, etc.

So yeah for some things we are already at the point of "I am not longer the coder, I am the architect".. and it's scary.

nineteen999 20 hours ago | parent [-]

100% the same experience with Claude and Unreal Engine 5 over here. And as the game moves from "less scaffolding" towards "more code", Claude is actually getting better at one-shotting things than it ever was - probably due to there being a lot more examples in the codebase of how to handle things under different scenarios (world compositing, multiplayer, etc.).