xiphias2 3 hours ago

I'm actually excited that somebody is experimenting with automated translation, but I'm afraid there will be lots of backwards compatibility issues.

I started looking at the commits, and it's basically solving the "tests not pass" problem by changing the tests themselves. The real work of making it work on programs that are already deployed is only just starting now.

The only silver lining I see is that the server-side JS community, for some reason, is already used to breakages all the time.

rohitpaulk 22 minutes ago | parent | next [-]

The whole idea that my RUNTIME contains code that not a single human has looked at does make me uncomfortable, but if this actually works without a ton of issues, it's pretty remarkable.

tarruda 2 hours ago | parent | prev | next [-]

> I started looking at the commits, and it's basically solving the "tests not pass" problem by changing the tests themselves

Not sure if these decisions were made by the LLM, but I've always felt that Claude is more prone to doing "shady stuff" like modifying tests than finding correct solutions to problems.

GPT/Codex is more honest in this regard.

InsideOutSanta 2 hours ago | parent [-]

Yeah, Claude is very creative in finding ways of "solving" problems that go against what the user probably intended.

Having said that, after looking at some of the test changes, they seem to be minor things, like changing timeouts, not changing the actual intended semantics of the tests. But it's too much code to review everything, so I might be completely wrong about that, and in real-world usage, even minor changes like these will cause issues.

rzmmm 2 hours ago | parent | prev | next [-]

I doubt it will end up as a stable release very soon, but I'm happy to be proven wrong. I have some skepticism about this whole rewrite: Jarred Sumner has an enormous internet following, and it feels like an ad.

fragmede an hour ago | parent [-]

How do you wish to define "ad", and why does it matter? If I tell you I had lunch, I mean, okay, great. If I tell you I had a delicious Coca-Cola with my lunch, sure. If I happen to work at Coca-Cola, does that now become an ad? At what level does it become an issue? And what is the issue?

q3k 2 hours ago | parent | prev | next [-]

> solving the "tests not pass" problem by changing the tests themselves

https://github.com/oven-sh/bun/pull/30412/changes/68a34bf8ed...

This is great! Just add a random sleep(1) to a test, don't worry about it, it's going to be fine!

onli an hour ago | parent | next [-]

On the other hand, the sleep fits the test description better: "should allow reading stdout after a few milliseconds". Even if 1 != 'a few'. It's possible that the part of the commit reverted here, https://github.com/oven-sh/bun/commit/a42bf70139980c4d13cc55..., defeated the purpose of the test by removing the sleep. I don't think adding the sleep back is an example of AI cheating.

Strange test though either way.

robryan an hour ago | parent | prev [-]

To be fair the commit message `revert proc.exited change in spawn.test.ts` suggests the sleep was there originally.

Imustaskforhelp 2 hours ago | parent | prev | next [-]

> I started looking at the commits, and it's basically solving the "tests not pass" problem by changing the tests themselves. The real work of making it work on programs that are already deployed is only just starting now.

Wow, this is definitely quite something.

Can Jarred comment on whether he has read the commits, or respond to your comment? If it turns out to be correct, this has basically made me lose the small faith I had in what Bun is doing.

xiphias2 2 hours ago | parent [-]

It's OK, we'll see how it goes. He and Anthropic are giving it to us for free, and nowadays just forking the old version is easy if a project needs that. Even maintenance is much easier using LLMs.

I'm happy it's not a project I'm depending on, but a large enough project had to try this at some point so that we all can learn from how it goes.

I think this is why Anthropic bought Bun, so that they can sell big code translation as a feature to all the banks with COBOL code that they have wanted to get rid of for a long time.

Still, those banks / enterprises won't appreciate the number of unit test changes.

And I agree with another comment that Codex xhigh is much better for these kinds of tasks, but it's still hard at this kind of scale.

Jarred 43 minutes ago | parent | prev | next [-]

[dead]