swaptr 5 days ago

AI-generated code can be useful in the early stages of a project, but it raises concerns in mature ones. Recently, a 280kloc+ Postgres parser was merged into Multigres (https://github.com/multigres/multigres/pull/109) with no public code review. In open source, this is worrying. Many people rely on these projects for learning and reference. Without proper review, AI-generated code weakens their value as teaching tools and, more importantly, the trust people place in them when pulling them in as dependencies. Code review isn’t just about bugs; it’s how contributors learn, understand design choices, and build shared knowledge. The issue isn’t the speed of building software (although corporations may seem to disagree), but how knowledge is passed on.

Edit: Reference to the time it took to open the PR: https://www.linkedin.com/posts/sougou_the-largest-multigres-...

sougou 5 days ago | parent [-]

I oversaw this work, and I'm open to feedback on how things can be improved. There are some factors that make this particular situation different:

This was an LLM-assisted translation of the C parser from Postgres, not something written from the ground up.

For work of this magnitude, you cannot review line by line. The only thing we could do was to establish a process to ensure correctness.

We did control the process carefully. It was a daily toil. This is why it took two months.

We've ported most of the tests from Postgres, enough to be confident that it works correctly.

Also, we are in the early stages for Multigres. We intend to do more bulk copies and bulk translations like this from other projects, especially Vitess. We'll incorporate any possible improvements here.

The author is working on a blog post explaining the entire process and its pitfalls. Please be on the lookout.

I was personally amazed at how much we could achieve using an LLM. Of course, this wouldn't have been possible without a certain level of skill. This person exceeds all expectations listed here: https://github.com/multigres/multigres/discussions/78.

wg002 4 days ago | parent [-]

"We intend to do more bulk copies and bulk translations like this from other projects"

Supabase’s playbook is to replicate existing products and open source projects, release them under open source, and monetize the adoption. They’ve repeated this approach across multiple offerings. With AI, the replication process becomes even faster, though it risks producing low-quality imitations that alienate the broader community, and people will resent having their work stolen.

kiwicopple 4 days ago | parent | next [-]

An alternative viewpoint which we are pretty open about in our docs:

> our technological choices are quite different; everything we use is open source; and wherever possible, we use and support existing tools rather than developing from scratch.

I understand that people get frustrated when there is any commercial interest associated with open source. But someone needs to fund open source efforts and we’re doing our best here. Some (perhaps non-obvious) examples:

* we employ the maintainers of PostgREST, contributing directly to the project - not some private fork

* we employ maintainers of Postgres, contributing patches directly

* we have purchased private companies, like OrioleDB, open sourced the code, and made the patents freely available to everyone

* we picked up unmaintained tools and maintained them at our own cost, like the Auth server, which we upstreamed until the previous owner/company stopped accepting contributions

* we worked with open source tools/standards like TUS to contribute missing functionality like Postgres support and advisory locks

* we have sponsored adjacent open source initiatives like adding types to Elixir

* we have given equity to framework creators, which I’m certain will be the largest donation that these creators have ever received (or will ever receive) for their open source work

* and yes, we employ the maintainers of Vitess to create a similar offering for the Postgres ecosystem under the same Apache2 license

kimixa 4 days ago | parent | prev [-]

And I'm not sure about their ability to release said code under a different license either.

Postgres has a pretty permissive license, but that doesn't mean you can just ignore it.