nl 2 days ago

If you haven't tried these things since Sonnet 4.5 came out then it's time to give them another try.

Sonnet 4.5 and especially Codex 5.1 have completely changed the way I build software.

> The fact that tools like Replit also include their own hosting environments is definitely neat, but not really what I was getting at as far as deployment. What I had in mind was managing arbitrary cloud platforms, setting up an optimal architecture for your anticipated scale and usage patterns — whether that's a single Hetzner instance with SQLite or horizontally scaled app servers behind an API gateway with Kafka, Valkey, and Spanner or ScyllaDB — and doing all the DevOps to handle that along with things like CI/CD.

I think this is all possible now. But I don't think it'd work first time because there are so many environmental issues (service auth etc) that can go wrong. Maybe it'd be ok if you gave it a root AWS account...

buu700 2 days ago

Just in case it was unclear, I extensively use AI and agentic coding with current models on a daily basis. The only thing I haven't tried in a few months is specifically one-shotting a greenfield project.

I know computer-use agents exist, and theoretically have tooling and permission to do all the things a human sitting in front of a computer can. I just haven't heard of anyone claiming to have successfully had one do exactly what I described for a non-toy project in one shot with zero mistakes, or of any tool like Replit claiming to support such a capability.

I'd be very interested to know if my impression is out of date. As in, if I could send a single message to some AI service and say "Here's my credit card, banking info, and entity info/EIN; build me a production-ready Google Drive clone with religious branding and 10x higher pricing called God Drive with native Android/iOS/Linux/macOS/Windows apps, then deploy it to production on an optimal cloud architecture capable of scaling to a billion users at whatever domain name you like best and release the apps to all major app stores/repositories", then go to bed with high confidence that I'd be able to start creating God Drive docs/spreadsheets/presentations for work the following morning.

If that isn't the case, it isn't a criticism of the technology. The fact that we're even seriously discussing the scenario is incredible.

colechristensen 2 days ago

Well... they're not oracles and never will be. The things I'm creating are following recognizable development practices. It's not build once and done; it's an elaborate design/build/test cycle that happens in many flavors, because unless you've already done something and are copying it, that's how you create, and language models aren't going to get away from that.

buu700 2 days ago

Whether or not it will one day get there is anyone's guess, but it sounds like we agree that it at least isn't currently there. I brought up that goalpost to illustrate why more efficient models will only improve the aggregate volume and/or quality of output for the foreseeable future, as opposed to creating a glut of supply that destroys the economics of data centers.