nl 15 hours ago

Have you tried Lovable, Replit, V0, etc.?

Aside from purchasing the domain and building native apps for you, they cover a very significant amount of this.

If you insist on native apps, it's possible Google Jules could do it. With Gemini 2.5 it wasn't strong enough, but I think it has Gemini 3 now, which can definitely do native apps just fine.

buu700 8 hours ago | parent [-]

Thanks for the recommendations. Regarding your other comment, Flutter is what I've landed on as well for my next cross-platform app project, and I'm currently in the middle of developing a spec for a fairly complex agentic system that I'm going to try having Codex two-shot (basic project setup + file stubs + exhaustive tests -> manual checkpoint -> TDD the rest).

I haven't tried Lovable, V0, or Jules, but I really like Replit for certain things. Having said that, based on my experience, I would characterize it as an amazing tool for rapid frontend iteration with prototype-level backend creation. I'm sure it's gotten better at one-shotting since I tried Agent 2 with Sonnet 3.7 in May, but I would still be very (pleasantly) surprised to see that Agent 3 with current models could meet the incredibly high bar of wholly replacing a human dev team.

The fact that tools like Replit also include their own hosting environments is definitely neat, but not really what I was getting at as far as deployment. What I had in mind was managing arbitrary cloud platforms, setting up an optimal architecture for your anticipated scale and usage patterns — whether that's a single Hetzner instance with SQLite or horizontally scaled app servers behind an API gateway with Kafka, Valkey, and Spanner or ScyllaDB — and doing all the DevOps to handle that along with things like CI/CD.

I'm not downplaying how amazing these capabilities are. Being able to generate high-quality code from natural language feels like magic. But all the parts beyond narrow application code are half of the thing I described:

* I'm saying you should be able to send a single off-the-cuff drunk text to an AI and later find a complete production-ready SaaS startup that fully aligns with a reasonable interpretation of your message.

* The other half of the whole thing is >=human-level execution. If the AI can't autonomously deliver work comparable to what an experienced CTO would (given the same requirements, an arbitrarily large hiring budget, and a stipulation to never contact you again until the work was done), it's not there yet.

Again, none of this is to dunk on agentic coding. My point is that I set an absurdly high bar because I want it to one day be met. Just as a $100 storage budget today is equivalent to $100m a few decades ago, I want to live to see a $100 engineering budget reach equivalency with last decade's $100m.

nl 5 hours ago | parent [-]

If you haven't tried these things since Sonnet 4.5 came out then it's time to give them another try.

Sonnet 4.5 and especially Codex 5.1 have completely changed the way I build software.

> The fact that tools like Replit also include their own hosting environments is definitely neat, but not really what I was getting at as far as deployment. What I had in mind was managing arbitrary cloud platforms, setting up an optimal architecture for your anticipated scale and usage patterns — whether that's a single Hetzner instance with SQLite or horizontally scaled app servers behind an API gateway with Kafka, Valkey, and Spanner or ScyllaDB — and doing all the DevOps to handle that along with things like CI/CD.

I think this is all possible now. But I don't think it'd work first time because there are so many environmental issues (service auth etc.) that can go wrong. Maybe it'd be OK if you gave it a root AWS account...

buu700 4 hours ago | parent [-]

Just in case it was unclear, I extensively use AI and agentic coding with current models on a daily basis. The only thing I haven't tried in a few months is specifically one-shotting a greenfield project.

I know computer-use agents exist, and theoretically have the tooling and permission to do all the things a human sitting in front of a computer can. I just haven't heard of anyone claiming to have successfully had one do exactly what I described for a non-toy project in one shot with zero mistakes, or of any tool like Replit claiming to support such a capability.

I'd be very interested to know if my impression is out of date. As in, if I could send a single message to some AI service and say "Here's my credit card, banking info, and entity info/EIN; build me a production-ready Google Drive clone with religious branding and 10x higher pricing called God Drive with native Android/iOS/Linux/macOS/Windows apps, then deploy it to production on an optimal cloud architecture capable of scaling to a billion users at whatever domain name you like best and release the apps to all major app stores/repositories", then go to bed with high confidence that I'd be able to start creating God Drive docs/spreadsheets/presentations for work the following morning.

If that isn't the case, it isn't a criticism of the technology. The fact that we're even seriously discussing the scenario is incredible.

colechristensen 3 hours ago | parent [-]

Well... they're not oracles and never will be. The things I'm creating follow recognizable development practices. It's not build-once-and-done; it's an elaborate design/build/test cycle that happens in many flavors, because unless you've already done something and are copying it, that's how you create, and language models aren't going to get away from that.

buu700 2 hours ago | parent [-]

Whether or not it will one day get there is anyone's guess, but it sounds like we agree that it at least isn't currently there. I brought up that goalpost to illustrate why more efficient models will only improve the aggregate volume and/or quality of output for the foreseeable future, as opposed to creating a glut of supply that destroys the economics of data centers.