ChrisMarshallNY 6 hours ago

> AI has gotten so good that despite any misgivings, “everyone is using A.I.”

In my experience, it's a mixed bag. I wrote this comment[0], yesterday. It reflects my current work, and how I am integrating an LLM.

I have used it for two parts of my project:

1) The backend (PHP), and

2) The frontend (Swift)

It has been a huge help, in both, but #2 is a cautionary tale. It really needs adult supervision, in developing native UIKit Swift apps. I'm realizing how truly bad the code it wrote was. I mean, terrible.

That's jarring, because it did a great job with #1. It made sound, reasonable design decisions, and provided code that is better than what I would write.

With #2, it behaved exactly like an inexperienced engineer, panicking, when confronted with real-world problems. My rewrite is going to feature a much simpler, sound approach.

All that said, it has been a net positive, and has increased my productivity by a large margin.

I guess the lesson I needed to get from this, is that it is good at helping me to find problems, but maybe not so good at fixing them.

[0] https://news.ycombinator.com/item?id=48515217

fireflash38 a minute ago | parent | next [-]

I feel like AI fails the XY problem constantly. And it's one thing that I know people hated so much on Stack Overflow.

ablob 5 hours ago | parent | prev | next [-]

I'd like to add that there is almost no way of "running away" from it. If I search for anything on the internet I am almost guaranteed to be handed pages and pages of AI generated content. In lieu of that I found that directly prompting for an answer tends to yield better results nowadays. Not because it's good per se, but because having control over the prompt beats having little to no control over it through search by proxy.

It saddens me to see that high quality content is drowned in this sea of garbage to the point of being almost impossible to find.

gombosg 3 hours ago | parent [-]

I think this is where the circle closes with the "dead internet theory"... you go to Reddit, and see bots commenting on posts created by bots.

Then you go on to search for something, find only results that are clearly AI generated pages, and come to the conclusion that directly prompting some LLM is better than reading an AI slop page that's output by the same AI for a slightly less specific prompt.

My concern is that this will only get worse over time - which is great for companies selling AI tokens and bad for society and whoever wants to interact with other humans over the internet.

junon 5 hours ago | parent | prev | next [-]

This would be expected. The corner cases people faced with PHP throughout the decades have been well documented on the internet for eons.

Swift, not so much. It's relatively new. Thinking of AI's abilities as an engineer's career span, scaled by about 10-20x in time, makes this make a bit more sense.

It's going to be worse at newer/niche things, intuitively - which is only going to get worse as it "learns" from garbage outputted by other LLMs moving forward.

ChrisMarshallNY 5 hours ago | parent | next [-]

Also, I suspect most "production" Swift –the type of stuff written by seasoned experts– (I just had to add em-dashes ;) is behind closed-source walls.

vinnymac 5 hours ago | parent [-]

No doubt in my mind, a future Apple model will be the best to use for this purpose. They likely have more Swift to train on than anyone else, and would benefit directly from more quality apps, rather than the slop flowing into the App Store (>1k app submissions per hour; they claim).

red75prime 5 hours ago | parent | prev [-]

> which is only going to get worse as it "learns" from garbage outputted by other LLMs moving forward

You seem to assume that autoregressive pretraining (and unfiltered behavior cloning, maybe) are the only ways to improve LLM performance.

argee 5 hours ago | parent | prev | next [-]

That's just one way to use LLMs though. Recently on a flight I could not figure out how to connect my wife's earphones (i.e. put them in pairing mode) to my macbook since I was used to the old Airpods Pro case. So I asked Gemma4 26B A4B (offline, LM Studio) and was told to use the 'two tap on front of case' gesture, which worked. This situation would have been significantly more frustrating without (local) LLMs. I'm essentially carrying around a basic "how to" on everything, inaccurate though it may be, it's better than nothing.

ChrisMarshallNY 5 hours ago | parent [-]

Absolutely. I use it often, for stuff I used to "just Google." Other than a predilection for giving me CLI walkthroughs, it is usually fine.

anukin 4 hours ago | parent | prev | next [-]

My experience was different. I found it extremely good at frontend technology like React, while I had to hand-hold it for the backend tasks. Even with fable it was the same.

wesselbindt 5 hours ago | parent | prev | next [-]

Would you describe yourself as more skilled at frontend engineering or at backend engineering?

ChrisMarshallNY 5 hours ago | parent [-]

Definitely frontend (it's what I do, every day, and I enjoy it), but I have a great deal of experience (over 25 years), writing some pretty robust backend stuff. I just don't enjoy it as much.

wesselbindt 5 hours ago | parent [-]

I'm nowhere near that level of experience, although I've done both as well. I'm more backend oriented. And my experience has been the opposite. When I ask for backend code, footgun after footgun appears on my screen. With frontend code, much less of an issue, as far as I can tell. Part of me believes this is because I'm less skilled at frontend, and I don't bat an eye when the LLM plops down yet another useMemo (I've since learned that this is rarely needed). But in your case this argument can hardly be made. With 25 years I trust your ability to spot a good design on either end of the stack. So then I don't know where this discrepancy comes from. Maybe my prompting skills leave something to be desired.

lilbigdoot an hour ago | parent | next [-]

I wonder if it's that expertise gives you the ability to see flaws and push the LLM past its acceptable point.

I haven't really used LLMs much for coding (sabbatical before LLMs got good at coding, now looking for work), but I found with chats that they are great at exploring well-trodden territory, but as soon as you go a little bit off the beaten path they flail horribly.

ChrisMarshallNY an hour ago | parent [-]

I suspect that you have an excellent point.

They both do acceptably (but PHP better), as long as I don't push hard. The Swift that I get is ... meh, usually.

However, my PHP server, by design, is extremely conservative. It's meant to run on cheap shared hosting. I don't push the edges. The LLM seems to do a great job of respecting that, while still giving me good, modern, code.

The Swift, on the other hand, has highly optimized UI (which also means that I'm not using SwiftUI). It shits the bed, when I push it.

ChrisMarshallNY 5 hours ago | parent | prev [-]

I don't do "megascale" backends, though. My code is generally smaller-scale stuff that's designed to be deployed on a wide variety of cheap hosting, and is pretty conservative. It doesn't "push the limits."

I'm unlikely to run into many of the problems that (for example) the PornHub developers hit, several times an hour.

In that case, I benefit from folks like you, that allow me to have solutions that scale down to my level.

lawgimenez 5 hours ago | parent | prev | next [-]

Well, Apple just released a bunch of Agent Skills. I tried them on my macOS apps and noticed some improvements codewise, and they updated some deprecations I didn’t know existed in Swift.

ChrisMarshallNY 5 hours ago | parent [-]

Looking forward to that.

lawgimenez 5 hours ago | parent [-]

Yeah it comes with Xcode 27

serial_dev 2 hours ago | parent | prev | next [-]

Isn’t it because Swift (and SwiftUI, if you used that) changes the recommended approach to solving X every 18 months?

mrtksn 5 hours ago | parent | prev | next [-]

In my experience, the language has become irrelevant for me. I created a system that is like a mix of RevenueCat and Firebase, and I’m not even sure which part is in which language. It has client-side libraries that are Swift and Kotlin; the identity management is Swift, but the IAP/subscription tracking is Go, IIRC. It’s all integrated somehow and works very well.

ChrisMarshallNY 5 hours ago | parent [-]

That's the thing, the Swift works fine, but is incredibly brittle. I think it would collapse, at the first bump in the road.

That's fine, for a lot of corporate applications, but not for the stuff I write. I'm anal, I know, but that's how I roll.

bicx 5 hours ago | parent | prev | next [-]

Which LLM though? Models can still be significantly different in their capabilities.

ChrisMarshallNY 5 hours ago | parent [-]

That's likely. I generally use ChatGPT (latest), but as a chat interface (not an agent). I suspect that I might get better stuff from Claude (maybe).

bicx an hour ago | parent [-]

As a Claude Code user, you almost certainly will. Even using the same OpenAI model with an agent like Codex will likely perform better. An agent can test and iterate on its own solution until it meets the defined success metrics. If you’re working within an existing code base, having preexisting code to demonstrate similar implementations also significantly improves the quality of results, and that’s something an agent can dig into as part of its solution process. I haven’t used ChatGPT in a while, so maybe it’s more sophisticated than it used to be, but I think you’ll see much better results with the latest agent tools.

altern8 5 hours ago | parent | prev | next [-]

Might be because there are fewer Swift projects to train with.

But I've seen Claude write crazy code in Python and JavaScript, too.

ChrisMarshallNY 5 hours ago | parent | next [-]

My theory is that most of the Swift code in the public domain, is basically demo code. Short, idealized, code samples to demonstrate issues and solutions; much like you would see in StackOverflow.

PHP has huge, entire frameworks and systems, refined over years.

graemep 5 hours ago | parent [-]

There is also a lot of low quality PHP code out there, and a lot of legacy code in a language that I am told (I have not used it for years myself though) has changed a lot.

ChrisMarshallNY 5 hours ago | parent [-]

Same with C++. You don't want to write C++, the way that I used to.

That's one of the things that I appreciate about the PHP that the LLM provides. It uses modern idioms that make better use of the modern language.

graemep 5 hours ago | parent | prev [-]

I do not know about crazy, but certainly sub-optimal. For example a loop over DB query results instead of modifying the code to work with a single query.
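
To make that pattern concrete, here is a minimal sketch in Python using an in-memory SQLite database. The `authors`/`posts` schema and the function names are hypothetical, made up for illustration; they are not taken from any code discussed in this thread. The first function issues one extra query per row of an outer result (often called the N+1 pattern); the second gets the same data from a single joined query.

```python
import sqlite3


def make_db():
    # Hypothetical schema and sample rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
        """
    )
    return conn


def titles_by_author_loop(conn):
    # Sub-optimal: one query for the authors, then one more query per author.
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_by_author_join(conn):
    # Single query: let the database join the tables, then group in memory.
    result = {}
    rows = conn.execute(
        """
        SELECT a.name, p.title
        FROM authors AS a
        LEFT JOIN posts AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
        """
    )
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:  # authors without posts keep an empty list
            titles.append(title)
    return result
```

Both functions return the same mapping; the difference is the number of round trips to the database, which grows with the number of authors in the loop version and stays at one in the join version.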

ChrisMarshallNY 5 hours ago | parent [-]

I found that asking it to refactor for performance and safety often addresses these issues.

zeroonetwothree 5 hours ago | parent | prev [-]

I’m going to guess you are better at frontend than backend.

The classic AI Gell-Mann effect.

ChrisMarshallNY 4 hours ago | parent [-]

The guess is correct.

The diagnosis, however, is not.

Have a great day!