justanotheratom 2 days ago

Recently, OpenAI CFO Sarah Friar said,

"We have something called A-SWE, which is the Agentic Software Engineer. And this is where it starts to get really interesting because it’s not just about augmenting a developer, like a Copilot might do, but actually saying, ‘Hey, this thing can go off and build an app.’ It can do the pull request, it can do the QA, it can do the bug testing, it can write the documentation."

https://www.youtube.com/watch?v=2kzQM_BUe7E The relevant discussion about A-SWE begins around the 11:26 mark (686 seconds).

maronato 2 days ago | parent | next [-]

On the few times I've used Cursor or Claude Code for tasks beyond simple tests or features, I found myself spending more time correcting their errors than if I had written the code from scratch.

I like Cursor and use it daily, but none of its models are even close to being able to take on nontrivial work. Besides, it quickly gets expensive if you’re using the smarter models.

IMO these AI tools will become to software engineers what CAD is to mechanical and civil engineers. Can they work without it? Sure, but why would they?

theturtletalks 2 days ago | parent [-]

This is because Cursor is not sending the full context even when you drag and drop things inside the chat box.

I started getting worse results from Cursor too. Then Gemini 2.5 Pro dropped with 1M context, so I repomixed my project, popped it into AIStudio, and asked it to make me prompts I can feed into Cursor to fix the issues I have.

Gemini has the whole picture, and the prompts it creates tell Cursor which items to change and how.

Jcampuzano2 2 days ago | parent | prev | next [-]

If they have this, why are they hiring, and how much of OpenAI's own code has it written?

Drakim 2 days ago | parent [-]

...don't get high on your own supply?

It's pretty obvious that these tools are not replacements for developers as of yet. I've tried them, and they are very nifty, and can even do some boring tasks really well, but you can't actually substitute for real developer skill (yet). But everybody is holding their breath because it looks like they might eventually reach that level, and the time-frame for that "eventually" is unknown.

Jcampuzano2 2 days ago | parent [-]

But that's exactly what she marketed it as, and she claimed it already exists: an agent for hire that can do everything a SWE can do.

If this truly exists they'd have no need to hire since it'd force multiply their existing developers.

What better marketing than being able to proudly claim that "OpenAI no longer hires those pesky expensive developers and you can too," because they can improve/multiply the productivity of their existing developers with their innovations?

sandeepkd 2 days ago | parent [-]

It looks like there are engineers with a few years under their belt who feel super excited about the capability of these tools, and then there are the executives and business people who have to toe the line since everyone else is doing it. On the other hand, there are engineers with many years of field and subject expertise who are skeptical about the advertised capabilities; however, they are either quiet or in wait-and-watch mode to see how it plays out.

As someone who has been in both engineering and management roles, I feel the manager role should be relatively easier to automate (not all, but a lot of managers are just information pass-throughs, and a tool would be more consistent at that). I'm a bit surprised that no one talks about that as a possibility.

foobiekr 2 days ago | parent | prev | next [-]

And you believe this?

Surely then they have no SWE reqs, right?

justanotheratom 2 days ago | parent | next [-]

just stating facts.

stefan_ 2 days ago | parent | next [-]

The fact being that Sarah "CEO of Nextdoor" Friar stated they have this A-SWE vapor thing? I don't see how that pertains to what she said being factual.

throwaway314155 2 days ago | parent | prev [-]

Yeah and respectfully the facts as stated leave the impression that she's full of shit and/or doesn't know what she's talking about. What she's saying is pure hype for investors probably. Vaporware.

bwfan123 a day ago | parent | prev | next [-]

I use Cursor, and these tools don't encode a causal representation of how something works. Hence they cannot root-cause and fix bugs. Fixing issues with 99% correctness snowballs into a hot mess very soon due to compounding errors. Somehow this is missed by people who don't actually code for a living.
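
To put rough numbers on the compounding-errors point, here is a minimal sketch. It assumes each change is independently correct with the same probability, which is a simplification not stated in the comment (real errors are often correlated), and the function name is made up for illustration.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that every one of `steps` changes is correct.

    Assumption: each step is independent and equally likely to be right.
    """
    return per_step_accuracy ** steps


# How quickly does "99% correct per fix" decay as fixes pile up?
for steps in (1, 10, 50, 100, 500):
    p = chance_all_steps_correct(0.99, steps)
    print(f"{steps:>4} steps: {p:.1%} chance that nothing went wrong")
```

Under that assumption, the chance of a fully clean result falls to roughly 37% after 100 changes, which is one way to read "snowballs into a hot mess."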

darth_avocado 2 days ago | parent | prev | next [-]

The question isn’t “can AI code?”, the question is “can AI keep coding?”.

How do any of these companies create “an AI Software Engineer”? Scraping knowledge posted by actual engineers on StackOverflow? Scraping public (& arguably private) GitHub repos created by actual engineers? What happens when all of them are out of a job? AI gets trained on knowledge generated by AI? Where will the incremental gain come from?

It’s like saying I will teach myself to cook better food by only learning from recipe books I created based on the knowledge I already have.

fragmede 2 days ago | parent | next [-]

> AI gets trained on knowledge generated by AI?

This sounds like the ouroboros, the snake eating its own tail, which it is. But because tool use lets it compile and run code, it can generate code for, say, Rust that does a thing, iterate until the borrow checker is no longer angry, then run the code to assert it does what it claims to, and then feed that working code into the training set as good code (and the non-working code as bad). Even using only the recipe books you already had, doing a lot of cooking practice would make you a better cook. And once you learn the recipes in the book well, mixing and matching them (egg preparation from one, flour ratios from another) is something a good cook would simply get a feel for, even if they only ever used that one book.
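
A toy sketch of that generate, run, and label loop, in Python rather than Rust for brevity. Everything here is hypothetical: the hard-coded `candidates` stand in for model output, and `label_candidates` and `check_add` are made-up names, not part of any real training pipeline.

```python
from typing import Callable


def label_candidates(
    candidates: list[str],
    check: Callable[[dict], bool],
) -> tuple[list[str], list[str]]:
    """Split candidate sources into (good, bad) by compiling and testing them."""
    good: list[str] = []
    bad: list[str] = []
    for source in candidates:
        namespace: dict = {}
        try:
            # Compile step: a syntax error here plays the role of an angry compiler.
            code = compile(source, "<candidate>", "exec")
            # Run step: never exec untrusted code outside a sandbox in real use.
            exec(code, namespace)
            passed = check(namespace)
        except Exception:
            passed = False
        (good if passed else bad).append(source)
    return good, bad


def check_add(namespace: dict) -> bool:
    """Behavioral check: does the candidate's `add` actually add?"""
    add = namespace["add"]
    return add(2, 3) == 5 and add(-1, 1) == 0


# Stand-ins for generated code: one correct, one wrong, one that fails to compile.
candidates = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a - b\n",
    "def add(a, b)\n    return a + b\n",
]

good, bad = label_candidates(candidates, check_add)
print(f"{len(good)} labeled good, {len(bad)} labeled bad")
```

The labels are only as trustworthy as the check itself: code that compiles and passes a weak test would still be filed under "good."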

ethanwillis 2 days ago | parent [-]

The original recipe books don't cover all possible things that could be created, not even in all their combinations. And most importantly, even within the subset of novel combinations that can be created from the recipe books, something is missing.

What's missing is the judgement call of a human to say if some newly created information makes sense to us, is useful to us, etc.

The question above is not about whether new information can be created or the navigation of it. It's about the applicability of what is created to human ends.

darth_avocado 2 days ago | parent [-]

> What's missing is the judgement call of a human to say if some newly created information makes sense to us, is useful to us, etc.

When I gave an example of a recipe book, that’s what I meant. There’s the element of not knowing whether something worked without the explicit feedback of “what worked”. But there is also an element of “no matter how much I experiment with new things, I wouldn’t know sous vide exists as a technique unless I already know and have it listed in the recipe book.” What I don’t know, I will never know.

janalsncm 2 days ago | parent | prev [-]

I can’t think of a better argument for never posting to Stack Overflow again. Or public GitHub.

rchaud 2 days ago | parent | prev | next [-]

Out of interest, why is the CFO the person commenting on this, as opposed to a product person?

justanotheratom 2 days ago | parent [-]

my thought exactly.

wobblyasp 2 days ago | parent | prev | next [-]

If they actually had something like that they'd stop talking about it and release it.

Until we play with it, it doesn't exist.

bluelightning2k 2 days ago | parent [-]

They did. This week. The Codex CLI thing.

quantumHazer 2 days ago | parent | prev | next [-]

All I will say is that she is the CFO of a company that wants to sell """agentic""" SWE models.

edit: typo.

bhl 2 days ago | parent | prev | next [-]

That's very ambitious. You have companies and startups at each of those layers right now: code, pull request reviews, issue tracking, documentation.

fuzzy_biscuit 2 days ago | parent | prev [-]

Vaporware until it isn't.