| ▲ | porise 13 hours ago | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
I wish the people who wrote this let us know what king of codebases they are working on. They seem mostly useless in a sufficiently large codebase especially when they are messy and interactions aren't always obvious. I don't know how much better Claude is than ChatGPT, but I can't get ChatGPT to do much useful with an existing large codebase. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | CSMastermind 2 hours ago | parent | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Are you using Codex? I'm not sure how big your repos are but I've been effective working with repos that have thousands of files and tens of thousands of lines of code. If you're just prototyping it will hit wall when things get unwieldy but that's normally a sign that you need to refactor a bit. Super strict compiler settings, static analysis, comprehensive tests, and documentation help a lot. As does basic technical design. After a big feature is shipped I do a refactor cycle with the LLM where we do a comprehensive code review and patch things up. This does require human oversight because the LLMs are still lacking judgement on what makes for good code design. The places where I've seen them be useless is working across repositories or interfacing with things like infrastructure. It's also very model-dependent. Opus is a good daily driver but Codex is much better are writing tests for some reason. I'll often also switch to it for hard problems that Claude can't solve. Gemini is nice for 'I need a prototype in the next 10 minutes', especially for making quick and dirty bespoke front-ends where you don't care about the design just the functionality. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | CameronBanga 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
This is an antidotal example, but I released this last week after 3 months of work on it as a "nights and weekdends" project: https://apps.apple.com/us/app/skyscraper-for-bluesky/id67541... I've been working in the mobile space since 2009, though primarily as a designer and then product manager. I work in kinda a hybrid engineering/PM job now, and have never been a particularly strong programmer. I definitely wouldn't have thought I could make something with that polish, let alone in 3 months. That code base is ~98% Claude code. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | fy20 3 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
At my dayjob my team uses it on our main dashboard, which is a pretty large CRUD application. The frontend (Vue) is a horrible mess, as it was originally built by people who know just enough to be dangerous. Over time people have introduced new standards without cleaning up the old code - for example, we have three or four different state management techologies. For this the LLM struggles a bit, but so does a human. The main issues are it messes up some state that it didnt realise was used elsewhere, and out test coverage is not great. We've seen humans make exactly the same kind of mistakes. We use MCP for Figma so most of the time it can get a UI 95% done, just a few tweaks needed by the operator. On the backend (Typescript + Node, good test coverage) it can pretty much one-shot - from a plan - whatever feature you give it. We use opus-4.5 mostly, and sometimes gpt-5.2-codex, through Cursor. You aren't going to get ChatGPT (the web interface) to do anything useful, switch to Cursor, Codex or Claude Code. And right now it is worth paying for the subscription, you don't get the same quality from cheaper or free models (although they are starting to catch up, I've had promising results from GLM-4.7). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | TaupeRanger 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Claude and Codex are CLI tools you use to give the LLM context about the project on your local machine or dev environment. The fact that you're using the name "ChatGPT" instead of Codex leads me to believe you're talking about using the web-based ChatGPT interface to work on a large codebase, which is completely beside the point of the entire discussion. That's not the tool anyone is talking about here. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | danielvaughn 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
It's important to understand that he's talking about a specific set of models that were release around november/december, and that we've hit a kind of inflection point in model capabilities. Specifically Anthropic's Opus 4.5 model. I never paid any attention to different models, because they all felt roughly equal to me. But Opus 4.5 is really and truly different. It's not a qualitative difference, it's more like it just finally hit that quantitative edge that allows me to lean much more heavily on it for routine work. I highly suggest trying it out, alongside a well-built coding agent like the one offered by Claude Code, Cursor, or OpenCode. I'm using it on a fairly complex monorepo and my impressions are much the same as Karpathy's. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | yasoob 3 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Another personal example. I spent around a month last year in January on this application: https://apps.apple.com/us/app/salam-prayer-qibla-quran/id674... I had never used Swift before that and was able to use AI to whip up a fairly full-featured and complex application with a decent amount of code. I had to make some cross-cutting changes along the way as well that impacted quite a few files and things mostly worked fine with me guiding the AI. Mind you this was a year ago so I can only imagine how much better I would fare now with even better AI models. That whole month was spent not only on coding but on learning Swift enough to fix problems when AI started running into circles and then learning about Xcode profiler to optimize the application for speed and improving perf. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | keerthiko 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Almost always, notes like these are going to be about greenfield projects. Trying to incorporate it in existing codebases (esp when the end user is a support interaction or more away) is still folly, except for closely reviewed and/or non-business-logic modifications. That said, it is quite impressive to set up a simple architecture, or just list the filenames, and tell some agents to go crazy to implement what you want the application to do. But once it crosses a certain complexity, I find you need to prompt closer and closer to the weeds to see real results. I imagine a non-technical prompter cannot proceed past a certain prototype fidelity threshold, let alone make meaningful contributions to a mature codebase via LLM without a human engineer to guide and review. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | gwd 9 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
For me, in just the golang server instance and the core functional package, `cloc` reports over 40k lines of code, not counting other supporting packages. I spent the last week having Claude rip out the external auth system and replace it with a home-grown one (and having GPT-codex review its changes). If anything, Claude makes it easier on me as a solo founder with a large codebase. Rather than having to re-familiarize myself with code I wrote a year ago, I describe it at a high level, point Claude to a couple of key files, and then tell it to figure out what it needs to do. It can use grep, language server, and other tools to poke around and see what's going on. I then have it write an "epic" in markdown containing all the key files, so that future sessions already know the key files to read. I really enjoyed the process. As TFA says, you have to keep a close eye on it. But the whole process was a lot less effort, and I ended up doing mor than I would otherwise have done. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | ph4te 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
I don't know how big sufficiently large codebase is, but we have a 1mil loc Java application, that is ~10years old, and runs POS systems, and Claude Code has no issues with it. We have done full analyses with output details each module, and also used it to pinpoint specific issues when described. Vibe coding is not used here, just analysis. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | BeetleB 8 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
> They seem mostly useless in a sufficiently large codebase especially when they are messy and interactions aren't always obvious. What type of documents do you have explaining the codebase and its messy interactions, and have you provided that to the LLM? Also, have you tried giving someone brand new to the team the exact same task and information you gave to the LLM, and how effective were they compared to the LLM? > I don't know how much better Claude is than ChatGPT, but I can't get ChatGPT to do much useful with an existing large codebase. As others have pointed out, from your comment, it doesn't sound like you've used a tool dedicated for AI coding. (But even if you had, it would still fail if you expect LLMs to do stuff without sufficient context). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | jwr 4 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
I successfully use Claude Code in a large complex codebase. It's Clojure, perhaps that helps (Clojure is very concise, expressive and hence token-dense). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | smusamashah 9 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
The code base I work on at $dayjob$ is legacy, has few files with 20k lines each and a few more with around 10k lines each. It's hard to find things and connect dots in the code base. Dont think LLMs able to navigate and understand code bases of that size yet. But have seen lots of seemingly large projects shown here lately that involve thousands of files and millions of lines of code. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | tunesmith 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
If you have a ChatGPT account, there's nothing stopping you from installing codex cli and using your chatgpt account with it. I haven't coded with ChatGPT for weeks. Maybe a month ago I got utility out of coding with codex and then having ChatGPT look at my open IDE page to give comments, but since 5.2 came out, it's been 100% codex. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | bluGill 10 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
I've been trying Claude on my large code base today. When I give it the requirements I'd give an engineer and so "do it" it just writes garbage that doesn't make sense and doesn't seem to even meet the requirements (if it does I can't follow how - though I'll admit to giving up before I understood what it did, and I didn't try it on a real system). When I forced it to step back and do tiny steps - in TDD write one test of the full feature - it did much better - but then I spent the next 5 hours adjusting the code it wrote to meet our coding standards. At least I understand the code, but I'm not sure it is any faster (but it is a lot easier to see things wrong than come up with green field code). Which is to say you have to learn to use the tools. I've only just started, and cannot claim to be an expert. I'll keep using them - in part because everyone is demanding I do - but to use them you clearly need to know how to do it yourself. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | epolanski 8 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
1. Write good documentation, architecture, how things work, code styling, etc. 2. Put your important dependencies source code in the same directory. E.g. put a `_vendor` directory in the project, in it put the codebase at the same tag you're using or whatever: postgres, redis, vue, whatever. 3. Write good plans and requirements. Acceptance criteria, context, user stories, etc. Save them in markdown files. Review those multiple times with LLMs trying to find weaknesses. Then move to implementation files: make it write a detailed plan of what it's gonna change and why, and what it will produce. 4. Write very good prompts. LLMs follow instructions well if they are clear "you should proactively do X", is a weak instruction if you mean "you must do X". 5. LLMs are far from perfect, and full of limits. Karpathy sums their cons very well in his long list. If you don't know their limits you'll mismanage the expectations and not use them when they are a huge boost and waste time on things they don't cope well with. On top of that: all LLMs are different in their "personality", how they adhere to instruction, how creative they are, etc. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | Okkef 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Try Claude code. It’s different. After you tried it, come back. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | datsci_est_2015 6 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Also I never see anyone talking about code reviews, which is one of the primary ways that software engineering departments manage liability. We fired someone recently because they couldn’t explain any of the slop they were trying to get merged. Why tf would I accept the liability of managing code that someone else can’t even explain? I guess this is fine when you don’t have customers or stakeholders that give a shit lol. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | redox99 8 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
What do you even mean by "ChatGPT"? Copy pasting code into chatgpt.com? AI assisted coding has never been like that, which would be atrocious. The typical workflow was using Cursor with some model of your choice (almost always an Anthropic model like sonnet before opus 4.5 released). Nowadays (in addition to IDEs) it's often a CLI tool like Claude Code with Opus or Codex CLI with GPT Codex 5.2 high/xhigh. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | languid-photic 12 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
They build Claude Code fully with Claude Code. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | maxdo 13 hours ago | parent | prev | next [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
chatGPT is not made to write code. Get out of stone age :) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | spaceman_2020 13 hours ago | parent | prev [-] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
I'm afraid that we're entering a time when the performance difference between the really cutting edge and even the three-month-old tools is vast If you're using plain vanilla chatgpt, you're woefully, woefully out of touch. Heck, even plain claude code is now outdated | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||