|
| ▲ | HappySweeney an hour ago | parent | next [-] |
| Code review is the main thing I use LLMs for. I have found it to be remarkably candid when you tell it the code came from another LLM (even name it). I was running Kimi K2.6 Q4 locally, seeing if it could SIMD a bit-matrix transpose function, and it was slow enough that I would paste its thinking into Gemini every few minutes. Gemini was savage. |
| |
| ▲ | datsci_est_2015 5 minutes ago | parent [-] | | > Gemini was savage. Humorously, this could be the result of LLMs vacuuming up all the sentiment on the web that the code that LLMs produce is trash-tier. |
|
|
| ▲ | marcosdumay 25 minutes ago | parent | prev | next [-] |
| Lol, the only thing worse than a junior developer following Clean Code and SOLID has to be an LLM messing with code so it looks like it follows. |
| |
| ▲ | giancarlostoro 6 minutes ago | parent [-] | | Clean Code has its really "meh" areas, but the core idea and spirit of it is sound, heck Python's best guide is PEP-8 if you follow that, it forces you to write much better Python code. In terms of "junior dev following" it would be the model trying to think and write it as a Senior or Staff Level engineer would. |
|
|
| ▲ | kenjackson 41 minutes ago | parent | prev | next [-] |
| This is it. I've had a similar experience in just playing around I asked it to clean up some code it wrote to increase maintainability and readability by humans. After a few iterations it had generated quite solid code. It also broke the code a couple of times along the way. But it does get me thinking that these pipelines with agents doing specific tasks makes a lot of sense. One to design and architect, one to implement, one to clean, one to review, one to test (actually there's probably a bunch of different agents for testing -- testing perf/power, that it matches the requirements/spec, matches the design, is readable/maintainable, etc...). |
| |
| ▲ | giancarlostoro 36 minutes ago | parent [-] | | I built GuardRails after some frustrations with Beads which I love, and this whole exchange made me realize, because I have "gates" after tasks, I could add a "Review the code" type of gate, and probably get insanely better output, I already get reasonably good output because I spec out the requirements beforehand, that's the other thing, if you can tell the LLM HOW to build before it does, you will have better output. |
|
|
| ▲ | enraged_camel 32 minutes ago | parent | prev [-] |
| Even better, if you have access to multiple models, tell it you got the code from another AI agent. I did an experiment on this a few weekends ago and Codex for example was a lot more adversarial and thorough in its review when given Claude-authored code compared to when given the same code with "I wrote this, can you review it?" |
| |
| ▲ | giancarlostoro 31 minutes ago | parent [-] | | If it's within its context window, it will know you're lying, so either compact or start a new chat (don't do this on Claude, it dings your usage, always has). |
|