alkonaut 2 hours ago

All of this might as well be Greek to me. I use ChatGPT and copy-paste code snippets. That was bleeding edge a year or two ago, and now it feels like banging rocks together when I read these types of articles. I never had any luck integrating agents, MCP, tool use, etc.

Like, if I'm not ready to jump on some AI-spiced-up special IDE, am I then going to just be left banging rocks together? It feels like some of these AI agent companies just decided "OK, we can't adopt this into the old IDEs, so we'll build a new special IDE"? Or did I just use the wrong tools? (I use Rider and VS, and I have only tried Copilot so far, but the "agent mode" of Copilot in those IDEs feels basically useless.)

prettygood 2 hours ago | parent | next [-]

I'm so happy someone else says this, because I'm doing exactly the same. I tried to use agent mode in VS Code and the output was still bad. You read simple claims like "we use it to write tests", so I gave it a very simple repository, told it to write tests, and the result wasn't usable at all. I really wonder if I'm doing it wrong.

kace91 an hour ago | parent | next [-]

I'm not particularly pro-AI, but I struggle with the mentality some engineers seem to apply when trying these tools.

If someone said "I don't know what the big deal with vim is, I ran it and pressed some keys and it didn't write text at all", they'd be mocked for it.

But with these tools there seems to be an attitude of “if I don’t get results straight away it’s bad”. Why the difference?

Macha an hour ago | parent | next [-]

There isn't a bunch of managers metaphorically asking people if they're using vim enough, nor as many blog posts proclaiming vim as the only future for building software.

kace91 21 minutes ago | parent [-]

I'd argue that, if we accept that AI is relevant enough to at least be worth checking out, then dismissing it with minimal effort is just as bad as mindlessly hyping the tech.

neumann 31 minutes ago | parent | prev | next [-]

I agree to a degree, but I am in that camp. I subscribe to AlphaSignal, and every morning there are three new agent tools, two new features, and a new agentic approach, and I am left wondering: where is the production stuff?

galaxyLogic 41 minutes ago | parent | prev [-]

Well one could say that since it's AI, AI should be able to tell us what we're doing wrong. No?

AI is supposed to make our work easier.

kace91 19 minutes ago | parent [-]

Doing wrong with respect to what? If you ask for A, how would any system know that you actually wanted to ask for B?

embedding-shape an hour ago | parent | prev | next [-]

You didn't actually just say "write tests" though, right? What was the actual prompt you used?

I feel like that matters more than the tooling at this point.

I can't really understand letting LLMs decide what to test or not; they seem to completely miss the boat when it comes to testing. Half of the tests are useless because they duplicate the thing they test, and the other half don't test what they should be testing. So many shortcuts. LLMs require A LOT of hand-holding when writing tests, more so than for other code, I'd wager.

sixtyj 32 minutes ago | parent | prev | next [-]

No, you have the same experience a lot of people have.

LLMs just fail (hallucinate) in lesser-known fields of expertise.

Funny: today I asked Claude to give me the syntax for running Claude Code, and its answer was totally wrong :) So you go to the documentation… and parts of it are obsolete as well.

LLM development is in "move fast and break things" style.

So in a few years there will be so many repos with gibberish code, because "everybody is a coder now", even basketball players or taxi drivers (no offense, ofc, just an example).

It is like giving an F1 car to me :)

agumonkey an hour ago | parent | prev [-]

you need to write a test suite to check its test generation (soft /s)

CurleighBraces 2 hours ago | parent | prev | next [-]

Yeah, if you've not used Codex/agent tooling yet, it's a paradigm shift in the way of working, and once you get it, it's very, very difficult to go back to the copy-pasta technique.

There's obviously a whole heap of hype to cut through here, but there is real value to be had.

For example yesterday I had a bug where my embedded device was hard crashing when I called reset. We narrowed it down to the tool we used to flash the code.

I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.

There is absolutely no way I'd have been able to achieve that speed of resolution myself.

tmountain 2 hours ago | parent | prev | next [-]

I used to do it the way you're doing it. A friend went to a hackathon where everyone was using Cursor and insisted that I try it. It lets you set project-level "rules" that are basically prompts for how you want things done. It has access to your entire repo. You tell the agent what you want to do, it does it, and it lets you review the result. It's that simple, although you can take it much further if you want or need to. For me, this is a massive leap forward on its own. I'm still getting up to speed with reproducible prompt patterns like TFA mentions, but it's okay to work incrementally towards better results.
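To give a feel for what a "rule" looks like: it's just a short instruction file checked into the repo that gets injected alongside your prompts. A rough sketch (the file name, fields, and contents here are made up for illustration; check Cursor's docs for the exact format):

    # .cursor/rules/project-conventions.mdc (hypothetical example)
    ---
    description: Conventions the agent should follow in this repo
    alwaysApply: true
    ---
    - Prefer small, pure functions; avoid global state.
    - Every new function gets a unit test next to it.
    - Run the linter and the test suite before declaring a task done.

The point is that you stop repeating the same instructions in every prompt.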

ramraj07 42 minutes ago | parent | prev | next [-]

I recently pasted an error I found into Claude Code and asked who broke this. It found the commit, and also found that someone else had already fixed it in their branch.

You should use Claude Code.

bojan 38 minutes ago | parent [-]

There's no reason this should not be possible in other IDEs, except for the vendor lock-in.

embedding-shape 2 hours ago | parent | prev | next [-]

> I never had any luck integrating agents

What exactly do you mean with "integrating agents" and what did you try?

The simplest (and what I do) is not "integrating them" anywhere, but just replacing "copy-paste code + write prompt + copy output to code" with "write prompt > agent reads code > agent changes code > I review and accept/reject". Not really "integration" as much as just a workflow change.

breppp an hour ago | parent | prev | next [-]

I also sympathize with that approach, and I've sometimes found it better than agents. I believe some of the agentic IDEs are missing a "contained mode".

Let me select the lines in my code that the agent is allowed to edit for this prompt and nothing else, for those "add a function that does x" requests, without it starting to run amok.

wiseowise 42 minutes ago | parent | prev | next [-]

You just didn't drink enough Kool-Aid and still have an intact brain.

jonathanstrange an hour ago | parent | prev | next [-]

I'm doing the same. My reason is not the IDE; I just can't let AI agent software onto my machine. I have no trust at all in it or in the companies who make this software. I trust them neither with file integrity nor with keeping secrets secret, and I do have to keep secrets like API keys on my file system.

Am I right in assuming that the people who use AI agent software use them in confined environments like VMs with tight version control?

Then it makes sense, but the setup is not worth the hassle for me.

dude250711 2 hours ago | parent | prev | next [-]

The idea is to produce such articles, not read them. Don't even read them as the agent is spitting them out - simply feed them straight into another agent to verify.

63stack an hour ago | parent [-]

Present it at the next team/management meeting to seem in the loop and hope nobody asks any questions

chrz 41 minutes ago | parent [-]

No questions. It will be pasted into their AI tool. And things will be great. For a few weeks at least, until something breaks and nobody will know what.

franze an hour ago | parent | prev | next [-]

I am on the other side: I have given complete control of my computer to Claude Code - YOLO mode, sudo. It just works. My servers run the same way. I SSH in, run Claude Code there, and let it do whatever work it needs to do.

So my 2 cents. Use Claude Code. In Yolo mode. Use it. Learn with it.

Whenever I post something like this I get a lot of downvotes. But well... by the end of 2026 we will not use computers the way we use them now. Claude Code (Feb 2025) was the first step; now (Jan 2026) CoWork (Claude Code for everyone else) is here. It is just a much, much more powerful way to use computers.

darkwater 28 minutes ago | parent | next [-]

Claude Code and agents are the hot new hammer, and they are cool - I use CC and like it for many things - but currently they suffer from the "hot new hammer" hype, so people tend to think everything is a nail the LLM can handle. But you still need a screwdriver for screws, even if you can hammer them in.

jangxx an hour ago | parent | prev [-]

Don't say "we" when talking about yourself.

franze an hour ago | parent [-]

I already do.

And yes, it is a hypothesis about the future. Claude Code was just a first step. It will happen to the rest of computer use as well.

hahahahhaah 2 hours ago | parent | prev | next [-]

I feel like: just use Claude Code. That is it. Use it and you get a feel for it. Everyone is overcomplicating this.

It is like learning to code itself. You need flight hours.

_zoltan_ 44 minutes ago | parent | next [-]

It's not that simple. That's how I started as well, but now I have hooked up Gemini and GPT 5.2 to review code and plans, and then to reach consensus on design questions.

And then there's Ralph, with cross-LLM consensus in a loop. It's great.

cobolexpert 2 hours ago | parent | prev [-]

This is something that continues to surprise me. LLMs are extremely flexible and already come prepackaged with a lot of "knowledge"; you don't need to dump hundreds of lines of text to explain to them what good software development practices are. I suspect these frameworks/patterns just fill up the context with unnecessary junk.

raesene9 18 minutes ago | parent | next [-]

I think avoiding filling the context up with too much pattern information is partly where agent skills come from: the idea is that each skill has a set of triggers, and the main body of the skill is only loaded into context if a trigger is hit.

You could still overload it with too many skills, but it helps at least.
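To make that concrete, a skill is (roughly) a markdown file whose short description acts as the trigger; only the description sits in context all the time, and the body is pulled in when it matches. A sketch from memory, so treat the path and fields as illustrative rather than exact:

    # .claude/skills/release-notes/SKILL.md (illustrative)
    ---
    name: release-notes
    description: Use when asked to draft release notes from merged changes
    ---
    1. List the commits since the last tag.
    2. Group them by area and summarize each group in one line.
    3. Output markdown suitable for pasting into the CHANGELOG.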

vidarh an hour ago | parent | prev | next [-]

You get 80% of the way there (numbers pulled out of the air) by just telling it to do things. You do need more to get from 80% to 90%+.

How much more depends on what you're trying to do and in what language (e.g. "favourite" pet peeve: Claude occasionally likes to use instance_variable_get() in Ruby instead of adding accessors; it's a massive code smell). But there are some generic things, such as giving it instructions on keeping notes, and giving it subagents to farm repetitive tasks out to, so that completing individual tasks doesn't fill up the context when the tasks are truly independent (in which case, for Claude Code at least, you can also tell it to run several in parallel).

But, indeed, just starting Claude Code (or Codex; I prefer Claude but it's a "personality thing" - try tools until you click with one) and telling it to do something is the most important step up from a chat window.

cobolexpert an hour ago | parent [-]

I agree about the small tweaks like the Ruby accessor thing; I also have some small notes like that myself to nudge the agent in the right direction.

Macha an hour ago | parent | prev | next [-]

If I don't instruct it to in some way, the agent will not write tests, will not conform to the linter standard, will not correctly figure out the command to run a subset of tests, etc.

epolanski an hour ago | parent | prev [-]

> I suspect these frameworks/patterns just fill up the context with unecessary junk.

That's exactly the point. Agents have their own context.

Thus, you try to leverage them by giving them ad-hoc instructions for repetitive tasks (such as reviewing code or running a test checklist) without polluting your own conversation/context.

cobolexpert an hour ago | parent [-]

Ah, do you mean sub-agents? I do understand that if I summon a sub-agent and give it, e.g., code-reviewing instructions, it will not fill up the context of the main conversation. But my point is that giving the sub-agent the instruction "review this code as if you were a staff engineer" (literally those words) should cover most use cases (though I can't prove this, unfortunately).
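For reference, in Claude Code terms that minimal version is just a small markdown file defining the sub-agent; a sketch from memory (path and fields may not be exact, so verify against the docs):

    # .claude/agents/reviewer.md (illustrative)
    ---
    name: reviewer
    description: Reviews diffs before they are committed
    ---
    Review this code as if you were a staff engineer. Point out bugs,
    missing tests and unclear naming. Do not rewrite the code yourself.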

photios an hour ago | parent | prev | next [-]

Copilot's agent mode is a disaster. Use better tools: try Claude Code or OpenCode (my favorite).

It's a new ecosystem with its own (atrocious!) jargon that you need to learn. The good news is that it's not hard to do. It's not as complex or revolutionary as everyone makes it look. Everything boils down to techniques and frameworks for collecting context/prompts before handing them over to the model.

darkwater 31 minutes ago | parent [-]

Yep, basically this. In the end it helps to have the mental model that (almost) everything related to agents is just a way to send the upstream LLM better and more specific context for the task you need to solve at that specific time. For example, Claude Code "skills" are simply a markdown file in a subdirectory with a specific name, which translates to a `/SKILL_NAME` command in Claude and a prompt that is injected each time that skill is mentioned or Claude thinks it needs to use it, so it doesn't forget the specific way you want that specific task handled.
