| ▲ | embedding-shape 19 hours ago |
| Looks interesting, eager to play around with it! Devstral was a neat model when it was released and one of the better ones to run locally for agentic coding. Nowadays I mostly use GPT-OSS-120b for this, so gonna be interesting to see if Devstral 2 can replace it. I'm a bit saddened by the name of the CLI tool, which to me implies the intended usage. "Vibe-coding" is a fun exercise to realize where models go wrong, but for professional work where you need tight control over the quality, you obviously can't vibe your way to excellence; hard reviews are required, so not "vibe coding", which is all about unreviewed code and just going with whatever the LLM outputs. But regardless of that, it seems like everyone and their mother is aiming to fuel the vibe-coding frenzy. But where are the professional tools, meant to be used by people who don't want to do vibe-coding but want to be heavily assisted by LLMs? Something that is meant to augment the human intellect, not replace it? All the agents seem to focus on offloading work to vibe-coding agents, while what I want is something even more tightly integrated with my tools so I can continue delivering high-quality code I know and control. Where are those tools? None of the existing coding agents apparently aim for this... |
|
| ▲ | williamstein 19 hours ago | parent | next [-] |
| Their new CLI agent tool [1] is written in Python, unlike similar agents from Anthropic/Google (TypeScript/Bun) and OpenAI (Rust). It also appears to have first-class support for ACP, the new protocol from Zed [2]. [1] https://github.com/mistralai/mistral-vibe [2] https://zed.dev/acp |
| |
| ▲ | esafak 19 hours ago | parent | next [-] | | I did not know A2A had a competitor :( | | |
| ▲ | 4b11b4 19 hours ago | parent [-] | | They're different use cases, ACP is for clients (UIs, interfaces) |
| |
| ▲ | embedding-shape 18 hours ago | parent | prev [-] | | > Their new CLI agent tool [1] is written in This is exactly the CLI I'm referring to, whose name implies it's for playing around with "vibe-coding", instead of helping professional developers produce high quality code. It's the opposite of what I and many others are looking for. | | |
| ▲ | chrsw 17 hours ago | parent [-] | | I think that's just the name they picked. I don't mind it. Taking a glance at what it actually does, it just looks like another command-line coding assistant/agent similar to Opencode and friends. You can use it for whatever you want, not just "vibe coding", including high-quality, serious, professional development. You just have to know what you're doing. |
|
|
|
| ▲ | hadlock 15 hours ago | parent | prev | next [-] |
| >vibe-coding A surprising amount of programming is building cardboard services or apps that only need to last six months to a year and are then thrown away when temporary business needs change. Execs are constantly clamoring for semi-persistent dashboards and ETL-visualized data that lasts just long enough to rein in the problem and move on to the next fire. Agentic coding is good enough for cardboard services that collapse when they get wet. I wouldn't build an industrial data lake service with it, but you can certainly build cardboard consumers of the data lake. |
| |
| ▲ | bigiain 10 hours ago | parent [-] | | You are right. But there is nothing more permanent than a quickly hacked-together prototype or personal productivity hack that works. There are so many Python (or Perl or Visual Basic) scripts or Excel spreadsheets - created by people who have never been "developers" - which solve in-the-trenches pain points and become indispensable in exactly the way _that_ xkcd shows. |
|
|
| ▲ | pdntspa 19 hours ago | parent | prev | next [-] |
| > But where are the professional tools, meant to be used for people who don't want to do vibe-coding, but be heavily assisted by LLMs? Something that is meant to augment the human intellect, not replace it? Claude Code not good enough for ya? |
| |
| ▲ | embedding-shape 18 hours ago | parent [-] | | Claude Code has absolutely zero features that help me review code or do anything other than vibe-coding and accepting changes as they come in. We need diff comparisons between different executions, a TUI tailored for that kind of work, and more. Claude Code is basically an MVP of that. Still, I do use Claude Code and Codex daily, as there is nothing better out there currently. But they still feel tailored towards vibe-coding instead of professional development. | | |
| ▲ | vidarh 17 hours ago | parent | next [-] | | I really do not want those things in Claude Code - I much prefer choosing my own diff tools etc. and running them in a separate terminal. If they start stuffing too much into the TUI they'd ruin it - if you want all that stuff built in, they have the VS Code integration. | | |
| ▲ | Havoc 10 hours ago | parent | next [-] | | Mind elaborating a bit on the diff tool / flow you’re using? Trying to follow along better with what CC is doing | | |
| ▲ | jbs789 4 hours ago | parent [-] | | Claude Code run in a VS Code terminal window pops up a diff in VS Code before making changes. Not sure if that helps. I do have the Claude Code extension installed too. I find the flow works because if it starts going off-piste I just end it. Plus I then get my pre-commit hooks etc. I still like being relatively hands-on though. |
| |
| ▲ | embedding-shape 16 hours ago | parent | prev [-] | | Me neither, hence the stated preference for something completely new and different, a stab in a different direction instead of the same boring iteration on yet another agentic TUI coder. | |
| |
| ▲ | pdntspa 6 hours ago | parent | prev | next [-] | | I have found IntelliJ's AI service very helpful as a PR summarizer | |
| ▲ | johnfn 16 hours ago | parent | prev | next [-] | | > Claude Code has absolutely zero features that help me review code Err, doesn’t it have /review? | |
| ▲ | victorbjorklund 16 hours ago | parent | prev [-] | | What’s wrong with using GIT for reviewing the changes? | | |
| ▲ | embedding-shape 14 hours ago | parent [-] | | Are any of them integrated with git? AFAIK, you'd have to instruct them to use git for you if you don't want to do it manually. Imagine a GUI built around git branches + agents working in those branches + tooling to manage the orchestration and small review points, rather than "here's a chat and tool calling, glhf". | | |
| ▲ | KronisLV an hour ago | parent | next [-] | | > Are any of them integrated with git? All of the models that can do tool calls are typically good enough to use Git. Just this week I used both Claude Code and Codex to look at unstaged/staged changes and to review them multiple times, and even to compare a feature branch against the main branch to identify why a particular feature might have broken in the feature branch. |
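(For illustration only: a rough sketch of the kind of git-driven review described above, not how Claude Code or Codex actually implement it; the model name, prompt, and OpenAI-compatible client are placeholders/assumptions.)

    # Sketch: gather the current branch's diff against main and ask a model to review it.
    import subprocess
    from openai import OpenAI

    def branch_diff(base="main"):
        # Three-dot diff: changes on this branch since it diverged from `base`.
        return subprocess.run(
            ["git", "diff", f"{base}...HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout

    client = OpenAI()  # assumes an OpenAI-compatible endpoint/key in the environment
    review = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": "You are a strict code reviewer."},
            {"role": "user", "content": "Review this diff:\n\n" + branch_diff()},
        ],
    ).choices[0].message.content
    print(review)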
| ▲ | zer0tonin 16 minutes ago | parent | prev [-] | | Aider is integrated with git |
|
|
|
|
|
| ▲ | jbellis 16 hours ago | parent | prev | next [-] |
| > where are the professional tools, meant to be used for people who don't want to do vibe-coding, but be heavily assisted by LLMs? This is what we're building at Brokk: https://brokk.ai/ Quick intro: https://blog.brokk.ai/introducing-lutz-mode/ |
|
| ▲ | johanvts 19 hours ago | parent | prev | next [-] |
| Did you try Aider? |
| |
| ▲ | embedding-shape 18 hours ago | parent [-] | | I did, although a long time ago, so maybe I need to try it again. But it still seems to be stuck in a chat-like interface instead of something tailored to software development. Think IDE but better. | | |
| ▲ | vidarh 17 hours ago | parent | next [-] | | When I think "IDE but better", a Claude Code-like interface is increasingly what I want. If you babysit every interaction, rather than reviewing a completed unit of work of some size, you're wasting your time second-guessing whether the model will "recover" from stupid mistakes. Sometimes that doubt is justified, but more often than not it corrects itself faster than you can. And so it's far more effective to interact with it asynchronously, where the UI is more for figuring out what it did if something doesn't seem right than for working live. I have Claude writing a game engine in another window right now, while writing this, and I have no interest in reviewing every little change, because I know the finished change will look nothing like the initial draft (it did just start the demo game right now, though, and it's getting there). So I review units of change no smaller than 30m-1h of work; often it will be hours, sometimes days, between each time I review the output when working on something well specified. | |
| ▲ | johanvts 17 hours ago | parent | prev | next [-] | | It has a new "watch files" mode where you can work interactively. You just code normally but can send commands to the LLM via a special string. It's a great way of interacting with LLMs, if only they were much faster. | |
| ▲ | macNchz 15 hours ago | parent [-] | | If you're interested in much faster LLM coding, GLM 4.6 on Cerebras is pretty mind blowing. It's not quite as smart as the latest Claude and Gemini, but it generates code so fast it's kind of comical if you're used to the other models. Good with Aider since you can keep it on a tighter leash than with a fully agentic tool. |
| |
| ▲ | reachtarunhere 16 hours ago | parent | prev | next [-] | | If your goal is to edit code and not discuss it aider also supports a watch mode. You can keep adding comments about what you want it to do in a minimal format and it will make changes to the files and you can diff/revert them. | |
| ▲ | zmmmmm 14 hours ago | parent | prev [-] | | I think Aider is closest to what you want. The chat interface is optimal to me because you are often asking questions and seeking guidance or proposals as you are making actual code changes. One reason I do like it is that its default mode of operation is to make a commit for each change it makes, so it is extremely clear what the AI did vs what you did vs what is a hodgepodge of both. As others have mentioned, you can integrate it with your IDE through the watch mode. It's a somewhat crude but still useful way to work. But I find myself more often than not just running Aider in a terminal under the code editor window and chatting with it about what's in the window. | |
| ▲ | embedding-shape 14 hours ago | parent | next [-] | | > I think Aider is closest to what you want. > The chat interface Seems very much not, if it's still a chat interface :) Figuring out a chat UX is easy compared to something designed from the start around letting the LLM fill in some parts. I guess I'm searching for something with a different paradigm than just "chat + $Something". | |
| ▲ | zmmmmm 12 hours ago | parent [-] | | the question is, how do you want to provide instructions for what the AI is to do? You might not like calling it "chat" but somehow you need to communicate that, right? With aider you can write a comment for a function and then instruct it to finish the function inline (see other comments). But unless you just want pure autocomplete based on it guessing things, you need to provide guidance to it somehow. | | |
| ▲ | embedding-shape 12 hours ago | parent [-] | | I don't know exactly, but I guess in a more declarative manner than anything else. Maybe we set goals/milestones/concrete objectives, or similar, rather than imperatively steering it; give it space to experiment, yet make it very easy to understand exactly what important tradeoffs it is making. It's all very fluffy and theoretical, of course. | |
| ▲ | xmcqdpt2 5 hours ago | parent | next [-] | | I think the problem is that models are just not that good yet. At least for my usage at work, the CLI tools are the fastest way to get something useful, but if you can't describe basically exactly what you want, you get garbage. | |
| ▲ | zmmmmm 11 hours ago | parent | prev | next [-] | | I find a good compromise on that front is not to use the chat primarily, but to create files like 'ARCHITECTURE.md' and 'REQUIREMENTS.md' and put information in there describing how the application works. Then you add those to the chat as context docs. From the chat interface you are then just referring to those, not describing features willy-nilly. So the nice thing is you are building documentation for the application in a formal sense as part of instructing the LLM. | |
| ▲ | embedding-shape 11 hours ago | parent [-] | | But that is the typical agentic LLM coder style of program I was initially referring to when saying we should maybe explore other alternatives. With a bit of imagination, it's too basic and primitive. |
| |
| ▲ | mhast 10 hours ago | parent | prev [-] | | The typical "best practice" for these tools tends to be to ask it something like "I want you to do feature X. Analyse the code for me and make suggestions how to implement this feature." Then it will go off and work for a while and typically come back after a bit with some suggestions. Then iterate on those if needed and end with: "Ok. Now take these decided-upon ideas and create a plan for how to implement them. And create new tests where appropriate." Then it will go off and come back with a plan for what to do. And then you send it off with: "Ok, start implementing." So sure. You probably can work on this to make it easier to use than with a CLI chat. It would likely be less like an IDE and more like a planning tool you'd use with human colleagues though. |
|
|
| |
| ▲ | troyvit 14 hours ago | parent | prev [-] | | Aider can be a chat interface and it's great for that, but you can also use it from your editor by telling it to watch your files. [1] So you'd write a function name and then tell it to flesh it out:

    function factorial(n) // Implement this. AI!

Becomes:

    function factorial(n) {
      if (n === 0 || n === 1) {
        return 1;
      } else {
        return n * factorial(n - 1);
      }
    }

Last I looked, Aider's maintainer has had to focus on other things recently, but aider-ce is a fantastic fork. I'm really curious to try Mistral's vibe, but even though I'm a big fanboi I don't want to be tied to just one model. Aider lets you tier your models such that your big, expensive model can do all the thinking and then stuff like code reviews can run through a smaller model. It's a pretty capable tool. Edit: Fix formatting [1] https://aider.chat/docs/usage/watch.html | |
| ▲ | zmmmmm 13 hours ago | parent [-] | | > I don't want to be tied to just one model. Very much this for me - I really don't get why, given that new models are popping out every month from different providers, people are so happy to sink themselves into provider ecosystems when there are open source alternatives that work with any model. The main problem with Aider is it isn't agentic enough for a lot of people, but to me that's a benefit. |
|
|
|
|
|
| ▲ | andai 18 hours ago | parent | prev | next [-] |
| I created a very unprofessional tool, which apparently does what you want! While True: 0. Context injected automatically. (My repos are small.) 1. I describe a change. 2. LLM proposes a code edit. (Can edit multiple files simultaneously. Only one LLM call required :) 3. I accept/reject the edit. |
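(For illustration only: a minimal sketch of a loop like the one described above, assuming an OpenAI-compatible API via the openai Python package; the model name, prompts, and diff-application step are placeholders, not andai's actual tool.)

    # Accept/reject edit loop: inject repo context, describe a change,
    # get a proposed diff in one LLM call, and apply it only on approval.
    import pathlib
    import subprocess
    from openai import OpenAI

    client = OpenAI()  # assumes endpoint/key configured via environment variables

    def repo_context():
        # 0. Inject the whole (small) repo as context.
        files = [p for p in pathlib.Path(".").rglob("*.py") if ".git" not in p.parts]
        return "\n\n".join(f"# {p}\n{p.read_text()}" for p in files)

    while True:
        request = input("Describe a change (empty to quit): ").strip()
        if not request:
            break
        # 1-2. One call proposes an edit (possibly across multiple files) as a unified diff.
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": "Reply with a unified diff only, no prose."},
                {"role": "user", "content": repo_context() + "\n\nChange: " + request},
            ],
        ).choices[0].message.content
        print(reply)
        # 3. Accept/reject: apply the proposed diff only on explicit approval.
        if input("Apply this edit? [y/N] ").lower().startswith("y"):
            subprocess.run(["git", "apply", "-"], input=reply, text=True)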
|
| ▲ | chrsw 17 hours ago | parent | prev | next [-] |
| > run locally for agentic coding. Nowadays I mostly use GPT-OSS-120b for this What kind of hardware do you have to be able to run a performant GPT-OSS-120b locally? |
| |
| ▲ | embedding-shape 16 hours ago | parent | next [-] | | RTX Pro 6000, ends up taking ~66GB when running the MXFP4 native quant with llama-server/llama.cpp and max context, as an example. Guess you could do it with two 5090s with slightly less context, or different software aimed at memory usage efficiency. | | | |
| ▲ | fgonzag 16 hours ago | parent | prev | next [-] | | The model is 64GB (int4 native), add 20GB or so for context. There are many platforms out there that can run it decently: AMD Strix Halo, Mac platforms, two (or three without extra RAM) of the new AMD AI Pro R9700 (32GB of RAM, $1200), multi-GPU consumer setups, etc. | |
| ▲ | FuckButtons 14 hours ago | parent | prev [-] | | Mbp 128gb. |
|
|
| ▲ | true2octave 12 hours ago | parent | prev [-] |
| High-quality code is a thing of the past. What matters is high-quality specifications, including test cases. |
| |
| ▲ | embedding-shape 12 hours ago | parent | next [-] | | > High-quality code is a thing of the past Says the person who will find themselves unable to change the software in even the slightest way without having to do large refactors across everything at the same time. High-quality code matters more than ever, would be my argument. The second you let the LLM sneak in some quick hack/patch instead of correctly solving the problem is the second you invite it to keep doing that, always. | | |
| ▲ | bigiain 10 hours ago | parent [-] | | I dunno... I have a feeling this will only supercharge the long-established industry practice of new devs or engineering leadership getting recruited, immediately criticising the entire existing tech stack, and pushing for (and often getting) a ground-up rewrite in the language/framework du jour. This is hilariously common in web work, particularly front-end web work. I suspect there are industry sectors that're well protected from this; I doubt people writing firmware for fuel injection and engine management systems suffer too much from it. The Javascript/Nodejs/NPM scourge _probably_ hasn't hit the PowerPC or 68K embedded device programming workflow. Yet... |
| |
| ▲ | bigiain 10 hours ago | parent | prev [-] | | "high quality specifications" have _always_ been a thing that matters. In my mind, it's somewhat orthogonal to code quality. Waterfall has always been about "high quality specifications" written by people who never see any code, much less write it. Agile makes specs and code quality somewhat related, but in at least some ways probably drives lower quality code in the pursuit of meeting sprint deadlines and producing testable artefacts at the expense of thoroughness/correctness/quality. |
|