locusofself 4 hours ago

Working at Microsoft, I've just now hooked up to Claude Code (my department was not permitted to use it previously), through something called "Agent Maestro", a VS Code extension which I guess pipes Claude Code API requests to our internally hosted Claude models, including Opus 4.6.

I do wonder if there is going to be much of a difference between using Claude Code vs. Copilot CLI when using the same models.

nfg 4 hours ago | parent | next [-]

> I do wonder if there is going to be much of a difference between using Claude Code vs. Copilot CLI when using the same models.

I’m also at MS, not (yet?) using Claude Code at work and pondering precisely the same question.

cactusplant7374 4 hours ago | parent | prev | next [-]

Is this an indictment of OpenAI's models -- that Microsoft has access to through their investment?

locusofself 3 hours ago | parent [-]

We've had both GPT and Claude models available to us in GitHub Copilot for some time. At first, it was only GPT models.

pletnes 4 hours ago | parent | prev | next [-]

I honestly don’t think the models are as important as people tend to believe. More important is how the models are given tools - find, grep, git, test runners, …

Galanwe 3 hours ago | parent [-]

> I honestly don’t think the models are as important as people tend to believe.

I tend to disagree. While I don't see meaningful differences in _reasoning power_ between frontier models, I do see differences in the way they interact with my prompts.

I use exclusively Anthropic models because my interactions with GPT are annoying:

- Sonnet/Opus behave like a mix of a diligent intern and a peer. They do the work, don't talk too much, give answers, etc.

- GPT is overly chatty, it borderline calls me "bro", tends to brush off issues I raise with "it should be good enough for general use", etc.

- I find that GPT hardly ever steps back when diagnosing issues. It picks a possible cause, and enters a rabbit hole of increasingly hacky / spurious solutions. Opus/Sonnet will often step back when the complexity increases too much, and dig for an alternative.

- I find Opus/Sonnet to be "lazy" recently. Instead of systematically doing an accurate search before answering, it tries to "guess", and I have to spot it and directly tell it to "search for the precise specification and do not guess". Often it will tell me "you should do this and that", and I have to tell it "no, you do it". I wonder if this was done to reduce the number of web searches or the compute it uses unless the user explicitly asks.

0xbadcafebee 3 hours ago | parent | prev [-]

Compare their system prompts and the agent harness logic. It's 99% of what makes the agent useful, and it can be quite different.