mhitza 15 hours ago

Comment retracted. My bad, missed some details.

pants2 15 hours ago | parent | next [-]

Reading such obvious LLM-isms in the announcement makes me cringe a bit too, e.g.

> We optimize for speed users actually feel: responsiveness in the moments users experience — p95 latency under high concurrency, consistent turn-to-turn behavior, and stable throughput when systems get busy.

selcuka 15 hours ago | parent | prev [-]

I think your comment is a bit unfair.

> no reasoning comparison

Benchmarks against reasoning models:

https://www.inceptionlabs.ai/blog/introducing-mercury-2

> no demo

https://chat.inceptionlabs.ai/

> no info on numbers of parameters for the model

This is a closed model. Do other providers publish the number of parameters for their models?

> testimonials that don't actually read like something used in production

Fair point.

volodia 14 hours ago | parent | next [-]

Just to clarify one point: Mercury (the original v1, non-reasoning model) is already used in production in mainstream IDEs like Zed: https://zed.dev/blog/edit-prediction-providers

Mercury v1 focused on autocomplete and next-edit prediction. Mercury 2 extends that into reasoning and agent-style workflows, and we have editor integrations available (docs linked from the blog). I’d encourage folks to try the models!

mhitza 15 hours ago | parent | prev [-]

You're right; I edited my post (twice, actually). I missed the chat the first time around (though it's hard to see it as a reasoning model when the chain of thought is hidden, or not obvious; I guess this is the new normal), and I also missed the reasoning table because the text is pretty small on mobile and I thought it was another speed benchmark.

selcuka 11 hours ago | parent [-]

I tried their chat demo again, and if you set the reasoning effort to "High", you sometimes see the chain of thought before the answer (click the "Thought for n seconds" text to expand it).

That being said, the chain is pretty basic. It's possible they don't disclose the full follow-up prompt list.