Mashimo 5 hours ago

Compared to all the other hosted LLMs I have tested, Mistral seems to be the only one with rather strict CSP headers. When you ask it to create a website that uses some JavaScript library, the result will not preview, even though Le Chat offers a canvas mode.
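For context, a CSP header along these lines (a hypothetical illustration, not Mistral's actual policy) is enough to stop an in-browser preview from pulling a library off a CDN:

```http
Content-Security-Policy: default-src 'self'; script-src 'self'
```

With `script-src 'self'`, any `<script src="https://cdn.example.com/lib.js">` tag the model generates is refused by the browser, so the canvas preview renders the page without its JavaScript.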

Sometimes when a new release comes around from any provider, I just want to test it a bit on the web, without paying and without setting up an agent harness.

Why are they like this ;_;

Edit: Christ on a bike it's bad at drawing SVGs https://chat.mistral.ai/chat/23214adb-5530-4af9-bb47-90f5219...

SyneRyder 4 hours ago | parent | next [-]

> Edit: Christ on a bike it's bad at drawing SVGs

On the bike would be an improvement. Geez.

I know SVGs may not be the best benchmark, but that matches my experience of trying to run a (previous) Mistral model in Mistral Vibe, asking it to help me configure an MCP server in Vibe. It confidently explained that MCP is the Minecraft Protocol and then began searching my computer for Minecraft binaries.

2ndorderthought 5 hours ago | parent | prev [-]

I have never wanted, needed, or hoped to draw SVGs with an LLM. All of the models suck at it; some are just more fun about it, or something.

andai 13 minutes ago | parent | next [-]

Claude volunteered this the other day:

https://iili.io/BsfyNXR.jpg

(I think the hair was unintentional, but it is impossible to be sure.)

Mashimo 4 hours ago | parent | prev [-]

I can't speak for what you consider sucking, but there is a significant difference between Mistral and Kimi or Gemini. I find the others to be usable for my needs.

2ndorderthought 4 hours ago | parent [-]

I agree there is a difference, but does it translate to anything? Drawing SVGs isn't the same operation as writing code, and as a skill it's kind of useless. I wouldn't waste my power bill making sure a model I was releasing was good at it.

Mashimo 3 hours ago | parent [-]

> It's not the same operations used to write code

Is it not? It's HTML and JavaScript. And Mistral doesn't even attempt the details that other models draw.
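To make the point concrete: an SVG is just XML that the browser renders inline, so a model that writes passable HTML already has all the syntax it needs; drawing badly is a spatial-reasoning failure, not a markup one. A trivial hand-written face (my own sketch, not any model's output):

```html
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" viewBox="0 0 100 100">
  <!-- head, two eyes, and a smile drawn with a quadratic curve -->
  <circle cx="50" cy="50" r="40" fill="gold" stroke="black"/>
  <circle cx="38" cy="40" r="4"/>
  <circle cx="62" cy="40" r="4"/>
  <path d="M 35 62 Q 50 75 65 62" fill="none" stroke="black" stroke-width="3"/>
</svg>
```

Getting the coordinates of those four elements to line up into a face is exactly the part the weaker models flub.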

When I try other HTML/JS prompts it also lags behind Chinese models from over half a year ago. I mean worse than GLM 4.7.

2ndorderthought 3 hours ago | parent [-]

I've only tried it via the web so far and it's working great for my usual test prompts. Looks like it was trained on some pretty recent data.