qarl 2 hours ago

The real advantage of the Chinese models is that they do not phone home at all. They run locally unlike their US competitors.

So odd that your erroneous criticism is at the top of HN.

EDIT: I'd love to hear my downvoters' objections. Is it possible that the mechanism that is promoting erroneous information is also demoting its correction?

kube-system 2 hours ago | parent [-]

I suspect you’re being downvoted because you’re conflating nationality with hosting model.

There are hosted and self-hosted Chinese models. There are hosted and self-hosted US models.

DeepSeek’s hosted offering processes your data in mainland China and trains on it. It’s in their privacy policy.

qarl 2 hours ago | parent [-]

Well - yes - we're on the internet. You always have a choice to run your software in foreign countries.

But it's still erroneous to claim that it isn't a choice.

kube-system 2 hours ago | parent [-]

The most popular frontier models are not open weight.

qarl 2 hours ago | parent [-]

The model we're discussing (DeepSeek) is open weight.

kube-system an hour ago | parent [-]

Perhaps your prior comment would’ve been better received if it said that specifically instead of “Chinese models”.

But also, the latest DeepSeek is 1.6T parameters. “Choosing” to run this locally is a choice that comes with a seven digit price tag, and is a sunk cost that will probably not run any other frontier model anytime soon.

Most organizations are not looking to spend millions of dollars trying to find a workaround to specifically run DeepSeek. Most enterprise consumption in this space is still very experimental, and a pay-as-you-go model is much more palatable. Most are simply looking for three checkboxes: is it close to frontier performance, is it compliant with my organization’s requirements, and is it a good price? DeepSeek can only do two of the three at the same time.

zozbot234 an hour ago | parent | next [-]

> But also, the latest DeepSeek is 1.6T parameters. “Choosing” to run this locally is a choice that comes with a seven digit price tag

Unless you're specifically thinking about running the model at stock precision in a datacenter environment and generating ~100 tok/s or more on a 24/7 basis (the equivalent of a >$1000/mo spend even on the cheapest third-party APIs), that's very likely off by multiple orders of magnitude. Even then, experimentation can be done with cheap neoclouds on a pay-as-you-go basis.
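
As a rough sanity check on the throughput figure above, here is a minimal sketch of the arithmetic. It assumes a 30-day month and perfectly sustained generation. The function names are hypothetical, and the per-million-token rate it prints is only the break-even price implied by a $1000/mo spend, not a quote from any provider.

```python
# Back-of-the-envelope check: how many tokens does ~100 tok/s, 24/7, add up to?
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def tokens_per_month(tokens_per_second: float) -> float:
    """Tokens generated in one month of uninterrupted output."""
    return tokens_per_second * SECONDS_PER_DAY * DAYS_PER_MONTH


def implied_price_per_million(monthly_spend_usd: float, tokens_per_second: float) -> float:
    """USD per million tokens at which an API bill would equal the given monthly spend."""
    millions_of_tokens = tokens_per_month(tokens_per_second) / 1_000_000
    return monthly_spend_usd / millions_of_tokens


if __name__ == "__main__":
    monthly = tokens_per_month(100)
    rate = implied_price_per_million(1000, 100)
    print(f"{monthly:,.0f} tokens per month")
    print(f"${rate:.2f} per million tokens to reach $1000/mo")
```

Any API priced above that break-even rate would cost more than $1000/mo at this volume; lighter or burstier usage scales the bill down proportionally.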

kube-system an hour ago | parent [-]

I’m aware. The context of the discussion here is choosing DeepSeek over a US hosted model from Google, Anthropic or OpenAI.

The equivalent comparison would be running it at full frontier quality.

If you want less than frontier quality, there’s tons of great open weight models other than DeepSeek.

> cheap neoclouds

Again, fails the compliance checkbox.

zozbot234 31 minutes ago | parent | next [-]

> Again, fails the compliance checkbox.

OK, then the not-so-cheap hyperscalers that these enterprises are already relying on. E.g. AWS Bedrock will run these models. It's silly to insist on all three of your checkboxes being ticked anyway - U.S. proprietary models don't give you that because the frontier ones are super expensive and the mini models have only barely acceptable cost.

qarl 33 minutes ago | parent | prev | next [-]

Azure serves DeepSeek V4 Pro, about 10X cheaper than GPT-5.5.

qarl an hour ago | parent | prev [-]

[flagged]

qarl an hour ago | parent | prev [-]

My most sincere apologies for shortening "the vast majority of Chinese models" to simply "Chinese models".

I can see now why I was being downvoted - you have explained it eloquently.

(Your cost analysis is flawed and irrelevant. Azure serves V4 Pro.)