Havoc 7 hours ago

There is also risk from a US regulatory side, as the recent drama around Anthropic showed.

Don’t think it’s inconceivable that the clowns in power decide to limit api access out of the blue one day because someone whispered a conspiracy theory in someone’s ear. API blockade…

See also the constant flip-flopping on which cards NVIDIA can export - no consistency in stance, no coherent policy.

tbrownaw 3 hours ago | parent | next [-]

You are conflating three very different things.

The thing with Anthropic and the military was about whether vendors can tell the military what operations it's permitted to do. It has no bearing on the commercial sector, and isn't actually about AI.

The thing with NVIDIA cards is a continuation of how we've restricted tech exports for quite a while. You can find old news articles about game consoles being export-restricted over nuclear proliferation concerns. This AI-related one was about whether or not custom AI models are relevant to national security, and whether restricting graphics card sales can have a meaningful impact on them.

Any issue with selling chat tokens internationally would be more akin to the recent tariff shenanigans.

trvz 5 hours ago | parent | prev | next [-]

Changing your LLM inference provider is the easiest switch in technology I can think of. It's quicker than taking off the case of your phone and putting on a new one.

Enough hardware and good models exist now that if you do get blocked from one provider, viable alternatives exist.

Havoc 4 hours ago | parent | next [-]

> Changing your LLM inference provider is the easiest switch in technology I can think of.

That's true right up until you're working with confidential info in a corporate context. Then it's a multi-month, cross-discipline, cross-jurisdiction project, not an edit in a config file.

mring33621 4 hours ago | parent [-]

L O C A L M O D E L S

All data stays on computers that you control.

Same API. Localhost.
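To make the "same API" point concrete: llama.cpp's llama-server exposes an OpenAI-compatible endpoint, so switching to a local model is mostly a base-URL change. A minimal stdlib sketch (the port and model name here are assumptions for illustration, not a specific setup):

```python
import json
import urllib.request

# With a local OpenAI-compatible server (e.g. llama-server), the only
# thing that changes versus a hosted provider is this base URL.
BASE_URL = "http://localhost:8080/v1"

def build_chat_request(prompt: str, model: str = "local-model") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at localhost."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = build_chat_request("Summarize this contract clause.")
    # urllib.request.urlopen(req) would send it; the data never leaves the machine.
    print(req.full_url)
```

Pointing an existing OpenAI-style client at this URL works the same way; the request shape is unchanged.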

mring33621 4 hours ago | parent [-]

Try Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q4_k_m.gguf. This 7.5GB model runs well in llama.cpp on my 2021 Macbook Pro and is good at both coding and business document analysis tasks.

NekkoDroid 2 hours ago | parent [-]

> Try Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q4_k_m.gguf.

This sounds like such a shitpost I initially thought you were joking... but this seems to be a real model???

cpburns2009 22 minutes ago | parent | next [-]

There's a method to the madness:

- Mistral-Nemo: the actual model developed by Mistral and Nvidia.

- 2407: likely the release date of the base model, July of 2024.

- 12B: the model has 12 billion parameters.

- Thinking: the model operates in thinking mode (generates a reasoning plan and ingests it before producing the actual output).

- Claude-Gemini-GPT5.2: I think this means the model was finetuned with session data from Claude, Gemini, and GPT5.2 to replicate their behavior.

- Uncensored-HERETIC: the model was uncensored using the automated Heretic method.

- Q4_k_m: the model is quantized (lossy compression) to ~5 bpw from the original 16 bpw.
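The ~5 bpw figure lines up with the 7.5 GB file size mentioned above. A back-of-envelope check (treating 5.0 bits/weight as the approximate effective rate for Q4_K_M, which mixes quantization levels across tensors):

```python
# Rough size estimate for a 12B-parameter model at ~5 bits per weight.
params = 12e9           # 12 billion parameters
bits_per_weight = 5.0   # approximate effective rate for Q4_K_M (assumption)

size_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
print(f"{size_gb:.1f} GB")  # -> 7.5 GB
```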

NekkoDroid 13 minutes ago | parent [-]

Yeah, I know what the parts individually mean. I just meant that as a whole it seemed so absurd.

mring33621 2 hours ago | parent | prev [-]

It is! I like to try the variations from possibly 'interesting' people.

Some of them are good. Others randomly break into gibberish and Chinese poetry(?).
