godelski 4 days ago | [-]
Thanks. Mind providing screenshots? I believe you, I just think it helps. Your comments align with some of my other responses.

I'm not trying to make hard claims here, and I'm willing to believe the result is not nefarious, but it's still worth investigating. In the weakest form, it's worth being aware of how laws in other countries impact ours, right? But I don't think we should talk about explanations until we can do some verification, and at this point I'm not entirely sure.

We still have the security question open, and I'm asking for help because I'm not a security person. Shouldn't we start here?
natrys 4 days ago | parent [-]
If you mean the bit about refusals from other models, then sure, here is another run with the same result: https://i.postimg.cc/6tT3m5mL/screen.png

Note that I am using the API directly, to avoid triggering the separate guardrail models that typically sit in front of website front-ends.

As an aside, the website you used in your original comment:

> [2] Used this link https://www.deepseekv3.net/en/chat

is not the official DeepSeek website. It is probably one of the many shady third-party sites riding on the DeepSeek name for SEO; who knows what they are actually running. In this case it doesn't matter, because I already reproduced your prompt through a US-based inference provider directly hosting the DeepSeek weights, but it's still worth noting for methodology.

(Also, to a sceptic, screenshots shouldn't be enough, since they are easily doctored nowadays. But I don't believe these refusals should be surprising in the least to anyone with passing familiarity with these LLMs.)

---

Obviously sabotage is a whole other can of worms compared to mere refusal, and the article glossed over it without showing their prompts. So, without much to go on, it's hard for me to take it seriously. We know garbage in context can degrade performance; even simple typos can[1]. Besides, LLMs at their present state of capability are barely intelligent enough to do any serious task soundly, and it strains credulity that they could actually sabotage one with any reasonable degree of sophistication. That said, I look forward to more serious research on this matter.
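To make the direct-API methodology above concrete: most US inference providers hosting open weights expose an OpenAI-compatible chat-completions endpoint, so you can send the prompt straight to the model with no front-end guardrail layer in between. A minimal sketch, assuming a hypothetical provider URL and model id (`example-provider.com`, `deepseek-chat`) that you would substitute with your provider's real values:

```python
import json

# Hypothetical values -- replace with your provider's actual base URL
# and the model id under which they expose the DeepSeek weights.
API_URL = "https://example-provider.com/v1/chat/completions"
MODEL = "deepseek-chat"

def build_request(prompt: str) -> dict:
    """Build a raw chat-completions payload. Sending this directly to the
    provider's endpoint bypasses any guardrail model a website front-end
    might run on top of the base model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Low temperature makes refusal/non-refusal easier to reproduce.
        "temperature": 0.0,
    }

payload = json.dumps(build_request("your test prompt here"))
print(payload)
# To actually send it, POST `payload` to API_URL with an
# "Authorization: Bearer <key>" header (e.g. via requests.post).
```

The point of building and inspecting the raw payload is reproducibility: anyone with an API key can replay the exact same request, which a screenshot of a third-party chat site cannot offer.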