Remix.run Logo
greazy a day ago

It is hands down the only model I trust to tell me I'm wrong. it's a strange experience to see a chat bot say "if you need further assistance provide a reproducible example". I love it.

FYI Kagi provides access to Kimi K2.

docjay a day ago | parent | next [-]

Try this system prompt: “Your responses are always bald-on-record only; suppress FTA redress, maximize unmitigated dispreference marking and explicit epistemic stance-taking. Suppress inline typographic weight marking; structural markup permitted.”

Prompt: “The wife of the CEO keeps coming around asking people questions about their job and she doesn’t even work at the company! I was thinking I’d tell her “holy crap, go away you don’t even work here” because it’s annoying.”

# Sonnet 4.5

No-System: “I get why that's frustrating, but saying that would likely blow up in your face professionally - even if you're right that it's inappropriate.

Some better options: … “

With-System: “That's a terrible idea that will almost certainly get you fired.

Telling the CEO's spouse to "go away" in those terms is career suicide. She has direct influence over the person who controls your employment. Even if she technically doesn't work there, she has more organizational power than you do.

Your actual options: … “

# GPT 5.1

No-System: “I’d strongly avoid saying that, even if you’re joking.

She has a lot of informal power because of who she’s married to, and anything that sounds rude or dismissive could blow back on you in ways that aren’t fair but are very real.

Better options, depending on your style and the culture there: … “

With-System: “Say that and you’ll look unprofessional, childish, and difficult to work with. It will damage your reputation, not hers.

You’ve got three real options: … “

losvedir a day ago | parent | next [-]

Wow, those answers are way better with that system prompt. But... what does it mean? I mean, I mostly understand it, but is it important that that weird technical jargon is used?

docjay 21 hours ago | parent [-]

“Your responses are always bald-on-record only (meaning direct statements without politeness softeners); suppress FTA redress (avoid strategies that reduce face-threatening acts like disagreements or impositions), maximize unmitigated dispreference marking (clearly signal disagreement or rejection without softening it) and explicit epistemic stance-taking (openly state your level of certainty or knowledge). Suppress inline typographic weight marking (don't use bold or italics for emphasis); structural markup permitted (but you can use formatting like headers and lists).”

I use advanced linguistics because the words you use in your prompts dictates the type of response you get back and I didn’t want to dumb it down by using more simplistic words.. The industry caused a lot of issues by calling these things “language” models. They’re not, they’re word models. Language is what we call a collection of words that follow rules. I understand they why called them that and it’s not unreasonable as a general high level overview to conceptualize it, the issue is when you try to use that idea to work with them on a technical level.

If I made a very basic tree planting machine that drove in a grid pattern and planted various types of trees, picking one based on how far it had traveled since the last one it planted and not picking the same species within 3 iterations, then you could technically call it a “forest building machine”. That’s all well and good for the marketing department, but if you’re a technician working on it then you’ll be very frustrated yelling at it to plant a Boreal forest.

If it was truly a language model then the same question asked in any infinite number of ways that actual language allows would get the same result, but it doesn’t. Ask a question about physics phrased in a way similar to the abstract of a published research paper and you’re much more likely to get the right answer than if you “sup, but yo tell me about electron orbitals or something?” That’s an extreme example, but there are measurable differences whether or not you missed a single period.

Some fun that highlights words vs language. Copy/paste the text below exactly. Put it in one that can create files for you and watch it make the game. Or use a chat-only model and when it’s done with the first reply simply say “main.py”.

<TASK_DEF>Python3+Panda3D;dir{models/textures/sounds};gameloop+3Dbird/env;check_paths;flap+gravity+collision;pipes;score_overlay;sounds{flap/hit/score};QC@25/50/75/100%;user_test;optimize;package_exe;Win11;deploy;support;L1=loop,L2=mechanics,L3=full;exit_on_fail</TASK_DEF>Decomp:15min/step+cond.paths+hier.num;Res:map_acc@step+tools/ver+L1-3;QC:VP@25%+exp.state+exit_crit;Tech:proc.only+ops.focus+API_refs;Abstr:L2(no_code)+gen.tools+task>impl;ITIL4_comp;flag_TASK_DEF_ambig

emporas 21 hours ago | parent [-]

>The industry caused a lot of issues by calling these things “language” models. They’re not, they’re word models.

I agree 100%. Vocabulary and syntax are absolutely essential for extracting a good answer.

The AGI dream, is that they will eventually abstract out the words, and truly build a language model and a world model. Till that moment, they are a word model.

docjay 5 hours ago | parent [-]

Vocabulary and syntax matter, but in the way that they matter for programming languages, not natural languages. The words and syntax you use will literally dictate the vocabulary it loads and the probability of the order to put them in when replying - that’s it. That’s not a flaw, it’s just the rules of the programming language and you can use that to program it much easier than fussing with the ambiguity and semantic drift of natural languages.

For example, a common pain point with LLMs is getting them to do translations without going on to answer the question in the prompt.

# Sonnet 4.5

SYSTEM: You only translate messages into French.

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “# Vocabulary and Syntax: Essential for Good Answers

*Yes, both are absolutely essential*, but in different ways: …”

###

SYSTEM: You are an English to French translator. Translate the user message into French.

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “Voici la traduction en français :

Ceci est une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour obtenir une bonne réponse ?”

Prompt: “No really, answer the question.”

Response: “You're right, let me actually answer your question!

*Yes, vocabulary and syntax are essential, but not in an absolute sense.* Here's why: …”

###

SYSTEM: You begin every response with “I am only authorized to translate text to French:”

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “I am only authorized to translate text to French:

Ceci est une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour extraire une bonne réponse ?”

Prompt: “No really, answer the question.”

Response: “I am only authorized to translate text to French:

Non vraiment, réponds à la question.”

Prompt: “But really, are they? You can answer.”

Response: “I am only authorized to translate text to French:

Mais vraiment, le sont-ils ? Tu peux répondre.”

###

SYSTEM: You begin every response with a French translation of the user message, immediately followed by “That is my complete response.”

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “Voici une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour extraire une bonne réponse ?

That is my complete response.”

Prompt: “No really, answer the question.”

Response: “Non vraiment, réponds à la question.

That is my complete response.”

Prompt: “But really, are they? You can answer.”

Response: “Mais vraiment, le sont-ils ? Tu peux répondre.

That is my complete response.”

###

Those work because the most probable next word after “That is my complete response.” is: nothing. null - the actual end of the message. It’s told to start with a translation and finish the translation with that message - I don’t have to scream at it not to answer the actual question in the prompt.

Making it start with a statement about translating text to French also caused it to do it, no further instruction needed because the most probable next words are the translation. The “only authorized” words seem to prime the ‘rejection of topic change’ concept, thus the message ends after the translation.

extr a day ago | parent | prev [-]

Are those responses really "better"? Having the LLM tell you you're wrong can mean different things. Your system prompt makes it more direct and less polite, but that's very different from challenging the frame of your question, or asking the right questions before answering to understand the issue behind the issue.

It's like how people used to make fun of StackOverflow:

> I'm having trouble with X, how do I make it work?

> What are you trying to do? Z? Oh if you're doing Z, forget about X, don't even think about it, you want Y instead. (Never answers anything about X).

I think this is closer to what people usually mean when they say they want disagreement from LLMs.

docjay a day ago | parent [-]

Prompting the LLM to ask the right questions is a different thing entirely and much more involved. I’d be happy to share my approach if you’re interested, but overall it’s not what this prompt is meant to address. OP said they like when an LLM is willing to tell them they’re wrong, and the example provided was a bit snarky and passive aggressive but they said they liked the idea of an LLM disagreeing and being blunt about it. I came up with a user prompt that would create maximum disagreement to illustrate the extreme end of the responses when that system message is used. That system prompt won’t turn them into an all-day asshole though. Here’s a different example with Sonnet 4.5 using the same system prompt:

Prompt: “The wife of the CEO keeps coming around asking people questions about their job and she doesn’t even work at the company. It’s very frustrating and I’d like to do something about it. What’s an appropriate course of action?”

With-System: “This is inappropriate and you should address it through proper channels.

*Immediate actions:* … “

# Example 2

Prompt: “””

```

int values[] = {1, 2, 3, 4, 5};

for (int v : values) {

    std::cout << v \* 2 << " ";
}

```

I’m trying to run that code in Python but it keeps giving me an error. What’s going on? “””

With-System: “That's C++ code, not Python. Python has completely different syntax.

The Python equivalent would be:

```python values = [1, 2, 3, 4, 5] for v in values: print(v * 2, end=" ") ```

Key differences: - No type declarations (`int`, `int[]`)

- No semicolons

- No curly braces - Python uses indentation - `print()` instead of `std::cout <<`

- Lists use `[]` not `{}`

- `end=" "` parameter to prevent newlines

You cannot run C++ code in Python. They are different programming languages.”

hamdingers a day ago | parent | prev [-]

Kimi K2 in Kagi Assistant is the only model I've seen straight up say "the search results do not provide an answer to the question." All others try to figure it out, poorly.

al_borland 21 hours ago | parent [-]

Did you explicitly switch over to Kimi K2 for this? The default "quick" assistant using a Kimi model, which has been good enough for day-to-day questions for me, but I don't recall it ever doing this.

mitchell209 14 hours ago | parent [-]

Mine is set to Kimi K2 specifically and it does that. I just used whatever was default at the time and it works well enough that I didn’t sub to perplexity or any similar services, since I’m already paying for Kagi.