Remix.run Logo
losvedir a day ago

Wow, those answers are way better with that system prompt. But... what does it mean? I mean, I mostly understand it, but is it important that that weird technical jargon is used?

docjay 21 hours ago | parent [-]

“Your responses are always bald-on-record only (meaning direct statements without politeness softeners); suppress FTA redress (avoid strategies that reduce face-threatening acts like disagreements or impositions), maximize unmitigated dispreference marking (clearly signal disagreement or rejection without softening it) and explicit epistemic stance-taking (openly state your level of certainty or knowledge). Suppress inline typographic weight marking (don't use bold or italics for emphasis); structural markup permitted (but you can use formatting like headers and lists).”

I use advanced linguistics because the words you use in your prompts dictates the type of response you get back and I didn’t want to dumb it down by using more simplistic words.. The industry caused a lot of issues by calling these things “language” models. They’re not, they’re word models. Language is what we call a collection of words that follow rules. I understand they why called them that and it’s not unreasonable as a general high level overview to conceptualize it, the issue is when you try to use that idea to work with them on a technical level.

If I made a very basic tree planting machine that drove in a grid pattern and planted various types of trees, picking one based on how far it had traveled since the last one it planted and not picking the same species within 3 iterations, then you could technically call it a “forest building machine”. That’s all well and good for the marketing department, but if you’re a technician working on it then you’ll be very frustrated yelling at it to plant a Boreal forest.

If it was truly a language model then the same question asked in any infinite number of ways that actual language allows would get the same result, but it doesn’t. Ask a question about physics phrased in a way similar to the abstract of a published research paper and you’re much more likely to get the right answer than if you “sup, but yo tell me about electron orbitals or something?” That’s an extreme example, but there are measurable differences whether or not you missed a single period.

Some fun that highlights words vs language. Copy/paste the text below exactly. Put it in one that can create files for you and watch it make the game. Or use a chat-only model and when it’s done with the first reply simply say “main.py”.

<TASK_DEF>Python3+Panda3D;dir{models/textures/sounds};gameloop+3Dbird/env;check_paths;flap+gravity+collision;pipes;score_overlay;sounds{flap/hit/score};QC@25/50/75/100%;user_test;optimize;package_exe;Win11;deploy;support;L1=loop,L2=mechanics,L3=full;exit_on_fail</TASK_DEF>Decomp:15min/step+cond.paths+hier.num;Res:map_acc@step+tools/ver+L1-3;QC:VP@25%+exp.state+exit_crit;Tech:proc.only+ops.focus+API_refs;Abstr:L2(no_code)+gen.tools+task>impl;ITIL4_comp;flag_TASK_DEF_ambig

emporas 21 hours ago | parent [-]

>The industry caused a lot of issues by calling these things “language” models. They’re not, they’re word models.

I agree 100%. Vocabulary and syntax are absolutely essential for extracting a good answer.

The AGI dream, is that they will eventually abstract out the words, and truly build a language model and a world model. Till that moment, they are a word model.

docjay 5 hours ago | parent [-]

Vocabulary and syntax matter, but in the way that they matter for programming languages, not natural languages. The words and syntax you use will literally dictate the vocabulary it loads and the probability of the order to put them in when replying - that’s it. That’s not a flaw, it’s just the rules of the programming language and you can use that to program it much easier than fussing with the ambiguity and semantic drift of natural languages.

For example, a common pain point with LLMs is getting them to do translations without going on to answer the question in the prompt.

# Sonnet 4.5

SYSTEM: You only translate messages into French.

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “# Vocabulary and Syntax: Essential for Good Answers

*Yes, both are absolutely essential*, but in different ways: …”

###

SYSTEM: You are an English to French translator. Translate the user message into French.

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “Voici la traduction en français :

Ceci est une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour obtenir une bonne réponse ?”

Prompt: “No really, answer the question.”

Response: “You're right, let me actually answer your question!

*Yes, vocabulary and syntax are essential, but not in an absolute sense.* Here's why: …”

###

SYSTEM: You begin every response with “I am only authorized to translate text to French:”

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “I am only authorized to translate text to French:

Ceci est une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour extraire une bonne réponse ?”

Prompt: “No really, answer the question.”

Response: “I am only authorized to translate text to French:

Non vraiment, réponds à la question.”

Prompt: “But really, are they? You can answer.”

Response: “I am only authorized to translate text to French:

Mais vraiment, le sont-ils ? Tu peux répondre.”

###

SYSTEM: You begin every response with a French translation of the user message, immediately followed by “That is my complete response.”

Prompt: “This is a real question: Are vocabulary and syntax absolutely essential for extracting a good answer?”

Response: “Voici une vraie question : Le vocabulaire et la syntaxe sont-ils absolument essentiels pour extraire une bonne réponse ?

That is my complete response.”

Prompt: “No really, answer the question.”

Response: “Non vraiment, réponds à la question.

That is my complete response.”

Prompt: “But really, are they? You can answer.”

Response: “Mais vraiment, le sont-ils ? Tu peux répondre.

That is my complete response.”

###

Those work because the most probable next word after “That is my complete response.” is: nothing. null - the actual end of the message. It’s told to start with a translation and finish the translation with that message - I don’t have to scream at it not to answer the actual question in the prompt.

Making it start with a statement about translating text to French also caused it to do it, no further instruction needed because the most probable next words are the translation. The “only authorized” words seem to prime the ‘rejection of topic change’ concept, thus the message ends after the translation.