Nition 20 hours ago

You guys have gotten stuck arguing without being clear about what you're actually arguing about. Let me try to clear this up...

The four potential scenarios:

- Mild prompt only ("no orange cats")

- Strong prompt only ("no orange cats or people die") [I think habinero is actually arguing against this one]

- Physical block + mild prompt [what I suggested earlier; see the sketch after this list]

- Physical block + strong prompt [I think this is what you're actually arguing for]
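
To make the distinction concrete, here's a rough sketch of what "physical block + mild prompt" could look like. Everything here is made up for illustration (call_llm is a stand-in for whatever model client you'd actually use): the prompt asks nicely, and a deterministic check in code enforces the rule regardless of what the model does.

    MILD_PROMPT = "Describe a cat. Do not include orange cats."

    def call_llm(prompt: str) -> str:
        # Stand-in for whatever model client you actually use.
        raise NotImplementedError("wire up your model client here")

    def generate(max_retries: int = 3) -> str:
        for _ in range(max_retries):
            text = call_llm(MILD_PROMPT)
            # The "physical block": a deterministic check in code,
            # applied whether or not the prompt was obeyed.
            if "orange cat" not in text.lower():
                return text
        raise RuntimeError("Model kept violating the rule; blocking the output.")

The "strong prompt" variants would only change the MILD_PROMPT string; the block stays the same either way.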

Here are my personal thoughts on the matter, for the record:

I'm definitely pro combining a physical block with a strong prompt if there is actually a risk of people dying. I'm less sure about the scenario where there's no actual risk, but pretending that people will die improves the results. I think it's mostly that ethically I just don't like lying, and the way it kind of scares the LLM unnecessarily. Maybe that's really silly, and it's just a tool in the end, so why not do whatever needs doing to get the best results from it? But tools that act so much like thinking, feeling beings are weird tools.

habinero 19 hours ago | parent

It's just a pile of statistics. It isn't acting like a feeling thing, and telling it "do this or people will die" doesn't actually do anything.

It feels like it does, but only because humans are really good at fooling ourselves into seeing patterns where there are none.

Saying this kind of prompt changes anything is like saying the horse Clever Hans really could do math. It doesn't, he couldn't.

It's incredibly silly to think you can make the non-deterministic system less non-deterministic by chanting the right incantation at it.

It's like y'all want to be fooled by the statistical model. Has nobody ever heard of pareidolia? Why would you not start with the null hypothesis? I don't get it lol.

RamRodification 19 hours ago | parent

> "do this or people will die" doesn't actually do anything

The very first message you replied to in this thread described a situation where "the prompt with the threat gives me 10% more usable results". If you believe that premise is impossible, I don't understand why you didn't just say so instead of going on about it not being a reliable method.

If you really think something is impossible, you don't base your argument on it being "unreliable".

> I don't get it lol.

I think you are correct here.

Nition 18 hours ago | parent | next

I took that comment as more like "it doesn't have any effect beyond the output of the model", i.e. unlike saying something like that to a human, it doesn't actually make the model feel anything, the model won't spread the lie to its friends, and so on.

habinero 21 minutes ago | parent | prev

Nah, you're managing to miss the point going both ways lol. It's both.

Let's assume for the sake of argument that your statement is true, that you do, in fact, somehow get 10% more useful results.

The two points are:

1. That doesn't make the system better in any way lol. You've built a tool that acts like a slot machine and only works if you get three cherries. Increasing the odds on cherries doesn't change the fact that using a slot machine as a UI is a ridiculous way to work.

2. In the real world, LLMs don't think. They do not use logic. They just churn out text in non-deterministic ways in response to input. They are not reliable and cannot be made so. Anybody who thinks they can be is fooling / Clever Hansing themselves.

The point here is you might feel like the system is 10% more useful, but it feels like that because human brains have some hardware bugs.