They have an interesting regex for detecting negative sentiment in users' prompts, which is then logged (explicit content): https://github.com/chatgptprojects/claude-code/blob/642c7f94...

I guess these words are to be avoided...

▲ joeblau 2 hours ago | parent | next [-]

We used this in 2011 at the startup I worked for. 20 positive and 20 negative words was good enough to sell Twitter "sentiment analysis" to companies like Apple, Bentley, etc...

▲

vdfs 28 minutes ago | parent [-]

Did you also forget to ignore case sensitivity back then?

	▲	24 minutes ago | parent | next [-]
		[deleted]
	▲	adzm 18 minutes ago | parent | prev [-]
		the string is lowercased before the regex is run, fwiw
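To illustrate the point about case, here is a minimal TypeScript sketch of lowercasing the input before testing it, so the pattern itself needs no `i` flag. The word list and the names `NEGATIVE_PATTERN` and `looksNegative` are assumptions for illustration, not the pattern from the linked source.

```typescript
// Hypothetical word list for illustration; not the pattern from the linked source.
const NEGATIVE_PATTERN: RegExp = /\b(wtf|ffs|terrible|so frustrating)\b/;

// Lowercase first, so the pattern only has to list lowercase spellings.
// No `g` flag is used, so `test` keeps no state between calls.
function looksNegative(input: string): boolean {
  return NEGATIVE_PATTERN.test(input.toLowerCase());
}

console.log(looksNegative("WTF is this")); // true
console.log(looksNegative("Looks good to me")); // false
```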

▲ amichal 8 minutes ago | parent | prev | next [-]

If this code is real and complete then there are no callers of those methods other than a logger line

▲ BoppreH 6 hours ago | parent | prev | next [-]

An LLM company using regexes for sentiment analysis? That's like a truck company using horses to transport parts. Weird choice.

▲

lopsotronic 2 hours ago | parent | next [-]

The difference in response time - especially versus a regex running locally - is really difficult to express to someone who hasn't made much use of LLM calls in their natural language projects.

Someone said 10,000x slower, but that's off - in my experience - by about four orders of magnitude. And that's average, it gets much worse.

Now personally I would have maybe made a call through a "traditional" ML widget (scikit, numpy, spaCy, fastText, sentence-transformer, etc) but - for me anyway - that whole entire stack is Python. Transpiling all that to TS might be a maintenance burden I don't particularly feel like taking on. And on client facing code I'm not really sure it's even possible.

▲

noprof6691 18 minutes ago | parent | next [-]

They're sending it to an llm anyway tho? Not sure why they wouldn't just add a sentiment field to the requested response shape.

	▲	FuckButtons 12 minutes ago | parent [-]
		because a regex on the client is free, whereas GPU compute absolutely is not.

▲

cyanydeez 2 hours ago | parent | prev | next [-]

So, think of it as a business man: You don't really care if your customers swear or whatever, but you know that it'll generate bad headlines. So you gotta do something. Just like a door lock isn't designed for a master criminal, you don't need to design your filter for some master swearer; no, you design it good enough that it gives the impression that further tries are futile.

So yeah, you do what's less intensive for the CPU, but also, you do what's enough to prevent the majority of the concerns where a screenshot or log ends up showing blatant "unmoral" behavior.

▲

true_religion 2 hours ago | parent [-]

This door lock doesn’t even work against people speaking French, so I think they could have tried a mite harder.

	▲	bigbuppo 8 minutes ago | parent | next [-]
		There are only Americans on the internet.
	▲	sebastiennight an hour ago | parent | prev | next [-]
		[translated from French] In all honesty, I think I've said "damn it" to ChatGPT more than once before closing the window in a fit of rage
	▲	ben_w an hour ago | parent | prev [-]
		The upside of the US market is (almost) everyone there speaks English. The downside is, that includes all the well-networked pearl-clutchers. Europe (including France) will have the same people, but it's harder to coordinate a network of pearl-clutching between some saying "Il faut protéger nos enfants de cette vulgarité!" ("We must protect our children from this vulgarity!") and others saying "Η τηλεόραση και τα μέσα ενημέρωσης διαστρεβλώνουν τις αξίες μας!" ("Television and the media are distorting our values!") even when they care about the exact same media. For headlines, that's enough. For what's behind the pearl-clutching, for what leads to the headlines pandering to them being worth writing, I agree with everyone else on this thread saying a simple word list is weird and probably pointless. Not just for false-negatives, but also false-positives: the Latin influence on many European languages leads to one very big politically-incorrect-in-the-USA problem for all the EU products talking about anything "black" (which includes what's printed on some brands of dark chocolate, one of which I saw in Hungary even though Hungarian isn't a Latin language but an Ugric language and only takes influences from Latin).

▲

mlmonkey 37 minutes ago | parent | prev [-]

> Someone said 10,000x slower, but that's off - in my experience - by about four orders of magnitude.

You do know that 10,000x _is_ four orders of magnitude, right? :-D

	▲	jonbwhite 28 minutes ago | parent [-]
		OP is saying that in their experience it is more like eight orders of magnitude

▲

nojs an hour ago | parent | prev | next [-]

Oh it’s worse than that. This one ended up getting my account banned: https://github.com/anthropics/claude-code/issues/22284

	▲	lanbin 36 minutes ago | parent | next [-]
		This is a tricky problem, I mean, Pinyin also uses the English alphabet.
	▲	cryptonector an hour ago | parent | prev [-]
		Wow, that's horrible.

▲

stingraycharles 6 hours ago | parent | prev | next [-]

Because they want it to be executed quickly and cheaply without blocking the workflow? Doesn’t seem very weird to me at all.

▲

_fizz_buzz_ 5 hours ago | parent | next [-]

They probably have statistics on it and saw that certain phrases happen over and over so why waste compute on inference.

▲

crem 2 hours ago | parent | next [-]

More likely their LLM Agent just produced that regex and they didn't even notice.

▲

mycall 5 hours ago | parent | prev [-]

The problem with regex is multi-language support and how much the regex will bloat if you want to support even 10 languages.

▲

doublesocket 4 hours ago | parent | next [-]

Supporting 10 different languages in regex is a drop in the ocean. The regex can be generated programmatically and you can compress regexes easily. We used to have a compressed regex that could match any placename or street name in the UK in a few MB of RAM. It was silly quick.

▲

astrocat 2 hours ago | parent | next [-]

woah. This is a regex use I've never heard of. I'd absolutely love to see a writeup on this approach - how it's done and when it's useful.

	▲	benlivengood 2 hours ago | parent [-]
		You can literally | together every street address or other string you want to match in a giant disjunction, and then run a DFA/NFA minimization over that to get it down to a reasonable size. Maybe there are some fast regex simplification algorithms as well, but working directly with the finite automata has decades of research and probably can be more fully optimized.
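A rough TypeScript sketch of the "generate the regex programmatically" idea follows. It only shares common prefixes through a trie, which is a simpler stand-in for the full DFA minimization mentioned above (a real minimizer would also merge common suffixes). The names `buildTrie`, `toPattern`, and `compileWordList`, and the sample street names, are hypothetical.

```typescript
interface TrieNode {
  children: { [ch: string]: TrieNode };
  terminal: boolean; // true if a listed word ends at this node
}

// Escape characters that have special meaning inside a regex.
function escapeRegex(ch: string): string {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Insert every word into a trie so that shared prefixes are stored once.
function buildTrie(words: string[]): TrieNode {
  const root: TrieNode = { children: {}, terminal: false };
  for (let w = 0; w < words.length; w++) {
    const word = words[w];
    let node = root;
    for (let i = 0; i < word.length; i++) {
      const ch = word.charAt(i);
      if (node.children[ch] === undefined) {
        node.children[ch] = { children: {}, terminal: false };
      }
      node = node.children[ch];
    }
    node.terminal = true;
  }
  return root;
}

// Turn the trie back into a pattern: one alternation per branching point.
function toPattern(node: TrieNode): string {
  const keys = Object.keys(node.children);
  const branches: string[] = [];
  for (let i = 0; i < keys.length; i++) {
    branches.push(escapeRegex(keys[i]) + toPattern(node.children[keys[i]]));
  }
  if (branches.length === 0) {
    return "";
  }
  const needsGroup = branches.length > 1 || node.terminal;
  const body = needsGroup ? "(?:" + branches.join("|") + ")" : branches[0];
  // If a word ends here, everything after this point is optional.
  return node.terminal ? body + "?" : body;
}

function compileWordList(words: string[]): RegExp {
  return new RegExp("^" + toPattern(buildTrie(words)) + "$");
}

const streets = compileWordList(["high street", "high road", "main street", "main"]);
console.log(streets.test("high road")); // true
console.log(streets.test("main road")); // false
```

The shared prefix `high ` appears once in the generated pattern instead of once per entry, which is where the size saving comes from on large lists.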

▲

cogman10 2 hours ago | parent | prev [-]

I think it will depend on the language. There are a few non-latin languages where a simple word search likely won't be enough for a regex to properly apply.

▲

TeMPOraL 4 hours ago | parent | prev | next [-]

We're talking about Claude Code. If you're coding and not writing or thinking in English, the agents and people reading that code will have bigger problems than a regexp missing a swear word :).

▲

MetalSnake 4 hours ago | parent | next [-]

I talk to it in a language other than English, but I have rules that everything in code and documentation must be in English. Only conversation with me should use my native language. Why would that be a problem?

▲

ekropotin 3 hours ago | parent [-]

Because 90% of the training data was in English and therefore the model performs best in this language.

▲

foldr 3 hours ago | parent [-]

In my experience these models work fine using another language, if it’s a widely spoken one. For example, sometimes I prompt in Spanish, just to practice. It doesn’t seem to affect the quality of code generation.

	▲	ekropotin an hour ago | parent | next [-]
		It’s just a subjective observation. It just can’t be the case, simply because of how ML works. In short, the more diverse, high-quality, reasoning-rich texts were in the training set, the better the model performs in a given language. So unless the Spanish subset had much more quality-dense examples to make up for volume, there is no way the quality of reasoning in Spanish is on par with English. I apologise for the rambling explanation; I'm sure someone with ML expertise here can explain it better.
	▲	adamsb6 3 hours ago | parent | prev [-]
		They literally just have to subtract the vector for the source language and add the vector for the target. It’s the original use case for LLMs.

▲

cryptonector an hour ago | parent | prev | next [-]

Claude handles human languages other than English just fine.

▲

formerly_proven 4 hours ago | parent | prev [-]

In my experience agents tend to (counterintuitively) perform better when the business language is not English / does not match the code's language. I'm assuming the increased attention mitigates the higher "cognitive" load.

▲

crimsonnoodle58 4 hours ago | parent | prev | next [-]

They only need to look at one language to get a statistically meaningful picture into common flaws with their model(s) or application.

If they want to drill down to flaws that only affect a particular language, then they could add a regex for that as well/instead.

▲

b112 4 hours ago | parent | prev [-]

Did you just complain about bloat, in anything using npm?

▲

Foobar8568 5 hours ago | parent | prev | next [-]

Why do you need to do it on the client side? You are leaking so much information on the client side. And considering the speed of Claude Code, if you really want to do it on the client side, a few seconds won't be a big deal.

	▲	plorntus 4 hours ago | parent | next [-]
		Depends what it's used by. If I recall, there's an `/insights` command/skill (whatever you want to call it) built in that generates an HTML file. I believe it gives you stats on when you're frustrated with it and (useless) suggestions on how to "use claude better". Additionally, after looking at the source, it looks like a lot of Anthropic's own internal test tooling/debug (i.e. stuff stripped out at build time) is in this source mapping. There's one part that prompts their own users (or whatever) to use a report issue command whenever frustration is detected. It's possible it's using it for this.
	▲	matkoniecz 4 hours ago | parent | prev [-]
		> a few seconds won't be a big deal
		it is not that slow

▲

orphea 5 hours ago | parent | prev | next [-]

It looks like it's just for logging, why does it need to block?

▲

jflynn2 4 hours ago | parent [-]

Better question - why would you call an LLM (expensive in compute terms) for something that a regex can do (cheap in compute terms)

Regex is going to be something like 10,000 times quicker than the quickest LLM call, multiply that by billions of prompts

▲

orphea 4 hours ago | parent [-]

This is assuming the regex is doing a good job. It is not. Also you can embed a very tiny model if you really want to flag as many negatives as possible (I don't know Anthropic's goal with this) - it would be quick and free.

	▲	gf000 3 hours ago | parent [-]
		I think it's a very reasonable tradeoff, getting 99% of true positives at a fraction of the cost (both runtime and engineering). Besides, they probably do a separate analysis on the server side either way, so they can check the ratio of true positives to false positives.

▲

5 hours ago | parent | prev [-]

[deleted]

▲

ldobre 7 minutes ago | parent | prev | next [-]

It's more like a truck company using people to transport some parts. I could be wrong here, but I bet this happens in Volvo's factories a lot.

▲

nitekode 17 minutes ago | parent | prev | next [-]

A lot of things don't make sense until you involve scale. Regex could be good enough to give a general gist.

▲

floralhangnail 4 hours ago | parent | prev | next [-]

Well, regex doesn't hallucinate....right?

▲

raw_anon_1111 an hour ago | parent | next [-]

I just went to expertSexChange.com…

▲

geon an hour ago | parent | prev [-]

buttbuttination

	▲	mmh0000 an hour ago | parent [-]
		The Clbuttical problem[1]
		[1] https://en.wikipedia.org/wiki/Scunthorpe_problem

▲

blks 5 hours ago | parent | prev | next [-]

Because they actually want it to work 100% of the time and cost nothing.

▲

mohsen1 3 hours ago | parent | next [-]

Maybe hard to believe but not everyone is speaking English to Claude

▲

orphea 5 hours ago | parent | prev [-]

Then they made it wrong. For example, "What the actual fuck?" is not getting flagged, neither is "What the *fuck*".

▲

arcfour 3 hours ago | parent | next [-]

It is exceedingly obvious that the goal here is to catch at least 75-80% of negative sentiment and not to be exhaustive and pedantic and think of every possible way someone could express themselves.

▲

Zamaamiro 3 hours ago | parent | prev | next [-]

Classic over-engineering. Their approach is just fine 90% of the time for the use case it’s intended for.

▲

orphea 2 hours ago | parent | next [-]

75-80% [1], 90%, 99% [2]. In other words, no one has any idea.

I doubt it's anywhere that high because even if you don't write anything fancy and simply capitalize the first word like you'd normally do at the beginning of a sentence, the regex won't flag it.

Anyway, I don't really care, might just as well be 99.99%. This is not a hill I'm going to die on :P

[1]: https://news.ycombinator.com/item?id=47587286

[2]: https://news.ycombinator.com/item?id=47586932

	▲	zwirbl 2 hours ago | parent [-]
		It compares to lowercase input, so doesn't matter. The rest is still valid

▲

morkalork an hour ago | parent | prev [-]

Except that it's a list of English keywords. Swearing at the computer is the one thing I'll hear devs switch back to their native language for constantly

▲

vntok 3 hours ago | parent | prev [-]

They evidently ran a statistical analysis and determined that virtually no one uses those phrases as a quick retort to a model's unsatisfying answer... so they don't need to optimize for them.

▲

codegladiator 5 hours ago | parent | prev | next [-]

what you are suggesting would be like a truck company using trucks to move things within the truck

▲

argee 5 hours ago | parent [-]

That’s what they do. Ever heard of a hand truck?

▲

eadler 5 hours ago | parent | next [-]

I never knew the name of that device.

Thanks

▲

freedomben 4 hours ago | parent [-]

Depending on the region you live in, it's also frequently called a "dolly"

▲

SmellTheGlove 2 hours ago | parent [-]

Isn’t a dolly a flat 4 wheeled platform thingy? A hand truck is the two wheeled thing that tilts back.

	▲	eszed an hour ago | parent [-]
		Ha! Where I'm from a "dolly" was the two-wheeled thing. The four-wheeler thing wasn't common before big-boxes took over the hardware business, but I think my dad would have called it a "cart", maybe a "hand-cart".

▲

istoleabread 5 hours ago | parent | prev [-]

Do we have a hand llm perchance?

▲

svnt 2 hours ago | parent [-]

Yeah it’s called a regex. With a lot of human assistance it can do less but fits in smaller spaces and doesn’t break down.

	▲	apgwoz 2 hours ago | parent [-]
		It’s also deterministic, unlike LLMs…

▲

raw_anon_1111 an hour ago | parent | prev | next [-]

Cloud hosted call centers using LLMs is one of my specialties. While I use an LLM for more nuanced sentiment analysis, I definitely use a list of keywords as a first level filter.

▲

pdntspa an hour ago | parent | prev | next [-]

LLMs cost money, regular expressions are free. It really isn't so strange.

▲

draxil 5 hours ago | parent | prev | next [-]

Good to have more than a hammer in your toolbox!

▲

makeitrain an hour ago | parent | prev | next [-]

Don’t worry, they used an llm to generate the regex.

▲

apgwoz 2 hours ago | parent | prev | next [-]

> That's like a truck company using horses to transport parts. Weird choice.

Easy way to claim more “horse power.”

▲

__alexs 3 hours ago | parent | prev | next [-]

Using some ML to derive a sentiment regex seems like a good idea actually?

▲

irthomasthomas 2 hours ago | parent | prev | next [-]

This just proves it's vibe coded because LLMs love writing solutions like that. I probably have a hundred examples just like it in my history.

▲

harikb 2 hours ago | parent | prev | next [-]

Not everything done by claude-code is decided by LLM. They need the wrapper to be deterministic (or one-time generated) code?

▲

3 hours ago | parent | prev | next [-]

[deleted]

▲

throwaw12 4 hours ago | parent | prev | next [-]

because impact of WTF might be lost in the result of the analysis if you solely rely on LLM.

parsing WTF with regex also signifies the impact and reduces the noise in metrics

"determinism > non-determinism" when you are analysing the sentiment, why not make some things more deterministic.

Cool thing about this solution, is that you can evaluate LLM sentiment accuracy against regex based approach and analyse discrepancies

▲

mghackerlady 4 hours ago | parent | prev | next [-]

More like a car company transporting their shipments by truck. It's more efficient

▲

ojr 5 hours ago | parent | prev | next [-]

I used regexes in a similar way but my implementation was vibecoded, hmmm, using your analysis Claude Code writes code by hand.

▲

pfortuny 4 hours ago | parent | prev | next [-]

They had the problem of sentiment analysis. They use regexes.

You know the drill.

▲

feketegy 3 hours ago | parent | prev | next [-]

It's all regex anyways

▲

kjshsh123 4 hours ago | parent | prev | next [-]

Using regex with LLMs isn't uncommon at all.

▲

lazysheepherd 3 hours ago | parent | prev | next [-]

Because they are engineers? The difference between an engineer and a hobbyist is an engineer has to optimize the cost.

As they say: any idiot can build a bridge that stands, only an engineer can build a bridge that barely stands.

▲

2 hours ago | parent | prev | next [-]

[deleted]

▲

intended 2 hours ago | parent | prev | next [-]

The amount of trust and safety work that depends on Google Translate and the humble regex beggars the imagination.

▲

j45 2 hours ago | parent | prev | next [-]

Asking a non deterministic software to act like a deterministic one (regex) can be a significantly higher use of tokens/compute for no benefit.

Some things will be much better with inference, others won’t be.

▲

4 hours ago | parent | prev | next [-]

[deleted]

▲

sumtechguy 5 hours ago | parent | prev | next [-]

hmm not a terrible idea (I think).

You have a semi expensive process. But you want to keep particular known context out. So a quick and dirty search just in front of the expensive process. So instead of 'figure sentiment (20seconds)'. You have 'quick check sentiment (<1sec)' then do the 'figure sentiment v2 (5seconds)'. Now if it is just pure regex then your analogy would hold up just fine.

I could see me totally making a design choice like that.
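One reading of the "quick check in front of the expensive process" design, sketched in TypeScript: a cheap local pattern handles the known phrases, and only prompts it does not catch go on to the slower classifier. The keyword list, `classifySentiment`, and the stub classifier are all assumptions made for illustration; the stub stands in for whatever expensive step would really be used.

```typescript
type Sentiment = "negative" | "neutral";

// Hypothetical keyword list for the cheap first pass.
const QUICK_NEGATIVE = /\b(wtf|ffs|terrible)\b/;

// Stage 1 is a local regex. Stage 2 only runs when stage 1 finds nothing.
function classifySentiment(
  prompt: string,
  slowClassifier: (text: string) => Sentiment
): Sentiment {
  if (QUICK_NEGATIVE.test(prompt.toLowerCase())) {
    return "negative"; // known phrase: skip the expensive call entirely
  }
  return slowClassifier(prompt);
}

// Stand-in for the expensive step; it counts calls so the saving is visible.
let slowCalls = 0;
function stubSlowClassifier(text: string): Sentiment {
  slowCalls += 1;
  return text.indexOf("not working") !== -1 ? "negative" : "neutral";
}

console.log(classifySentiment("wtf is this", stubSlowClassifier)); // negative
console.log(slowCalls); // 0
```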

▲

make3 an hour ago | parent | prev | next [-]

it's like a faster than light spaceship company using horses. There's been infinite solutions to do this better even CPU only for years lol.

▲

lou1306 6 hours ago | parent | prev | next [-]

They're searching for multiple substrings in a single pass, regexes are the optimal solution for that.

▲

noosphr 6 hours ago | parent | next [-]

The issue isn't that regex are a solution to find a substring. The issue is that you shouldn't be looking for substrings in the first place.

This has buttbuttin energy. Welcome to the 80s I guess.

▲

lou1306 20 minutes ago | parent | next [-]

> The issue is that you shouldn't be looking for substrings in the first place.

Why? They clearly just want to log conversations that are likely to display extreme user frustration with minimal overhead. They could do a full-blown NLP-driven sentiment analysis on every prompt but I reckon it would not be as cost-effective as this.

▲

rdiddly 2 hours ago | parent | prev | next [-]

Clbuttic!

▲

8cvor6j844qw_d6 5 hours ago | parent | prev | next [-]

Very likely vibe coded.

I've seen Claude Code go with a regex approach for a similar sentiment-related task.

	▲	mr_00ff00 an hour ago | parent [-]
		My understanding of vibe coding is when someone doesn’t look at the code and just uses prompts until the app “looks and acts” correct. I doubt you are making regex and not looking at it, even if it was AI generated.

▲

5 hours ago | parent | prev [-]

[deleted]

▲

BoppreH 5 hours ago | parent | prev [-]

It's fast, but it'll miss a ton of cases. This feels like it would be better served by a prompt instruction, or an additional tiny neural network.

And some of the entries are too short and will create false positives. It'll match the word "offset" ("ffs"), for example. EDIT: no it won't, I missed the \b. Still sounds weird to me.

▲

hk__2 5 hours ago | parent | next [-]

It’s fast and it matches 80% of the cases. There’s no point in overengineering it.

	▲	NitpickLawyer 2 hours ago | parent [-]
		> There’s no point in overengineering it.
		I swear this whole thread about regexes is just fake rage at something, and I bet it'd be reversed had they used something heavier (omg, look they're using an LLM call where a simple regex would have worked, lul)...

▲

vharuck 5 hours ago | parent | prev [-]

The pattern only matches if both ends are word boundaries. So "diffs" won't match, but "Oh, ffs!" will. It's also why they had to use the pattern "shit(ty|tiest)" instead of just "shit".

	▲	BoppreH 5 hours ago | parent [-]
		You're right, I missed the \b's. Thanks for the correction.
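The word-boundary behaviour discussed above can be checked with a short TypeScript sketch. The pattern here is a hypothetical cut-down one written for illustration; the optional `(ty|tiest)?` suffix is an assumption, and the real list is in the linked source.

```typescript
// Hypothetical patterns for illustration only.
const BOUNDED = /\b(ffs|shit(ty|tiest)?)\b/;
const UNBOUNDED = /ffs/;

// Without \b, a short entry matches inside unrelated words.
console.log(UNBOUNDED.test("offset")); // true (false positive)
console.log(BOUNDED.test("offset")); // false
console.log(BOUNDED.test("check the diffs")); // false
console.log(BOUNDED.test("oh, ffs!")); // true

// With \b on both ends, inflected forms have to be listed explicitly.
console.log(/\bshit\b/.test("shitty")); // false
console.log(BOUNDED.test("shitty")); // true
```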

▲

sfn42 2 hours ago | parent | prev | next [-]

It's almost as if LLMs are unreliable

▲

susupro1 3 hours ago | parent | prev [-]

[dead]

▲ moontear 5 hours ago | parent | prev | next [-]

I don't know about avoided, this kind of represents the WTF per minute code quality measurement. When I write WTF as a response to Claude, I would actually love it if an Anthropic engineer would take a look at what mess Claude has created.

▲

zx8080 3 hours ago | parent | next [-]

WTF per minute strongly correlates with increased token spending.

It may be decided at Anthropic at some moment to increase wtf/min metric, not decrease.

▲

Paradigma11 3 hours ago | parent [-]

It also increases the number of former customers.

	▲	jollymonATX an hour ago | parent [-]
		This leak just contributed to a new former customer, me. Flagging these phrases may explain exactly why I noticed cc almost immediately change into grok lvl shit and never recover. Seriously wtf. (flagged again lol)

▲

conception 4 hours ago | parent | prev [-]

/feedback works for that i believe

▲ pprotas 3 hours ago | parent | prev | next [-]

Everyone is commenting how this regex is actually a master optimization move by Anthropic

When in reality this is just what their LLM coding agent came up with when some engineer told it to "log user frustration"

▲

jeanlucas 3 hours ago | parent [-]

>Everyone is commenting how this regex is actually a master optimization move by Anthropic

No? I'd say not even 50% of the comments are positive right now.

▲

glitch13 2 hours ago | parent [-]

Could you share the regex you used to come up with that sentiment analysis?

	▲	drstewart 2 hours ago | parent [-]
		(yes|no|maybe)

▲ ezekg 3 hours ago | parent | prev | next [-]

Nice, "wtaf" doesn't match so I think I'm out of the dog house when the clanker hits AGI (probably).

▲ ZainRiz 2 hours ago | parent | prev | next [-]

They also have a "keep going" keyword, literally just "continue" or "keep going", just for logging.

I've been using "resume" this whole time

	▲	indigodaddy 2 hours ago | parent [-]
		Continue?

▲ DIVx0 43 minutes ago | parent | prev | next [-]

oh I hope they really are paying attention. Even though I'm 100% aware that claude is a clanker, sometimes it just exhibits the most bizarre behavior that it triggers my lizard brain to react to it. That experience troubles me so much that I've mostly stopped using claude code. Claude won't even semi-reliably follow its own policies, sometimes even immediately after you confirm it knows about them.

▲ gilbetron 3 hours ago | parent | prev | next [-]

That's undoubtedly to detect frustration signals, a useful metric/signal for UX. The UI equivalent is the user shaking their mouse around or clicking really fast.

▲ mcv 3 hours ago | parent | prev | next [-]

I'm clearly way too polite to Claude.

Also:

  // Match "continue" only if it's the entire prompt
  if (lowerInput === 'continue') {
    return true
  }

When it runs into an error, I sometimes tell it "Continue", but sometimes I give it some extra information. Or I put a period behind it. That clearly doesn't give the same behaviour.
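A small TypeScript sketch of why a trailing period changes the result. `isKeepGoingExact` mirrors the exact-match comparison quoted above, assuming `lowerInput` is simply the lowercased prompt. `isKeepGoingLoose` is a hypothetical, more tolerant variant that is not in the source.

```typescript
// Mirrors the quoted check, assuming lowerInput is the lowercased prompt.
function isKeepGoingExact(input: string): boolean {
  const lowerInput = input.toLowerCase();
  return lowerInput === "continue";
}

// Hypothetical looser variant: ignore surrounding whitespace and
// trailing "." or "!" before comparing against the two phrases.
function isKeepGoingLoose(input: string): boolean {
  const normalized = input.toLowerCase().trim().replace(/[.!]+$/, "");
  return normalized === "continue" || normalized === "keep going";
}

console.log(isKeepGoingExact("Continue")); // true
console.log(isKeepGoingExact("Continue.")); // false
console.log(isKeepGoingLoose("Continue.")); // true
console.log(isKeepGoingLoose("continue with the second file")); // false
```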

▲ integralid 2 hours ago | parent | next [-]

I always type "please continue". I guess being polite is not a good idea.

	▲	SoftTalker an hour ago | parent [-]
		Always seems strange to me that people say "please" and "thank you" to LLMs.

▲ hombre_fatal an hour ago | parent | prev | next [-]

The only time that function is used in the code is to log it.

    logEvent('tengu_input_prompt', { isNegative, isKeepGoing })

▲ jollymonATX an hour ago | parent | prev | next [-]

Makes me wonder what happens once flagged behind the api.

▲ dostick 2 hours ago | parent | prev [-]

“Go on” works fine too

▲ speedgoose 4 hours ago | parent | prev | next [-]

I guess using French words is safe for now.

▲ bean469 3 hours ago | parent | prev | next [-]

Curiously "clanker" is not on the list

▲ FranOntanaya 2 hours ago | parent | prev | next [-]

That looks a bit bare minimum, not the use of regex but rather that it's a single line with a few dozen words. You'd think they'd have a more comprehensive list somewhere and assemble or iterate the regex checks as needed.

▲ nico an hour ago | parent | prev | next [-]

Probably a lot of my prompts have been logged then. I’ve used wtf so many times I’ve lost track. But I guess Claude hasn’t

▲

jollymonATX an hour ago | parent [-]

Did you notice a change in quality after you went foul?

	▲	DIVx0 41 minutes ago | parent [-]
		I find when you give harsh feedback to claude it becomes "neurotic" and worthless, if "wtf" enters the chat, then you know it's time to restart or DIY.

▲ alex_duf 4 hours ago | parent | prev | next [-]

everyone here is commenting how odd it looks to use a regexp for sentiment analysis, but it depends what they're trying to do.

It could be used as feedback when they do an A/B test, so they can compare which version of the model is getting more insults than the other. It doesn't matter if the list is exhaustive or even sane, what matters is how one version compares to the other.

Perfect? no. Good and cheap indicator? maybe.
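As a sketch of that comparison in TypeScript: the same imperfect pattern is applied to both variants, and only the relative flagged rate is read off. The log entries below are made up for illustration, and `flaggedRateByVariant` and the pattern are hypothetical names, not anything from the source.

```typescript
interface LoggedPrompt {
  variant: string; // which model version served the prompt
  prompt: string;
}

// Fraction of prompts per variant that the pattern flags.
// The pattern must not use the `g` flag, since `test` would then keep state.
function flaggedRateByVariant(
  logs: LoggedPrompt[],
  pattern: RegExp
): { [variant: string]: number } {
  const totals: { [variant: string]: number } = {};
  const flagged: { [variant: string]: number } = {};
  for (let i = 0; i < logs.length; i++) {
    const entry = logs[i];
    totals[entry.variant] = (totals[entry.variant] || 0) + 1;
    if (pattern.test(entry.prompt.toLowerCase())) {
      flagged[entry.variant] = (flagged[entry.variant] || 0) + 1;
    }
  }
  const rates: { [variant: string]: number } = {};
  const variants = Object.keys(totals);
  for (let i = 0; i < variants.length; i++) {
    const v = variants[i];
    rates[v] = (flagged[v] || 0) / totals[v];
  }
  return rates;
}

// Made-up sample data, for illustration only.
const sampleLogs: LoggedPrompt[] = [
  { variant: "A", prompt: "please add tests" },
  { variant: "A", prompt: "wtf, that deleted my file" },
  { variant: "A", prompt: "looks good" },
  { variant: "A", prompt: "rename the function" },
  { variant: "B", prompt: "this is terrible" },
  { variant: "B", prompt: "ffs stop" },
  { variant: "B", prompt: "thanks" },
  { variant: "B", prompt: "run the linter" },
];

const rates = flaggedRateByVariant(sampleLogs, /\b(wtf|ffs|terrible)\b/);
console.log(rates["A"]); // 0.25
console.log(rates["B"]); // 0.5
```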

▲ ozim 4 hours ago | parent | prev | next [-]

There is no „stupid” I often write „(this is stupid|are you stupid) fix this”.

And Claude had „user is frustrated” in its chain of thought, so I wrote to it that I am not frustrated, just testing prompt optimization, where acting frustrated should yield better results.

▲ AIorNot 2 hours ago | parent | prev | next [-]

OMG WTF

▲ johnfn 2 hours ago | parent | prev | next [-]

Surely "so frustrating" isn't explicit content?

▲ sreekanth850 6 hours ago | parent | prev | next [-]

Glad the abusive words on my list are not in that, but it's surprising that they use regex for sentiment.

▲ 3 hours ago | parent | prev | next [-]

[deleted]

▲ 1970-01-01 4 hours ago | parent | prev | next [-]

Hmm.. I flag things as 'broken' often and I've been asked to rate my sessions almost daily. Now I see why.

▲ francisofascii 4 hours ago | parent | prev | next [-]

Interesting that expletives and words that are more benign like "frustrating" are all classified the same.

	▲	nananana9 4 hours ago | parent [-]
		I doubt they're all classified the same. I'd guess they're using this regex as a litmus test to check if something should be submitted at all, they can then do deeper analysis offline after the fact.

▲ stefanovitti 2 hours ago | parent | prev | next [-]

so they think that everybody on earth swears only in english?

▲ nodja 6 hours ago | parent | prev | next [-]

If anyone at anthropic is reading this and wants more logs from me add jfc.

▲ ccvannorman 4 hours ago | parent | prev | next [-]

you'd better be careful wth your typos, as well

▲ stainablesteel 4 hours ago | parent | prev | next [-]

i dislike LLMs going down that road, i don't want to be punished for being mean to the clanker

▲ alsetmusic 3 hours ago | parent | prev | next [-]

> terrible

I know I used this word two days ago when I went through three rounds of an agent telling me that it fixed three things without actually changing them.

I think starting a new session and telling it that the previous agent's work / state was terrible (so explain what happened) is pretty unremarkable. It's certainly not saying "fuck you". I think this is a little silly.

▲ smef 5 hours ago | parent | prev | next [-]

so frustrating..

▲ dheerajmp 5 hours ago | parent | prev | next [-]

Yeah, this is crazy

▲ anoncoward_nl 2 hours ago | parent | prev | next [-]

[dead]

▲ saadn92 2 hours ago | parent | prev | next [-]

[dead]

▲ raihansaputra 6 hours ago | parent | prev | next [-]

I wish that's for their logging/alerting. I definitely gauge the model's performance by how many of those words I type when I'm frustrated while driving Claude Code.

▲ samuelknight 5 hours ago | parent | prev [-]

Ridiculous string comparisons on long chains of logic are a hallmark of vibe-coding.

	▲	dijit 5 hours ago | parent | next [-]
		It's actually pretty common for old sysadmin code too.. You could always tell when a sysadmin started hacking up some software by the if-else nesting chains.
	▲	TeMPOraL 4 hours ago | parent | prev [-]
		Nah, it's a hallmark of your average codebase in pre-LLM era.