That's what I found with some of these LLM models as well. For example I still like to test those models with algorithm problems, and sometimes when they can't actually solve the problem, they will start to hardcode the test cases into the algorithm itself.. Even DeepSeek was doing this at some point, and some of the most recent ones still do this.

▲

edoceo 4 hours ago | parent | next [-]

Sounds exactly what a junior-dev would do without proper guidance. Could better direction in the prompts help? I find I frequently have to tell it where to put what fixes. IME they make a lot of spaghetti (LLMs and juniors)

▲

throawayonthe an hour ago | parent | next [-]

wtf kinda juniors are you interacting with

	▲	edoceo 34 minutes ago \| parent [-]
		Lots of self-taught; looking for an entry level.

▲

heliumtera 4 hours ago | parent | prev [-]

Maybe the Juniors you have seen are actually retarded?

▲

qinsignificance 4 hours ago | parent | prev [-]

I have asked GLM4.7 in opencode to make an application to basically filter a couple of spatial datasets hosted at a url I provided it, and instead of trying to download read the dataset, it just read the url, assumed what the datasets were (and got it wrong) is and it's shape (and got it wrong) and the fields (and got it wrong) and just built an application based on vibes that was completely unfixable.

It wrote an extensive test suite on just fake data and then said the app is perfectly working as all tests passed.

This is a model that was supposed to match sonnet 4.5 in benchmarks. I don't think sonnet would be that dumb.

I use LLMs a lot to code, but these chinese models don't match anthropic and openai in being able to decide stuff for themselves. They work well if you give them explicit instructions that leaves little for it to mess up, but we are slowly approaching where OpenAI and anthropic models will make the right decisions on their own

	▲	esafak 3 hours ago \| parent [-]
		It really is infuriatingly dumb; like a junior who does not know English. Indeed, it often transitions into Chinese. Just now it added some stuff to a file starting at L30 and I said "that one line L30 will do remove the rest", it interpreted 'the rest' as the file, and not what it added.