canada_dry | 4 days ago
OpenAI's "PRO" subscription is really a waste of money IMHO for this and other reasons. Decided to give PRO a try when I kept getting terrible results from the $20 option. So far it's perhaps 20% improved in complex code generation. It still has the extremely annoying ~350 line limit in its output. It still IGNORES EXPLICIT CONTINUOUS INSTRUCTIONS eg: do not remove existing comments. The opaque overriding rules that - despite it begging forgiveness when it ignores instructions - are extremely frustrating!! | ||||||||
JoshuaDavid | 4 days ago | parent | next
One thing that has worked for me, when I have a long list of requirements / standards I want an LLM agent to stick to while executing a series of, say, 5 instructions, is to add extra steps at the end of the instructions, like "6. Check whether any of the code standards are not met - if so, fix them and return to step 5" / "7. Verify that no forbidden patterns from <list of things like no-op unit tests, N+1 query patterns, etc.> exist in the added code - if you find any, fix them and return to step 5", and so on (rough sketch below). Often the models are better at recognizing failures to stick to the rules and fixing them than they are at consistently following the rules in a single shot. This does mean that having an LLM agent do a thing often works but is slower than just doing it myself. Still, I can sometimes kick off a workflow before joining a meeting, so maybe the hours I've spent playing with these tools will eventually pay for themselves in improved future productivity.
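For what it's worth, here is a rough Python sketch of what I mean. The task steps, the standards list, and the fact that it just prints the assembled prompt are all made up for illustration; the only point is that the verification steps are appended to the same numbered instruction list and loop back to an earlier step, so the agent re-checks its own output instead of being trusted to comply in one pass.

    # Hypothetical sketch of the "append self-check steps" trick described above.
    # The steps and standards are invented; the assembled prompt would be sent to
    # whatever agent or chat interface you actually use.

    TASK_STEPS = [
        "1. Read the ticket and locate the affected module.",
        "2. Write the failing test first.",
        "3. Implement the fix.",
        "4. Update the docs.",
        "5. Run the full test suite.",
    ]

    CODE_STANDARDS = [
        "keep all existing comments",
        "no no-op unit tests",
        "no N+1 query patterns",
    ]

    # Verification steps that loop back to an earlier step on failure.
    CHECK_STEPS = [
        "6. Check whether any of these standards are violated: "
        + "; ".join(CODE_STANDARDS)
        + ". If any are, fix the code and return to step 5.",
        "7. Verify that no forbidden patterns were introduced. "
        "If you find any, fix them and return to step 5.",
    ]

    prompt = "\n".join(TASK_STEPS + CHECK_STEPS)
    print(prompt)  # paste/send this to the agent of your choice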
jmaker | 4 days ago | parent | prev
There are things it’s great at and things it deceives you with. In many cases where I needed it to check something I already knew was a problem, o3 kept insisting it was possible for reasons a, b, c, and thankfully gave me links. I knew it used to be a problem, so, surprised, I followed the links, only to read in black and white that it still wasn’t possible. So I explained to o3 that it was wrong. Two messages later we were back at square one. One week later it hadn’t updated its knowledge. Months later it’s still the same. But on topics I have no idea about, like medicine, it feels very convincing. Am I at risk? People don’t understand Dunning-Kruger. People are prone to biases and fallacies. Likely all LLMs are inept at objectivity. My instructions to LLMs always demand strictness, no false claims, and a Bayesian likelihood on every claim. Some models ignore those instructions outright, while others stick to them strictly. In the end it doesn’t matter, when they insist with 99% confidence on refuted fantasies.
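For concreteness, my standing instruction looks roughly like the sketch below. The exact wording, the [p=0.xx] tag format, and the little regex check are just an illustration of the idea (force the model to attach a likelihood to each claim, then mechanically flag untagged ones); I'm not claiming any particular model reliably honors it.

    import re

    # Illustrative only: a standing instruction that asks for a probability tag
    # on every claim, plus a trivial check that the tags are actually present.
    SYSTEM_PROMPT = (
        "Be strict. Make no claims you cannot support. "
        "Append a Bayesian likelihood in the form [p=0.xx] to every factual claim."
    )

    def untagged_claims(reply: str) -> list[str]:
        """Return sentences in the reply that carry no [p=...] tag."""
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", reply) if s.strip()]
        return [s for s in sentences if not re.search(r"\[p=0?\.\d+\]", s)]

    reply = "The API still lacks that endpoint [p=0.55]. It was fixed upstream."
    print(untagged_claims(reply))  # -> ['It was fixed upstream.']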