eurekin 2 days ago
My very brief interaction with GPT5 is that it's just weird. "Sure, I'll help you stop flirting with OOMs", "Thought for 27s Yep-..." (this comes up a lot), "If you still graze OOM at load", "how far you can push --max-model-len without more OOM drama" - all this in a prolonged discussion about CUDA and various LLM runners. I've added special user instructions to avoid flowery language, but they get ignored.

EDIT: it also dragged the conversation on for hours. I ended up going with the latest docs, and finally all the CUDA issues in a joint tabbyApi and exllamav2 project cleared up. It just couldn't find a solution and kept proposing whatever people wrote in similar issues. Its reasoning capabilities are, in my eyes, greatly exaggerated.
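
For anyone hitting the same wall: most of the "how far can the context go" question is just KV-cache arithmetic. A rough back-of-the-envelope sketch (the layer/head/VRAM numbers below are illustrative Llama-ish defaults, not read from my actual setup):

    def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                       bytes_per_elem=2, batch=1):
        """Bytes needed for the K and V caches at a given context length."""
        per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem  # K + V
        return per_token_per_layer * n_layers * seq_len * batch

    free_vram = 3 * 1024**3  # VRAM left after weights -- assumed, measure yours
    for ctx in (4096, 8192, 16384, 32768):
        need = kv_cache_bytes(ctx)
        print(f"{ctx:>6} tokens -> {need / 1024**3:.2f} GiB "
              f"({'fits' if need < free_vram else 'OOM risk'})")

If your runner quantizes the KV cache (exllamav2 can), bytes_per_elem shrinks accordingly, so treat the output as a rough upper bound rather than an exact number.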
mh- 2 days ago | parent
Turn off the setting that lets it reference chat history; it's under Personalization. Also take a peek at what's in Memories (which is separate from the above); consider cleaning it up or disabling it entirely.