sieve 11 hours ago
They are very good at some tasks and terrible at others. I use LLMs for language-related work (translations, grammatical explanations, etc.) and they are top-notch at that, as long as you do not ask for references to particular grammar rules. In that case they will invent non-existent references. They are also good as tutor personas: give me jj/git/emacs commands for this situation. But they are bad in other cases. I started scanning books recently and wanted to crop out the random stuff outside the orange sheet of paper the book was placed on before handing the images over to ScanTailor Advanced (STA can do this, but I wanted to keep the original images around instead of the low-quality STA version). I spent 3-5 hours with Gemini 2.5 Pro (AI Studio) trying to get it to give me a series of steps (and finally a shell script) to get this working. It could not do it. It mixed up GraphicsMagick and ImageMagick commands. It failed even with libvips. Finally I asked it for a simple shell script that takes four pixel distances, one to crop from each edge, as arguments. This one worked. I am very surprised that people are able to write code that requires actual reasoning ability using modern LLMs.
noosphr 11 hours ago
Just use Pillow and Python. It is the only way to do real image work these days, and as a bonus LLMs suck a lot less at giving you nearly useful Python code. The above is a bit of a lie, as OpenCV has more capabilities, but unless you are deep in the weeds of preparing images for neural networks, Pillow is plenty good enough.
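For what it's worth, the "four pixel distances from the four edges" crop described upthread is only a few lines with Pillow. This is a rough sketch, not a tested scanning pipeline: it assumes Pillow is installed, and `crop_box` and `crop_file` are names made up for illustration. It writes to a separate output path, so the original scan is left untouched.

```python
def crop_box(width, height, left, top, right, bottom):
    """Turn four edge distances (pixels) into a (left, upper, right, lower) box."""
    if min(left, top, right, bottom) < 0:
        raise ValueError("edge distances must be non-negative")
    box = (left, top, width - right, height - bottom)
    if box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError("edge distances leave nothing to keep")
    return box


def crop_file(src, dst, left, top, right, bottom):
    """Crop the image at src by the given edge distances and save it to dst."""
    # Pillow is imported lazily so crop_box can be used and tested without it.
    from PIL import Image

    with Image.open(src) as img:
        box = crop_box(img.width, img.height, left, top, right, bottom)
        # Image.crop takes a (left, upper, right, lower) tuple in pixels.
        img.crop(box).save(dst)


# Hypothetical usage: trim 120 px left, 80 px top, 150 px right, 60 px bottom.
# crop_file("scan_001.png", "cropped_001.png", 120, 80, 150, 60)
```

Note that saving to a lossy format such as JPEG re-encodes the image, so a lossless output format like PNG or TIFF is probably the safer choice if quality matters.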
BOOSTERHIDROGEN 11 hours ago
Would you share your system prompt for that grammatical checker?
poszlem 11 hours ago
I think Gemini is one of the best examples of an LLM that is in some cases the best and in some cases truly the worst. I once asked it to read a postcard written by my late grandfather in Polish, as I was struggling to decipher it. It incorrectly identified the text as Romanian and kept insisting on that, even after I corrected it: "I understand you are insistent that the language is Polish. However, I have carefully analyzed the text again, and the linguistic evidence confirms it is Romanian. Because the vocabulary and alphabet are not Polish, I cannot read it as such." Eventually, after I continued to insist that it was indeed Polish, it got offended and told me it would not try again, accusing me of attempting to mislead it.