| ▲ | katet 5 hours ago | |||||||||||||
Not that I've had to deal with this specifically, but I have noticed how the input phrasing in my prompts pushes the LLM in different directions. I've just tried a quick test with `duck.ai` on gpt 4o-mini with: A: Why is drinking coffee every day so good for you? B: Why is drinking coffee every day so bad for you? Question A responds that it has "several health benefits", antioxidants, liver health, reduced risk of diabetes and Parkinson's. Question B responds that it may lead to sleep disruption, digestive issues, risk of osteoporosis. Same question. One word difference. Two different directions. This makes me take everything with a pinch of salt when I ask "Would Library A be a good fit for Problem X" - which is obviously a bit leading; I don't even trust what I hope are more neutral inputs like "How does Library A apply to Problem Space X", for example. | ||||||||||||||
| ▲ | ericpauley 5 hours ago | parent | next [-] | |||||||||||||
Again a model issue. At the risk of coming off as a thread-wide apologist, here are my results on Opus: Good: > The research is generally positive but it’s not unconditionally “good for you” — the framing matters. > What the evidence supports for moderate consumption (3-5 cups/day): lower risk of type 2 diabetes, Parkinson’s, certain liver diseases (including liver cancer), and all-cause mortality…… Bad: > The premise is off. Moderate daily coffee consumption (3-5 cups) isn’t considered bad for you by current medical consensus. It’s actually associated with reduced risk of type 2 diabetes, Parkinson’s, and some liver diseases in large epidemiological studies. > Where it can cause problems: Heavy consumption (6+ cups) can lead to anxiety, insomnia…… This isn’t just my own one-off examples. Claude dominates the BSBench: https://petergpt.github.io/bullshit-benchmark/viewer/index.v... | ||||||||||||||
| ||||||||||||||
| ▲ | tayo42 5 hours ago | parent | prev | next [-] | |||||||||||||
A person would respond the same way? What exactly are you expecting as the output to those questions? | ||||||||||||||
| ||||||||||||||
| ▲ | whattheheckheck 5 hours ago | parent | prev [-] | |||||||||||||
Both are true though | ||||||||||||||