captainbland 4 hours ago

I feel like this is a feature which improves the perceived confidence of the LLM but doesn't do much for correctness of other outputs, i.e. an exacerbation of the "confidently incorrect" criticism.

kemayo 2 hours ago | parent | next [-]

It's a mismatch with our intuition about how much effort things take.

If there are humans involved, "I took this data and made a really fancy interactive chart" means that you put a lot more work into it, and you can probably assume that some more effort was also put into the accuracy of the data.

But with the LLM it's not really very much more work to get the fancy chart. So the thing that was a signifier of effort is now misleading us into trusting data that got no extra effort.

(Humans have been exploiting this tendency to trust fancy graphics forever, of course.)

manquer 26 minutes ago | parent [-]

It is not limited to graphics: it applies to better-packaged products, better-dressed, good-looking, well-spoken people, and so on. Celebrity endorsements depend on this thesis.

There has always been a bias towards form over function.

programmertote 3 hours ago | parent | prev | next [-]

A recent LinkedIn post that I came across as an example of people trusting (or learning to trust) AI too much while not realizing that it can make up numbers too: https://www.linkedin.com/posts/mariamartin1728_claude-wrote-...

P.S. Credit to the poster, she posted a correction note when someone caught the issue: https://www.linkedin.com/posts/mariamartin1728_correction-on...

elliotbnvl 3 hours ago | parent | prev | next [-]

It's a usability / quality of life feature to me. Nothing to do with increasing perceived confidence. I guess it depends on how much you already (dis)trust LLMs.

I'm finding more and more often the limiting factor isn't the LLM, it's my intuition. This goes a way towards helping with that.

vunderba 3 hours ago | parent | prev | next [-]

A similar thing happened when Google started really pushing generating flowcharts as a use-case with Nano Banana. A slick presentation can distract people from the only thing that really matters - the accuracy of the underlying data.

Angostura 2 hours ago | parent [-]

As a slightly different tack, I’ve been using Copilot to generate flowcharts from some of the fiendishly complex (and badly written) standard operating procedures we have at work.

People find them quite easy to check - easier than the raw document. My angle with teams is: use these to check your processes. If the flow is wrong, it’s either because the LLM has screwed up, or because the policy is wrong/badly written. It’s usually the latter. It’s a good way to fix SOPs.

vunderba 25 minutes ago | parent [-]

It’s interesting you mentioned that. One of the things I’ve started doing recently is throwing a large LLM such as codex-5.3 (highest level of reasoning) at some of the more complex systems we have to produce nicely formatted ASCII diagrams.

I still review each diagram afterward, but the great thing is that, unlike image-based diagrams, they remain fully text-readable and searchable. And you can even expose them as part of the knowledge base for the LLM to reference when needed going forward.
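A minimal sketch of what "text-readable and searchable" buys you, assuming the diagrams are kept as plain strings. The diagram names, their contents, and the `find_diagrams` helper are all made up for illustration, not taken from any real system:

```python
# Hypothetical example: diagram names and contents are invented for illustration.
DIAGRAMS = {
    "billing": """
+---------+     +---------+
| Invoice | --> | Payment |
+---------+     +---------+
""",
    "auth": """
+-------+     +---------+
| Login | --> | Session |
+-------+     +---------+
""",
}


def find_diagrams(diagrams, keyword):
    """Return the names of diagrams whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(name for name, text in diagrams.items() if needle in text.lower())


print(find_diagrams(DIAGRAMS, "session"))
```

The same plain-text lookup would not work on an image-based diagram without OCR, which is the point being made above.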

ipython an hour ago | parent | prev | next [-]

Already happened. :)

https://www.reddit.com/r/dataisugly/comments/1mk5wdb/this_ch...

outlore an hour ago | parent | prev | next [-]

I suspect chain of thought while building the chart will improve the overall correctness of the answer

nerdjon 3 hours ago | parent | prev | next [-]

This was my first thought as well. All this does is further remove the user from seeing the chat output and instead make it appear as if the information is concretely reliable.

I mean, is it really that shocking that you can have an LLM generate structured data and shove that into a visualizer? The concern is whether it is reliable, which we know it isn't.

ericmcer 3 hours ago | parent | next [-]

The further they can get people from the reality of `This just spits out whatever it thinks the next token will be` the more they can push the agenda.

j45 3 hours ago | parent | prev [-]

It's a reasonable concern. Often it can be mitigated by prompting in a manner that invokes research and verification instead of defaulting to a corpus.

Passive questions generate passive responses.

mikkupikku 3 hours ago | parent | prev [-]

I agree. Maybe next they'll add emotionally evocative music, with swelling orchestral bits when you reach the exciting climax of the slop.