| ▲ | Jacques2Marais 2 hours ago |
| You would be surprised, however, at how much detail humans also need to understand each other. We often want AI to just "understand" us in ways that many people wouldn't have understood us either without extra communication. |
|
| ▲ | jstummbillig 2 hours ago | parent | next [-] |
| People poorly specifying problems and having bad models of what the other party can know (and then being surprised by the outcome) is certainly a more general, albeit mostly separate, issue. |
| |
▲ | ahofmann 2 hours ago | parent [-] | | This issue is the main reason why a big percentage of jobs in the world exist. I don't have hard numbers, but my intuition is that about 30% of all jobs are mainly "understand what side a wants and communicate this to side b, so that they understand". Or another perspective: almost all jobs that are called "knowledge work" are like this. Software development is mainly this. Side a are humans, side b is the computer. The main goal of AI seems to be to get into this space and make a lot of people superfluous, and this also (partly) explains why everyone is pouring this much money into AI. | | |
▲ | PaulRobinson an hour ago | parent [-] | | Developers are - on average - terrible at this. If they weren't, TPMs, Product Managers, CTOs, none of them would need to exist. It's not specific to software; it's the entire world of business. Most knowledge work is translation from one domain/perspective to another. Not even knowledge work, actually. I've been reading some works by Adler[0] recently, and he makes a strong case for "meaning" only making sense to humans, and for each human having a completely different and isolated "meaning" for even the simplest of things, like a piece of stone. If there is difference and nuance to be found when it comes to a rock, what hope have we got when it comes to deep philosophy or the design of complex machines and software? LLMs are not very good at this right now, but if they became a lot better at it, they would a) become more useful and b) the work done to get them there would tell us a lot about human communication. [0] https://en.wikipedia.org/wiki/Alfred_Adler |
|
|
|
| ▲ | londons_explore 2 hours ago | parent | prev | next [-] |
| This is why we fed it the whole internet and every library as training data... By now it should know this stuff. |
| |
▲ | jasongi 13 minutes ago | parent [-] | | Future models know it now, assuming they suck in Mastodon and/or Hacker News. Although I don't think they actually "know" it. This particular trick question will be in the bank, just like the seahorse emoji or how many Rs in strawberry. Did they start reasoning and generalising better, or did the publishing of the "trick" and the discourse around it paper over the gap? I wonder if in the future we will trade these AI tells like 0days, keeping them secret so they don't get patched out at the next model update. |
|
|
| ▲ | scott_w an hour ago | parent | prev | next [-] |
| > You would be surprised, however, at how much detail humans also need to understand each other. But in this particular case, the context can be inferred. Why would I ask whether I should walk or drive to the car wash if my car is already at the car wash? |
| |
▲ | pickleRick243 40 minutes ago | parent [-] | | But also, why would you ask whether you should walk or drive if the car is at home? Either way the answer is obvious, and there is no way to interpret it except as a trick question. Of course, the parsimonious assumption is that the car is at home, so assuming that the car is at the car wash is a questionable choice, to say the least (otherwise there would be two cars in the situation, which the question doesn't mention). | | |
| ▲ | DharmaPolice 13 minutes ago | parent | next [-] | | I think a good rule of thumb is to default to assuming a question is asked in good faith (i.e. it's not a trick question). That goes for human beings and chat/AI models. In fact, it's particularly true for AI models because the question could have been generated by some kind of automated process. e.g. I write my schedule out and then ask the model to plan my day. The "go 50 metres to car wash" bit might just be a step in my day. | |
▲ | scott_w 17 minutes ago | parent | prev [-] | | But you're ascribing understanding to the LLM, which is not what it's doing. If the LLM understood you, it would realise it's a trick question and, assuming it was British, reply with "You'd drive it because how else would you get it to the car wash, you absolute tit." Even the higher-level reasoning models, while answering the question correctly, don't grasp the wider context that the question is obviously a trick. They still answer earnestly. Granted, it is a tool that is doing what you want (answering a question), but let's not ascribe higher understanding than what is clearly observed - and also based on what we know about how LLMs work. |
|
|
|
| ▲ | kitd 14 minutes ago | parent | prev | next [-] |
| Given that an estimated 70% of human communication is non-verbal, it's not so surprising, though. |
|
| ▲ | j_maffe 2 hours ago | parent | prev | next [-] |
| Right. But, unlike AI, we are usually aware when we're lacking context and inquire before giving an answer. |
| |
| ▲ | dxdm 2 hours ago | parent [-] | | Wouldn't that be nice. I've been party and witness to enough misunderstandings to know that this is far from universally true, even for people like me who are more primed than average to spot missing context. |
|
|
| ▲ | jiggawatts an hour ago | parent | prev [-] |
| I regularly tell new people at work to be extremely careful when making requests through the service desk — manned entirely by humans — because the experience is akin to making a wish from an evil genie. You will get exactly what you asked for, not what you wanted… probably. (Random occurrences are always a possibility.) E.g.: I may ask someone to submit a ticket to “extend my account expiry”. They’ll submit: “Unlock Jiggawatts’ account.” The service desk will reset my password (and neglect to tell me), leaving my expired account locked out in multiple orthogonal ways. That’s on a good day. Last week they created Jiggawatts2. The AIs have got to be better than this, surely! I suspect they already are. People are testing them with trick questions while the human examiner is on edge, aware of and looking for the twist. Meanwhile ordinary people struggle with concepts like “forward my email verbatim instead of creatively rephrasing it to what you incorrectly thought it must have really meant.” |
| |
| ▲ | scott_w an hour ago | parent [-] | | There's a lot of overlap between the smartest bears and the dumbest humans. However, we would want our tools to be more useful than the dumbest humans... |
|