| ▲ | gpm 6 hours ago | |
> Just yesterday I asked for a comparison of three technical books on a similar topic, and it wrongly guessed the third one rather than follow the three links.

I would consider this a failure in their tool use capabilities, not their reading ones. To use them to read things (without relying on their much less reliable tool use), take the thing and put it in the context window yourself. They still aren't perfect, of course, but they are reasonably good. Three whole books likely exceeds their context window size of course; I'd take this as a sign that they aren't up to a task of that magnitude yet. | ||
| ▲ | kace91 an hour ago | parent [-] | |
> Three whole books likely exceeds their context window size of course

This was not “read all three books”; this was “check these three links with the (known) book synopsis/reviews there”, and it made up the third one.

> I would consider this a failure in their tool use capabilities, not their reading ones.

I'd give it to you if I got an error message, but the text being enhanced with wrong-but-plausible data is clearly a failure of reliability. | ||