Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents (arxiv.org)
14 points by distalx 8 months ago | 3 comments