mejutoco 2 hours ago

> Even the most basic questions such as put a ball in a cup and place it on a table upside down then pick up the cup and put it in a box.

I do not think this is a great example. First, it is not a question. Second, it seems closely tied to robotics. A model by itself cannot put a ball anywhere; it can only call tools and answer in text, images, etc.

An LLM seeing "put an x in a y and place it on a z upside down, then pick up the y and put it in a z2", followed by a question about what happens, could query a RAG store for the properties of x, y, z and z2 and still answer. Alternatively, this could be useful for coding, for example. And that is a very extreme example. Some basic language ability plus tool use could go quite far. I think it is a much more interesting direction than "here is a GPU that costs as much as a car".
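
A rough sketch of what I mean, not a real system: the `PROPERTIES` table is a hypothetical stand-in for whatever the RAG lookup would return, and `simulate` is a made-up helper that a model could call as a tool. The action names are my own invention too.

```python
# Hypothetical stand-in for retrieved object properties (what a RAG lookup might return).
PROPERTIES = {
    "ball": {"open_top": False},
    "cup": {"open_top": True},
    "table": {"open_top": False},
    "box": {"open_top": True},
}


def simulate(steps):
    """Track where each object ends up after a list of (action, obj, target) steps."""
    location = {}  # object -> the thing it currently rests in or on
    for action, obj, target in steps:
        if action == "put_in":
            location[obj] = target
        elif action == "place_upside_down":
            location[obj] = target
            # An open-topped container turned upside down no longer holds its
            # contents: they end up resting on the target surface instead.
            if PROPERTIES[obj]["open_top"]:
                for item, where in list(location.items()):
                    if where == obj:
                        location[item] = target
        elif action == "pick_up_and_put_in":
            # Only the object itself moves; anything it no longer holds stays behind.
            location[obj] = target
    return location


steps = [
    ("put_in", "ball", "cup"),
    ("place_upside_down", "cup", "table"),
    ("pick_up_and_put_in", "cup", "box"),
]
print(simulate(steps))
```

The point is only that the model has to turn the sentence into steps and look up a couple of properties; the bookkeeping itself is trivial and does not need to live in the weights.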