glitchc · 15 hours ago
This is still a "hard" problem from a scientific perspective. LLMs haven't taken us any closer to solving the perception-actuation-learning loop. Solving it will require multiple new developments in materials science and a new ML paradigm.
jevndev · 15 hours ago
This is true of LLMs themselves, but the developments behind them have been a boon for robotics. I'm mostly familiar with computer vision, so I can't speak to everything, but vision transformers (ViTs is the term to search for) have helped a ton with persistence in object detection and tracking. And depth-estimation techniques for monocular cameras have accelerated well past the raw CNN-based models that were top of the line just a few years ago, largely by adding attention layers to the model. I agree that they're not there yet, but I don't want to discredit the benefits of these recent advancements.
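The core mechanism ViTs brought to vision is self-attention over image patches: every patch can weigh evidence from every other patch, which is what helps with global cues like depth and object persistence. Here's a minimal NumPy sketch of that idea (the image size, patch size, and random weights are illustrative, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(img, p):
    """Split an HxW image into flattened p x p patches (the ViT input tokens)."""
    h, w = img.shape
    patches = img.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3)
    return patches.reshape(-1, p * p)

def self_attention(x, wq, wk, wv):
    """Single-head self-attention: every patch attends to every other patch."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # each row sums to 1
    return attn @ v                               # one feature vector per patch

img = rng.standard_normal((32, 32))               # toy "grayscale frame"
tokens = patchify(img, p=8)                       # 16 patches of 64 pixels each
d = tokens.shape[1]
wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = self_attention(tokens, wq, wk, wv)
print(out.shape)                                  # (16, 64)
```

A real ViT stacks many such layers with learned weights and positional embeddings; depth models like DPT bolt a dense prediction head onto this kind of backbone, which is roughly the "adding attention layers" change described above.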
pixl97 · 14 hours ago
While you're correct that we still need a lot more, the advances of the past 5 years represent more than I've seen in most of my life. Just look at the speed at which we can now teach a humanoid robot a task: we can send out a mo-cap human, collect some data, run a few hundred trillion simulations in a few hours, and publish a kernel that can do that task relatively well. LLMs are what give us any perception at all: they feed vision into scene comprehension, and then the robot-control side computes a plan to achieve a goal. It's not very fast, and fine motor control has a long way to go, but it is possible.