Robotics

Part of the reason the current batch of humanoids works so well is that "Behavior Cloning" has made learned control, which stumped robotics for years, a more solvable problem than it was under RL. That's why end-to-end policies that run directly from images are exploding.

Why it's exciting: 
- Classical robotics: traditionally relied on explicit programming. Engineers specify the controls, planning, and models, define state-action mappings, tune params, and build for edge cases. It does not generalize well.

- Behavior cloning: requires little explicit programming. You just need to collect data of an expert performing a task, then train a model to copy it, so the problem is supervised by nature. Doing a new task then mostly comes down to collecting more data. No need to constantly build and maintain handling for unknown edge cases.
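
To make the "collect expert data, then train a model to copy it" loop concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `expert_policy` is a hypothetical proportional controller on a one-dimensional state (not images), and the "model" is a linear policy fit by closed-form least squares rather than a neural network. It only shows that behavior cloning reduces to ordinary supervised regression on (state, action) pairs.

```python
import random


def expert_policy(state):
    """Hypothetical expert: a proportional controller that pushes the state toward 0."""
    return -0.8 * state


def collect_demonstrations(n, seed=0):
    """Record (state, action) pairs by querying the expert on random states."""
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        state = rng.uniform(-1.0, 1.0)
        demos.append((state, expert_policy(state)))
    return demos


def fit_linear_policy(demos):
    """Supervised step: least-squares fit of action = w * state + b."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


# Behavior cloning: gather demonstrations, then regress actions on states.
demos = collect_demonstrations(200)
w, b = fit_linear_policy(demos)


def cloned_policy(state):
    """The learned policy imitates the expert without any hand-written control law."""
    return w * state + b
```

In this toy setting, taking on a "new task" would mean swapping in a different expert and re-running the same two steps; nothing in `fit_linear_policy` encodes the task itself.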

https://www.imgeorgiev.com/2025-01-31-why-bc-not-rl/


Does it now look like such an approach can work in arbitrary environments?

Behaviour cloning seems to greatly speed up time-to-market in controlled environments. Elsewhere, I wonder what the guarantees are, notably whether there is a verifiable “nothing bad can happen” kind of property (in quotes because that shorthand is too strong).