Excerpt from the IEEE Spectrum article:
“Learning to Fly by Crashing,” a paper from CMU roboticists Dhiraj Gandhi, Lerrel Pinto, and Abhinav Gupta, has such a nice abstract that I’ll just let them explain what this research is all about:
[T]he gap between simulation and real world remains large especially for perception problems. The reason most research avoids using large-scale real data is the fear of crashes! In this paper, we propose to bite the bullet and collect a dataset of crashes itself! We build a drone whose sole purpose is to crash into objects [. . .] We use all this negative flying data in conjunction with positive data sampled from the same trajectories to learn a simple yet powerful policy for UAV navigation.
Cool, let’s get crashing!
One way to think of flying (or driving or walking or any other form of motion) is that success is simply a continual failure to crash. From this perspective, the most effective way of learning how to fly is by getting a lot of experience crashing so that you know exactly what to avoid, and once you can reliably avoid crashing, you by definition know how to fly. Simple, right? We tend not to learn this way, however, because crashing has consequences that are usually quite bad for both robots and people.
The CMU roboticists wanted to see if there are any benefits to using the crash approach instead of the not-crash approach, so they sucked it up and let an AR Drone 2.0 loose in 20 different indoor environments, racking up 11,500 collisions over the course of 40 hours of flying time. As the researchers point out, “since the hulls of the drone are cheap and easy to replace, the cost of catastrophic failure is negligible.” Each collision is random, with the drone starting at a random location in the space and then flying slowly forward until it runs into something. After it does, it goes back to its starting point and chooses a new direction. Assuming it survives, of course.
During this process, the drone’s forward-facing camera is recording images at 30 Hz. Once a collision happens, the images from the trajectory are split into two parts: the part where the drone was doing fine, and the part just before it crashes. These two sets of images are fed into a deep convolutional neural network (with ImageNet-pretrained weights as initialization for the network), which uses them to learn, essentially, whether a given camera image means that going straight is a good idea or not. After 11,500 collisions, the resulting algorithm is able to fly the drone autonomously, even in narrow, cluttered environments, around moving obstacles, and in the midst of featureless white walls and even glass doors. The algorithm that controls the drone is simple: It splits the image from the AR Drone’s forward camera into a left image and a right image, and if one of those two images looks less collision-y than going straight, the drone turns in that direction. Otherwise, it continues moving forward.
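For readers who think in code, here is a rough Python sketch of the two ideas above: labeling a crash trajectory and the left/right/straight steering rule. It is an illustration under stated assumptions, not the researchers' implementation. The function names, the number of frames counted as "just before the crash" (n_negative), the tie-break toward the left, and the toy scoring function that stands in for the trained network are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # assumption: a grayscale frame as rows of pixel values


def label_trajectory(frames: Sequence, n_negative: int) -> List[Tuple[object, int]]:
    """Split one crash trajectory into positive and negative examples.

    The last n_negative frames before the crash get label 1 (collision ahead);
    everything earlier gets label 0 (flying fine). The article does not say how
    many frames count as "just before", so n_negative is an assumed parameter.
    """
    cut = max(len(frames) - n_negative, 0)
    return [(f, 0) for f in frames[:cut]] + [(f, 1) for f in frames[cut:]]


def split_left_right(image: Image) -> Tuple[Image, Image]:
    """Cut a frame down the middle into a left half and a right half."""
    mid = len(image[0]) // 2
    return [row[:mid] for row in image], [row[mid:] for row in image]


def choose_action(image: Image, collision_prob: Callable[[Image], float]) -> str:
    """Turn toward a half that looks safer than going straight, else go forward.

    collision_prob stands in for the trained network: it maps an image to an
    estimated probability that flying into that view ends in a crash.
    """
    left, right = split_left_right(image)
    p_straight = collision_prob(image)
    p_left = collision_prob(left)
    p_right = collision_prob(right)
    if min(p_left, p_right) < p_straight:
        # Assumed tie-break: prefer left when both halves score the same.
        return "left" if p_left <= p_right else "right"
    return "forward"


def toy_collision_prob(image: Image) -> float:
    """Hypothetical stand-in for the network: mean pixel value (1.0 = obstacle)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


if __name__ == "__main__":
    obstacle_on_right = [[0.0, 0.0, 1.0, 1.0],
                         [0.0, 0.0, 1.0, 1.0]]
    print(choose_action(obstacle_on_right, toy_collision_prob))
    print(label_trajectory(["f0", "f1", "f2", "f3", "f4"], n_negative=2))
```

In a real system, collision_prob would be the fine-tuned convolutional network evaluated on each crop; the mean-pixel scorer here exists only so the sketch runs on its own.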