Previous work has dealt with a robot placing single objects on a flat surface, says Ashutosh Saxena, assistant professor of computer science at Cornell University. “Our major contribution is that we are now looking at a group of objects, and this is the first work that places objects in non-trivial places.” New algorithms allow the robot to consider the nature of an object in deciding what to do with it. “It learns not to put a shoe in the refrigerator,” explains graduate student Yun Jiang. And while a shoe can be placed stably on any flat surface, it should go on the floor, not on a table.

The researchers tested placing dishes, books, clothing, and toys on tables and in bookshelves, dish racks, refrigerators, and closets. The robot was up to 98 percent successful in identifying and placing objects it had seen before. It could also place objects it had never seen before, but success rates fell to an average of 80 percent. Ambiguously shaped objects, such as clothing and shoes, were misidentified most often.

The algorithms (the underlying methods a computer is programmed to follow) are described in the May online edition of the International Journal of Robotics Research. Some aspects of the work were presented recently at the International Conference on Robotics and Automation.

The robot begins by surveying the room with a Microsoft Kinect 3-D camera, originally made for video gaming but now widely used by robotics researchers. Many images are stitched together to create an overall view of the room, which the robot’s computer divides into blocks based on discontinuities of color and shape.

The robot has been shown several examples of each kind of object and learns what characteristics they have in common. For each block it computes the probability of a match with each object in its database and chooses the most likely match.

For each object, the robot then examines the target area to decide on an appropriate and stable placement. Again it divides a 3-D image of the target space into small chunks and computes a series of features for each chunk, taking into account the shape of the object it is placing. The researchers train the robot for this task by feeding it graphic simulations in which placement sites are labeled as good or bad, and it builds a model of what good placement sites have in common. It chooses the chunk of space that fits that model most closely.

Finally, the robot creates a graphic simulation of how to move the object to its final location and carries out those movements. These are practical applications of computer graphics far removed from gaming and animating movie monsters, Saxena notes.

A robot with a success rate below 100 percent would still break an occasional dish. Performance could be improved, the researchers say, with cameras that provide higher-resolution images, and by preprogramming the robot with 3-D models of the objects it will handle rather than leaving it to build its own model from what it sees. The robot sees only part of a real object, Saxena explains, so a bowl could look the same as a globe. Tactile feedback from the robot’s hand would also help it know when an object is in a stable position and can be released.

In the future, Saxena says, he would like to add further “context,” so the robot can respond to more subtle features of objects. For example, a computer mouse can be placed anywhere on a table, but ideally it should go beside the keyboard.

This work was supported by a Microsoft Faculty Fellowship.
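
The segmentation step can be illustrated with a toy example. The sketch below splits a single row of pixels into blocks wherever the color changes sharply; the real system works on full 3-D Kinect scans over both color and shape, and the threshold and data here are invented purely for illustration.

```python
# Toy version of "divide the scene into blocks at discontinuities":
# scan one row of RGB pixels and start a new block at every large color jump.
# The 1-D setting and the threshold value are illustrative assumptions.
import numpy as np

def segment_row(colors: np.ndarray, threshold: float = 30.0) -> list[slice]:
    """Split a row of RGB pixels into blocks at large color jumps."""
    diffs = np.linalg.norm(np.diff(colors.astype(float), axis=0), axis=1)
    cuts = [0] + [i + 1 for i in np.nonzero(diffs > threshold)[0]] + [len(colors)]
    return [slice(a, b) for a, b in zip(cuts, cuts[1:])]

row = np.array([[200, 200, 200]] * 5 + [[30, 60, 90]] * 4)  # wall, then shelf
print(segment_row(row))  # -> two blocks: pixels 0-4 and pixels 5-8
```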
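The recognition step, computing a match probability for every object in the database and keeping the best, can likewise be sketched in a few lines. The feature names, object classes, and nearest-prototype scoring below are assumptions made for illustration, not the authors' actual models.

```python
# Minimal sketch of per-block recognition: reduce each segmented block to a
# feature vector and match it against a small database of known objects.
# Features ([height, flatness, elongation]) and classes are hypothetical.
import numpy as np

OBJECT_DATABASE = {
    "plate": np.array([0.02, 0.95, 0.10]),
    "book":  np.array([0.03, 0.90, 0.60]),
    "shoe":  np.array([0.10, 0.30, 0.70]),
}

def classify_block(block_features: np.ndarray) -> tuple[str, float]:
    """Return the most likely object label for one segmented block.

    Turns distances to class prototypes into a softmax over classes,
    mirroring the "probability of a match with each object" step.
    """
    labels = list(OBJECT_DATABASE)
    dists = np.array([np.linalg.norm(block_features - OBJECT_DATABASE[l])
                      for l in labels])
    probs = np.exp(-dists) / np.exp(-dists).sum()  # softmax over -distance
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])

label, confidence = classify_block(np.array([0.09, 0.35, 0.65]))
print(label, round(confidence, 2))  # -> "shoe" with its match probability
```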
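The placement step follows the same pattern: chunk the target region, compute features of each chunk, and score the chunks against a model learned from examples labeled good or bad. The features and weights below are invented stand-ins for what such a trained model might learn.

```python
# Minimal sketch of placement-site selection. The geometric features and the
# linear weights are illustrative assumptions; the real system learns from
# labeled graphic simulations and uses richer, object-dependent features.
import numpy as np

def chunk_features(points: np.ndarray) -> np.ndarray:
    """Summarize one chunk of the 3-D target scan: mean height,
    surface flatness (inverse height variance), and point density."""
    heights = points[:, 2]
    flatness = 1.0 / (1.0 + heights.var())
    return np.array([heights.mean(), flatness, len(points) / 100.0])

# Weights a trained model might assign: prefer flat, well-supported chunks.
LEARNED_WEIGHTS = np.array([-0.2, 1.5, 0.8])

def best_placement(chunks: list[np.ndarray]) -> int:
    """Score every chunk against the learned model; return the best index."""
    scores = [LEARNED_WEIGHTS @ chunk_features(c) for c in chunks]
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
flat_shelf = rng.normal([0.5, 0.5, 1.0], [0.1, 0.1, 0.005], size=(80, 3))
cluttered  = rng.normal([0.5, 0.5, 1.0], [0.1, 0.1, 0.3],   size=(80, 3))
print(best_placement([cluttered, flat_shelf]))  # -> 1: the flatter chunk wins
```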