Autonomous robots can inspect nuclear power plants, accompany fighter planes into combat, clean up oil spills in the ocean and explore the surface of Mars.
When fed 3D models of household items in bird's-eye view (left), a new algorithm is able to guess what the objects are, and what their overall 3D shapes should be. This image shows the guess in the center, and the actual 3D model on the right. Credit: Courtesy of Ben Burchfiel
Yet for all their talents, these robots still cannot make a cup of tea.
That is because tasks such as switching on the stove, fetching the kettle and finding the milk and sugar demand perceptual abilities that most machines still lack.
Among them is the ability to make sense of 3D objects. While it is comparatively easy for robots to “see” objects with cameras and other sensors, interpreting what they see, from a single glimpse, is more challenging.
Duke University graduate student Ben Burchfiel says the most advanced robots in the world still cannot do what most children do automatically, but he and his colleagues may be closing in on a solution.
Burchfiel and his thesis advisor George Konidaris, now an assistant professor of computer science at Brown University, have developed new technology that allows machines to make sense of 3D objects in a better, more human-like way.
A robot that clears dishes from a table, for instance, must be able to handle a vast variety of bowls and plates in different shapes and sizes, left in disarray on a cluttered surface.
Humans can look at a new object and naturally recognize what it is, whether it is right side up, sideways or upside down, in full view or partially hidden by other objects.
Even when an object is partially obscured, humans can mentally fill in the parts that cannot be seen.
The researchers' perception algorithm can simultaneously guess what a new object is and how it is oriented, without examining it from multiple angles first. It can also “imagine” any parts that are out of view.
A robot with this technology would not have to see all sides of a teapot, for instance, to know that it probably has a lid, a handle and a spout, and whether it is sitting upright or off-kilter on the stove.
The researchers say their method makes fewer errors and is three times faster than the best current approaches. They recently presented their paper at the 2017 Robotics: Science and Systems Conference in Cambridge, Massachusetts.
This is a crucial step toward robots that work alongside humans in homes and other real-world settings, which are less orderly and predictable than the highly controlled environment of the factory floor or the lab, Burchfiel said.
With their framework, the robot is given a limited number of training examples and uses them to generalize to new objects.
It’s impractical to assume a robot has a detailed 3D model of every possible object it might encounter, in advance.
Ben Burchfiel, Graduate Student, Duke University
The researchers trained their algorithm on a dataset of approximately 4,000 full 3D scans of common household objects: an assortment of beds, bathtubs, tables, desks, dressers, chairs, nightstands, sofas, monitors and toilets.
Each 3D scan was converted into many tiny cubes, or voxels, stacked on top of one another like LEGO blocks to make them easier to process.
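The voxel step can be illustrated with a minimal sketch. It assumes a scan arrives as a list of (x, y, z) surface points and maps each point to a cell on a coarse cubic grid. The function name `voxelize`, the `resolution` parameter and the toy points are hypothetical, chosen for illustration, and are not taken from the researchers' code.

```python
def voxelize(points, resolution):
    """Return the set of occupied (i, j, k) cells on a resolution^3 grid.

    Hypothetical helper for illustration; not the researchers' implementation.
    """
    if not points:
        return set()
    # Bounding box of the scan along each axis.
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # One edge length for all axes keeps the voxels cubic;
    # fall back to 1.0 if every point coincides.
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    size = extent / resolution
    occupied = set()
    for p in points:
        # Clamp so points on the far boundary land in the last cell.
        cell = tuple(
            min(int((p[a] - mins[a]) / size), resolution - 1) for a in range(3)
        )
        occupied.add(cell)
    return occupied


# Toy "scan": four points, two of which fall into the same voxel.
points = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (1.0, 1.0, 1.0), (0.9, 0.0, 0.5)]
grid = voxelize(points, resolution=2)
print(sorted(grid))
```

Flattening such a grid into a long vector of 0s and 1s gives each object a fixed-length representation, which is the kind of input a statistical model can work with.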
The algorithm learned categories of objects by combing through examples of each one and working out how they vary and how they stay the same, using a version of a technique called probabilistic principal component analysis.
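For readers curious what probabilistic principal component analysis involves, the sketch below fits the textbook closed-form maximum-likelihood version to flattened voxel vectors using NumPy. The article says only that the researchers used "a version of" the method, so this should be read as the standard formulation under that assumption, not their exact variant. The function name `fit_ppca` and the random toy data are hypothetical.

```python
import numpy as np


def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA on the rows of X with q components.

    Textbook formulation, shown for illustration only.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    # Sample covariance and its eigen-decomposition (eigh returns ascending order).
    cov = Xc.T @ Xc / n
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Noise variance: the average of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean() if q < d else 0.0
    # Loadings: leading eigenvectors, scaled by the variance above the noise floor.
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2


# Toy stand-in for one object category: 20 flattened 2x2x2 occupancy grids.
rng = np.random.default_rng(0)
X = (rng.random((20, 8)) > 0.5).astype(float)
mu, W, sigma2 = fit_ppca(X, q=2)
print(mu.shape, W.shape, sigma2)
```

Here `mu` captures what examples in the category have in common, while the columns of `W` capture the main ways in which they vary.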
When a robot comes across something new - say, a bunk bed - it does not have to search through its whole mental catalog for a match. It learns, from previous examples, what features beds usually have.
Based on that prior knowledge, it can generalize as a person would - to understand that two objects may be different, yet share properties that make them both a particular type of furniture.
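One way such prior knowledge could be used to name an object and fill in its unseen parts is sketched below. It assumes each category is summarized by a Gaussian model with a mean `mu`, loadings `W` and noise variance `sigma2`, as in probabilistic principal component analysis. It then scores the visible voxels under each category and completes the rest from the best match. The two tiny hand-built models, the function names and the four-voxel "object" are all hypothetical, and the sketch leaves out orientation estimation, which the researchers' algorithm also handles.

```python
import numpy as np


def log_likelihood(x_obs, observed, mu, W, sigma2):
    """Gaussian log-density of the visible voxels under one category model."""
    Wo = W[observed]
    C = Wo @ Wo.T + sigma2 * np.eye(len(observed))
    diff = x_obs - mu[observed]
    _, logdet = np.linalg.slogdet(C)
    quad = diff @ np.linalg.solve(C, diff)
    return -0.5 * (len(observed) * np.log(2 * np.pi) + logdet + quad)


def complete(x_obs, observed, mu, W, sigma2):
    """Fill in hidden voxels from the posterior mean of the latent code."""
    Wo = W[observed]
    M = Wo.T @ Wo + sigma2 * np.eye(W.shape[1])
    z = np.linalg.solve(M, Wo.T @ (x_obs - mu[observed]))
    x_hat = mu + W @ z
    x_hat[observed] = x_obs  # keep what was actually seen
    return x_hat


# Hypothetical category models over a 4-voxel object: (mu, W, sigma2).
models = {
    "table": (np.array([1.0, 1.0, 0.0, 0.0]), np.full((4, 1), 0.5), 0.1),
    "dresser": (
        np.array([0.0, 0.0, 1.0, 1.0]),
        np.array([[0.5], [-0.5], [0.5], [-0.5]]),
        0.1,
    ),
}

observed = [0, 1]              # only the first two voxels are visible
x_obs = np.array([1.0, 1.0])   # both visible voxels are occupied

best = max(models, key=lambda name: log_likelihood(x_obs, observed, *models[name]))
completed = complete(x_obs, observed, *models[best])
print(best, completed)
```

Because the guess depends only on the voxels that happen to be visible, two categories that look alike from one viewpoint can score almost the same, which is one way a model like this can be fooled.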
To test the approach, the researchers fed the algorithm 908 new 3D examples of the same 10 kinds of household items, viewed from above.
From this single viewpoint, the algorithm correctly guessed what most objects were, and what their complete 3D shapes should be, including the hidden parts, about 75% of the time - compared with just over 50% for the state-of-the-art alternative.
It could also recognize objects that were rotated in various ways, which the best competing methods cannot do.
While the system is reasonably fast - the entire process takes about a second - it is still a far cry from human vision, Burchfiel said.
For one, both their algorithm and earlier methods were easily fooled by objects that, from certain viewpoints, look similar in shape. They might see a table from above and mistake it for a dresser.
Overall, we make a mistake a little less than 25 percent of the time, and the best alternative makes a mistake almost half the time, so it is a big improvement. But it still isn’t ready to move into your house. You don’t want it putting a pillow in the dishwasher.
Ben Burchfiel, Graduate Student, Duke University
Now the team is working to scale up their method so that robots can distinguish between thousands of types of objects at a time.
“Researchers have been teaching robots to recognize 3D objects for a while now,” Burchfiel said. What is new, he explained, is the ability to both identify something and fill in the blind spots in its field of vision, to reconstruct the parts it cannot see.
“That has the potential to be invaluable in a lot of robotic applications,” Burchfiel said.
This research was supported in part by the Defense Advanced Research Projects Agency, DARPA (D15AP00104).