Many people consider household chores an unpleasant but unavoidable part of life, often postponed or performed with little care. Imagine a robot assistant that could help relieve this burden!
In recent years, computer scientists have been trying to train machines to perform a broader range of household tasks. In an innovative study led by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Toronto, scientists demonstrated “VirtualHome,” a system that can simulate detailed household tasks and then have artificial “agents” carry them out, opening the possibility of one day training robots to perform such tasks.
The researchers trained the system on almost 3,000 programs of various activities. Each activity is broken down into subtasks that the computer can understand. For instance, even a simple task such as “making coffee” would include the step of “grabbing a cup.” The team demonstrated VirtualHome in a 3D world inspired by the Sims video game.
The artificial agent developed by the researchers can perform 1,000 of these interactions in the Sims-style world, across eight distinct scenes, including a home office, kitchen, dining room, living room, and bedroom.
“Describing actions as computer programs has the advantage of providing clear and unambiguous descriptions of all the steps needed to complete a task. These programs can instruct a robot or a virtual character, and can also be used as a representation for complex tasks with simpler actions.”
— Xavier Puig, MIT PhD student and lead author of the paper
CSAIL and the University of Toronto co-developed the project in association with scientists from McGill University and the University of Ljubljana. The paper will be presented at the Computer Vision and Pattern Recognition (CVPR) conference, to be held in Salt Lake City from June 18th to 22nd.
Unlike humans, robots need explicit instructions to complete even simple tasks; they cannot infer and reason with the same ease.
For instance, one might instruct a human to “switch on the TV and watch it from the sofa.” In this case, instructions such as “grab the remote control” and “sit/lie on sofa” have been omitted as they are part of the commonsense knowledge possessed by humans.
To better convey these kinds of tasks to robots, the descriptions of such actions must be far more detailed. To achieve this, the researchers first gathered verbal descriptions of household activities and then translated them into simple code. Such a program might include steps like: walk to the television, switch on the television, walk to the sofa, sit on the sofa, and watch television.
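As a rough illustration, a step program like the one above could be represented as an ordered list of atomic (action, object) pairs. This is a minimal sketch only; the `Step` class and the action names below are hypothetical and are not taken from the actual VirtualHome codebase.

```python
# Hypothetical sketch of a VirtualHome-style step program.
# The Step class and action vocabulary are illustrative assumptions,
# not the authors' actual representation.
from dataclasses import dataclass


@dataclass
class Step:
    action: str   # atomic action, e.g. "walk", "switch_on"
    target: str   # object the action applies to


# "Watch TV" expanded into the explicit steps a robot would need.
watch_tv = [
    Step("walk", "television"),
    Step("switch_on", "television"),
    Step("walk", "sofa"),
    Step("sit", "sofa"),
    Step("watch", "television"),
]


def describe(program):
    """Render a step program as human-readable instructions."""
    return [f"{s.action} {s.target}" for s in program]


for line in describe(watch_tv):
    print(line)
```

A representation along these lines keeps each step unambiguous, which is what allows the same program to drive either a virtual character or, eventually, a physical robot.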
After developing the programs, the researchers fed them to the VirtualHome 3D simulator, which converted them into videos. A virtual agent would then carry out the tasks specified by the programs, such as placing a pot on the stove, watching television, or turning a toaster on and off.
The result is not only a system for teaching robots to perform chores, but also a large database of household tasks described in natural language. Companies such as Amazon that are striving to develop Alexa-like robotic systems for the home could eventually use data like these to train their models to perform more complex tasks.
The team’s model showed that their agents could learn to reconstruct a program, and thereby carry out a task, given either a description—such as “pour milk into glass”—or a video demonstration of the activity.
“This line of work could facilitate true robotic personal assistants in the future. Instead of each task being programmed by the manufacturer, the robot can learn tasks just by listening to or watching the specific person it accompanies. This allows the robot to do tasks in a personalized way, or even some day invoke an emotional connection as a result of this personalized learning process.”
— Qiao Wang, Research Assistant in Arts, Media and Engineering, Arizona State University
In the future, the researchers hope to train robots using real videos rather than Sims-style simulation videos, which would allow a robot to learn simply by watching a YouTube video. They are also working to implement a reward-learning system in which the agent receives positive feedback when it performs tasks correctly.
“You can imagine a setting where robots are assisting with chores at home and can eventually anticipate personalized wants and needs, or impending action,” stated Puig. “This could be especially helpful as an assistive technology for the elderly, or those who may have limited mobility.”