Meta AI researchers are teaching robots to navigate physical worlds without maps or training
Meta Platforms Inc.’s artificial intelligence unit said today that it has made rapid progress in its efforts to teach AI models to navigate the physical world more easily, with less data and less coaching.
The research could drastically reduce the time it takes to teach AI models the art of visual navigation, which traditionally has been possible only through “reinforcement learning” that requires massive datasets and endless repetition.
Meta AI researchers said their work on visual navigation for AI will have big implications for the metaverse, which will be comprised of ever-changing virtual worlds. The idea is to help AI agents navigate these worlds by seeing and exploring them, just like humans do.
“AR glasses that show us where we left our keys, for example, require fundamental new technologies that help AI understand the layout and dimensions of unfamiliar and ever-changing environments without high computational resources, like pre-provided maps,” explained Meta AI. “As humans, for example, we don’t need to know the precise location or length of our coffee table to be able to walk around it without bumping into its corners (most of the time).”
To that end, Meta has focused its efforts on “embodied AI,” which refers to the training of AI systems through interactions in 3D simulations. In this area, Meta said it has created a new “point-to-goal navigation model” that can navigate new environments without a map or GPS sensor.
The model implements a technique known as visual odometry, which allows the AI to track its location based on visual inputs alone. Meta said this data augmentation technique can be used to quickly train effective neural models without human data annotations. The model was tested in Meta’s Habitat 2.0 embodied AI training platform, which runs simulations of virtual worlds, where it achieved a 94% success rate on the Realistic PointNav benchmark task, Meta said.
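The core idea behind visual odometry is dead reckoning: the agent estimates its motion between consecutive camera frames and chains those estimates together to track its pose, with no map or GPS. The sketch below illustrates only that pose-chaining step in 2-D; the function names and the toy (dx, dy, dtheta) motion estimates are illustrative assumptions, not Meta’s code, and in the real system the per-step estimates come from a neural network comparing image pairs.

```python
import math

def compose(pose, delta):
    """Compose an agent pose (x, y, heading) with a relative motion
    (dx, dy, dtheta) estimated between two consecutive camera frames.
    dx is forward motion and dy lateral motion, in the agent's frame."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Dead-reckon the agent's pose by chaining per-step estimates."""
    pose = start
    for d in deltas:
        pose = compose(pose, d)
    return pose

# Four "move 1m forward, turn 90 degrees" steps trace a square,
# so the agent ends up back where it started.
final_pose = integrate([(1.0, 0.0, math.pi / 2)] * 4)
```

In practice each estimated delta carries some error, and those errors accumulate, which is why the learned odometry model's accuracy matters so much for the navigation success rate.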
“Although our approach does not yet fully solve this dataset, this research provides supporting evidence that explicit mapping may not be necessary for navigation, even in realistic contexts,” Meta said.
To further boost AI navigation training without relying on maps, Meta has created a collection of training data called Habitat-Web, which offers more than 100,000 human demonstrations of object-goal navigation. It connects the Habitat simulator, running in a web browser, to Amazon.com Inc.’s Mechanical Turk service, allowing users to safely teleoperate virtual robots at scale. AI agents trained by imitation on this data can achieve “cutting-edge results,” Meta said, by learning efficient object-finding behavior from humans, such as peeking into rooms, checking corners for small objects and turning on the spot to get a panoramic view.
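Imitation learning of this kind reduces, at its simplest, to supervised learning on (observation, action) pairs recorded from human teleoperators. The toy sketch below shows the idea with a nearest-neighbor “policy” that acts like the closest demonstration; it is an illustrative assumption for exposition, not Meta’s method, which trains neural policies on the Habitat-Web demonstrations.

```python
def clone_policy(demos):
    """Behavioral cloning at its simplest: memorize demonstrated
    (observation, action) pairs and, at test time, take the action
    from the nearest demonstrated observation.

    demos: list of (obs_vector, action) pairs, e.g. recorded from
    human teleoperation of a virtual robot."""
    def policy(obs):
        def sq_dist(pair):
            demo_obs, _ = pair
            return sum((a - b) ** 2 for a, b in zip(demo_obs, obs))
        _, action = min(demos, key=sq_dist)
        return action
    return policy

# Two tiny demonstrations (hypothetical 2-D observations):
policy = clone_policy([
    ((0.0, 0.0), "move_forward"),
    ((1.0, 1.0), "turn_left"),
])
```

A neural policy replaces the nearest-neighbor lookup with a trained network, which is what lets the learned behaviors (peeking into rooms, scanning corners) generalize to unseen environments.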
Additionally, Meta’s AI team has developed what it calls a plug-and-play modular approach to help robots generalize across a diverse set of semantic navigation tasks and goal modalities through a unique “zero-shot experience learning framework.” The idea is to help AI agents adapt on the fly without resource-intensive maps or training. The AI model is trained once to capture the skills essential for semantic visual navigation, then applies them to different tasks in a 3D environment without any additional retraining.
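The modular structure can be pictured as one reusable navigation skill plus interchangeable goal encoders: each goal modality (a coordinate, a named object, an image) is translated into the common goal representation the skill already understands, so the skill itself never needs retraining. The sketch below is a deliberately toy illustration of that plug-and-play split; the greedy controller and the 2-D goal representation are assumptions for exposition, not Meta’s architecture.

```python
def shared_skill(position, goal_point):
    """The reusable navigation skill, trained once: step toward the
    goal along whichever axis is farther off (a toy greedy controller
    standing in for a learned policy)."""
    dx = goal_point[0] - position[0]
    dy = goal_point[1] - position[1]
    if abs(dx) >= abs(dy):
        return "east" if dx > 0 else "west"
    return "north" if dy > 0 else "south"

# Goal encoders translate each modality into the common goal
# representation (here, simply a 2-D point) the skill understands.
def encode_point_goal(goal):
    return goal  # the goal is already a coordinate

def encode_object_goal(goal_name, object_locations):
    return object_locations[goal_name]  # look up a named object

# The same skill serves both modalities without retraining:
step1 = shared_skill((0, 0), encode_point_goal((3, 1)))
step2 = shared_skill((0, 0), encode_object_goal("wallet", {"wallet": (0, -2)}))
```

Adding a new goal modality then means writing only a new encoder into the shared goal space, which is what makes the approach “zero-shot” with respect to the navigation skill.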
Meta explained that the agents are trained on image goals: they receive a photo taken from a random location in their environment, then must travel through it and find where the photo was taken. “Our approach requires up to 12.5 times less training data and has a success rate up to 14% higher than state-of-the-art transfer learning,” Meta’s researchers said.
Holger Mueller, an analyst at Constellation Research Inc., told SiliconANGLE that Meta’s latest advancements could play a key role in the company’s metaverse ambitions. If virtual worlds are to become the norm, he said, AI must be able to make sense of them and do so in a way that isn’t too cumbersome.
“Scaling AI’s ability to make sense of physical worlds has to go through a software approach,” Mueller added. “That’s what Meta is doing now with its advancements in embodied AI, creating software that can make sense of its environment on its own, without any training. It will be very interesting to see the first cases of actual use in action.”
These real use cases could be happening soon. Meta said the next step is to push these advances from navigation to mobile manipulation, to create AI agents that can perform specific tasks such as locating a wallet and bringing it back.