Aided by A.I. Language Models, Google’s Robots Are Getting Smart
Our sneak peek into Google’s new robotics model, RT-2, which melds artificial intelligence technology with robots.
By Kevin Roose, The New York Times, July 28, 2023
…
A quiet revolution is underway in robotics, one that piggybacks on recent advances in so-called large language models — the same type of artificial intelligence system that powers ChatGPT, Bard and other chatbots.
Google has recently begun plugging state-of-the-art language models into its robots, giving them the equivalent of artificial brains. The secretive project has made the robots far smarter and given them new powers of understanding and problem-solving…
In recent years, researchers at Google had an idea. What if, instead of being programmed for specific tasks one by one, robots could use an A.I. language model — one that had been trained on vast swaths of internet text — to learn new skills for themselves? …
Google’s new robotics model, RT-2, is what the company calls a “vision-language-action” model, or an A.I. system that has the ability not just to see and analyze the world around it, but to tell a robot how to move.
It does so by translating the robot’s movements into a series of numbers — a process called tokenizing — and incorporating those tokens into the same training data as the language model. Eventually, just as ChatGPT or Bard learns to guess what words should come next in a poem or a history essay, RT-2 can learn to guess how a robot’s arm should move to pick up a ball or throw an empty soda can into the recycling bin… [end quote]
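To make the "tokenizing" idea in the quote concrete, here is a minimal sketch of how a robot movement could be turned into a series of numbers and back again. The article does not describe RT-2's actual encoding, so everything here is an assumption for illustration: the function names, the 256-bin count, the value range of -1.0 to 1.0, and the idea that a movement is a short list of continuous values.

```python
from typing import List

NUM_BINS = 256  # assumed number of discrete values per movement dimension


def tokenize_action(action: List[float], low: float = -1.0, high: float = 1.0,
                    num_bins: int = NUM_BINS) -> List[int]:
    """Map each continuous movement value to an integer token in [0, num_bins - 1]."""
    tokens = []
    for value in action:
        clipped = min(max(value, low), high)       # keep the value inside the assumed range
        fraction = (clipped - low) / (high - low)  # rescale to 0.0 .. 1.0
        token = min(int(fraction * num_bins), num_bins - 1)
        tokens.append(token)
    return tokens


def detokenize_action(tokens: List[int], low: float = -1.0, high: float = 1.0,
                      num_bins: int = NUM_BINS) -> List[float]:
    """Map integer tokens back to the center value of their bins."""
    width = (high - low) / num_bins
    return [low + (token + 0.5) * width for token in tokens]


# Hypothetical arm movement: three values, e.g. changes in x, y and z.
movement = [-1.0, 0.0, 1.0]
tokens = tokenize_action(movement)
recovered = detokenize_action(tokens)
```

Once movements are whole numbers like these, they can sit in the same training data as word tokens, which is the point the article makes. The round trip loses a little precision (at most half a bin width), which is the price of turning a continuous motion into a fixed vocabulary.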
The robot’s connection to AI lets it interpret an instruction like “pick up the extinct animal” and respond by reaching for a dinosaur. It can also recognize a specific target, such as a German flag, and place an item on it, using AI to distinguish the flags of different countries.
A robot that can see, draw on AI for information and program its own movements marks a step change toward truly smart robots.
Wendy