A recent video released by Figure AI shows that the humanoid robot Figure 01 can recognize objects in front of it, answer open questions, and follow instructions to put black plastic bags into a box and cups and plates on a drain rack.
Brett Adcock, founder and CEO of Figure, emphasized that in the video Figure 01 converses with a human through an end-to-end neural network, without any remote control. At the same time, the robot's movement speed has increased significantly and is beginning to approach human speed.
Can recognize, can talk, can think
“Figure 01, what do you see now?”
“I see a red apple in the middle of the table, a drain basket, some cups and plates, and you standing nearby with your hand on the table.”
Figure 01 can not only recognize static objects but also interpret human movements, demonstrating its visual recognition and scene-understanding ability.
In subsequent rounds of dialogue, Figure 01 continues to demonstrate its capabilities in context tracing, reasoning, and multitasking.
When asked for something edible from the table, Figure 01 answered, "Of course," and then smoothly handed the red apple to the questioner.
The questioner then dumped several black plastic bags on the table and asked Figure 01 to collect the trash while simultaneously explaining why it had just handed over the apple. Figure 01 responded quickly: it put the plastic bags into the trash basket with both hands while replying, "I just gave you an apple because it was the only thing on the table to eat."
Next, the questioner posed an open-ended question, asking Figure 01 to determine where the plates should go based on the scene in front of it. Figure 01 reasoned that the cups and plates should probably go in the drain basket, and then placed them there as instructed.
In the Q&A, Figure 01 also revealed its capacity for contextual reasoning. When the questioner asked, "Can you put them there?" without specifying which objects or what location, Figure 01 accurately resolved the pronouns from context and carried out the corresponding instruction.
Figure 01 concludes with a self-assessment of the above performance: “I think I did a good job, the apple found a new owner, the trash was thrown out, and the cutlery was put in its proper place.”
OpenAI offers ‘intellectual support’
Founded in 2022, Figure officially unveiled the company’s first humanoid robot, Figure 01, in October 2023. According to the company, Figure 01 stands 5 feet 6 inches tall, weighs 132 pounds, can carry up to 44 pounds, and can operate for up to five hours on a single charge.
Not long ago, Figure announced the completion of a new $675 million (about 4.85 billion yuan) financing round, with investors including technology giants Nvidia, OpenAI, and Microsoft, as well as Amazon founder Jeff Bezos. As part of the agreement, Figure AI will work with OpenAI to develop AI models for the next generation of humanoid robots.
Brett Adcock called the partnership with OpenAI “a milestone in robotics.” He hopes the collaboration will eventually allow robots to work side by side with humans, complete tasks and hold conversations. In theory, the ability to understand language and act accordingly could allow robots to better work with warehouse workers or take verbal commands.
The partnership with OpenAI will also help Figure AI’s robots self-correct and learn from past mistakes. Brett Adcock says Figure AI’s robot already has the ability to speak and can use its camera to describe what it “sees” in front of it, as well as what is likely to happen in a particular area over a period of time. The latest video released by Figure certainly demonstrates these capabilities.
On the domestic front, many humanoid-robot makers are also strengthening their integration of large AI models. In 2023, Dataa Robotics released the generative AI platform RobotGPT along with Hydero AGI and Hydero OS 5.1, and connected its humanoid robot to RobotGPT, enabling the robot not only to hold multi-round conversations with an audience but also to play a variety of roles. Huang Xiaoqing, founder, chairman, and CEO of Dataa Robotics, firmly believes that the robot of the future will be a technical system built on a “cloud (cloud brain), network (secure network), terminal (robot body)” architecture.
In October 2023, iFlytek launched a humanoid robot developed in cooperation with Yushu Technology. Liu Qingfeng, chairman of iFlytek, said the launch of the Spark large model would bring the development of AI-powered robots into a new stage: the humanoid robot’s ability to decompose complex tasks and to find objects in open scenes has improved significantly, and its reinforcement-learning-based generalized grasping and human-like walking over complex terrain substantially outperform mainstream systems.
“As a next step, we will use the humanoid robot as a carrier to advance a ‘vision-speech-action’ multimodal embodied large model, which can better empower humanoid robots,” Liu Qingfeng said.