On April 2, Yue Jiang Robot, a manufacturer of intelligent collaborative robot arms, released its AI training robot X-Trainer.
The demo video shows that X-Trainer, which combines an imitation learning neural network with a visual large language model, learned to wash dishes on its own after 2 hours of training, a 70% saving compared with the usual training time.
From a scene containing a plate with red food residue, a sponge placed on a yellow plate, and a metal rack behind them with a plate hanging on it, the robot deduces its task: clean the plate and store it in the metal rack.
It wipes the plate three times over and does not leave even a trace of residue.
When the robot had finished scrubbing the plate and was about to put it into the tray, a person suddenly soiled it again, but the robot quickly caught the change and reacted immediately.
Even in what looks like a simple dishwashing task, the robot responds flexibly! The full demo video is shown below:
After its release, the video sparked heated discussion among netizens, who are looking forward to the era of robots doing housework!
Top comments from netizens
Some even joked: if the human keeps making a mess, will the robot just keep scrubbing forever, or will it go on strike?
In fact, the X-Trainer integrates the most cutting-edge technologies of intelligent robots and AI, enabling robots to quickly imitate and learn complex human movements, and ultimately achieve behavioral cloning.
Lang Xilin, co-founder of Yuejiang Technology, said that the series of actions X-Trainer performs in the video comes from end-to-end control by the imitation learning neural network. After training, the robot operates completely autonomously, and its smoothness and speed are significantly improved. The overall scheme combines the visual large language model with the imitation learning neural network.
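To make the idea of behavioral cloning concrete, here is a minimal sketch, not X-Trainer's actual implementation. It assumes a one-dimensional observation and action, a hypothetical demonstrator `expert_action`, and a linear least-squares policy standing in for the imitation learning neural network. The principle is the same: record (observation, action) pairs from demonstrations, then fit a policy that reproduces the demonstrated action for each observation.

```python
from statistics import mean


def fit_linear_policy(observations, actions):
    """Fit action ~= gain * observation + bias by ordinary least squares."""
    obs_mean, act_mean = mean(observations), mean(actions)
    covariance = sum(
        (o - obs_mean) * (a - act_mean) for o, a in zip(observations, actions)
    )
    variance = sum((o - obs_mean) ** 2 for o in observations)
    gain = covariance / variance
    bias = act_mean - gain * obs_mean
    return gain, bias


def expert_action(observation):
    """Hypothetical demonstrator (assumption): stands in for a human operator."""
    return -0.8 * observation + 0.1


# 1. Collect demonstrations: pairs of what was observed and what the expert did.
demo_observations = [x / 10 for x in range(-10, 11)]
demo_actions = [expert_action(o) for o in demo_observations]

# 2. Supervised learning on the demonstrations yields the cloned policy.
gain, bias = fit_linear_policy(demo_observations, demo_actions)


def cloned_policy(observation):
    """Policy learned purely from demonstrations, with no access to the expert."""
    return gain * observation + bias
```

Once fitted, `cloned_policy` can be queried on observations that never appeared in the demonstrations. A real system would replace the linear fit with a neural network that maps camera images and joint states to motor commands.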
First, the robot's camera feeds the top-view image into the visual large language model. X-Trainer can then complete:
01, Description of the working scene [a kitchen scene consisting of a plate stained with food scraps, a sponge on a yellow plate, and a metal rack behind the plate]
02, Reasoning about the task by the visual large model [a plate stained with food debris + a sponge placed on a yellow plate + a metal rack behind the plate = the task of cleaning the plate and storing it in the metal rack]
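The two steps above can be pictured as a small data flow from scene description to task. The sketch below is only an illustration under stated assumptions: `SceneObject`, `describe_scene`, and `infer_task` are hypothetical names, and a hand-written rule stands in for the reasoning that the visual large language model performs on the camera image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """One item recognized in the camera image (hypothetical structure)."""
    name: str
    detail: str


def describe_scene(objects):
    """Step 01: summarize the working scene as text."""
    return "; ".join(f"{o.name} ({o.detail})" for o in objects)


def infer_task(objects):
    """Step 02: a hand-written rule standing in for the model's reasoning."""
    details_by_name = {o.name: o.detail for o in objects}
    plate_is_dirty = "food residue" in details_by_name.get("plate", "")
    if plate_is_dirty and "sponge" in details_by_name and "metal rack" in details_by_name:
        return "clean the plate and store it in the metal rack"
    return "no task inferred"


# The kitchen scene from the demo, written out by hand (assumption: in the
# real system these would come from the model, not from a fixed list).
scene = [
    SceneObject("plate", "stained with red food residue"),
    SceneObject("sponge", "placed on a yellow plate"),
    SceneObject("metal rack", "behind the plate"),
]

print(describe_scene(scene))
print(infer_task(scene))
```

The point of the sketch is the separation of concerns: the scene description is produced first, and the task is derived from it, so the same description could support other tasks if the scene changed.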