AI empowerment is expected to accelerate industrial change
Industry trend: Industrial robots are evolving towards embodied intelligence
ChatGPT moves humans from in-the-loop to on-the-loop in the robotics pipeline. Today, a robotics pipeline requires a dedicated engineer who stays in the loop and writes code to improve the process. ChatGPT can take over that position in the loop: human users, technical or non-technical, remain on the loop and interact with the language model through high-level natural-language commands, which allows seamless deployment across different platforms and tasks.
Human users evaluate the quality and safety of ChatGPT's output in the robotics pipeline. The human tasks in the pipeline mainly include: 1) Defining the high-level robot function library. On the one hand, the library is specific to the robot platform and can call and direct the robot's actions. On the other hand, its functions should be named so that ChatGPT can understand and follow them. 2) Building the prompt. The prompt describes the task objective and identifies the functions in the high-level library that ChatGPT is allowed to use. It can also include constraints or tell ChatGPT how to organize its response. 3) Analyzing and evaluating ChatGPT's output and giving feedback. Users stay on the loop, evaluate the code output by ChatGPT through direct analysis or simulation, and give ChatGPT feedback on the quality and safety of that code. 4) Iterating. Users iterate on the results generated by ChatGPT until they meet human expectations and the final code can be deployed for execution on a robot.
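Steps 1) and 2) can be illustrated with a minimal Python sketch. It is an illustration only: the function names (`get_position`, `fly_to`, `take_photo`), the `RobotFunction` type, and the `build_prompt` helper are hypothetical and do not come from any particular robot platform or from the report's source. No language model is called; the sketch only assembles the prompt text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotFunction:
    """One entry of a (hypothetical) high-level robot function library."""
    signature: str
    description: str


# Descriptive names make it easier for a language model to pick the right call.
LIBRARY = [
    RobotFunction("get_position(object_name)",
                  "Return the (x, y, z) position of the named object."),
    RobotFunction("fly_to(x, y, z)",
                  "Fly the drone to the given coordinates."),
    RobotFunction("take_photo()",
                  "Capture an image with the onboard camera."),
]


def build_prompt(task, library, constraints=(), response_format=""):
    """Assemble a prompt: permitted functions, constraints, format, task."""
    lines = [
        "You are controlling a robot. Use ONLY the functions listed below.",
        "",
        "Available functions:",
    ]
    lines += [f"- {f.signature}: {f.description}" for f in library]
    if constraints:
        lines += ["", "Constraints:"]
        lines += [f"- {c}" for c in constraints]
    if response_format:
        lines += ["", f"Response format: {response_format}"]
    lines += ["", f"Task: {task}"]
    return "\n".join(lines)


prompt = build_prompt(
    "Inspect the turbine and photograph it.",
    LIBRARY,
    constraints=["Stay below 10 m altitude."],
    response_format="Python code only, no explanation.",
)
print(prompt)
```

Keeping the library description and the prompt builder separate means the same task text can be reused on a different platform by swapping only the function list.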
Simple tasks: ChatGPT can solve simple robot tasks in a zero-shot manner. For such tasks, users only need to provide a prompt and a description of the function library; no specific code examples are required. In this way ChatGPT can solve, zero-shot, problems such as spatiotemporal reasoning, control of a real drone, and industrial inspection by drone. 1) Spatiotemporal reasoning: ChatGPT is asked to control a planar robot that uses visual servoing to capture the position of a basketball. 2) Real-world drone flight: ChatGPT and an API are used to control a real drone and complete an object-finding task. 3) AirSim industrial inspection: ChatGPT controls a simulated UAV in the AirSim simulator to carry out an industrial inspection.
Complex tasks: ChatGPT can handle more complex robot control tasks when a human user interacts with it on the loop. For more complex problems, ChatGPT either cannot produce a zero-shot solution or produces one of limited quality. In this case, human users can assist ChatGPT through text feedback to complete tasks such as curriculum learning and AirSim obstacle avoidance. 1) Curriculum learning: ChatGPT is taught simple skills for picking and placing objects, and then logically combines the learned skills to solve more complex block arrangement tasks. 2) AirSim obstacle avoidance: ChatGPT builds most of the key modules of the obstacle avoidance algorithm, but human users still need to provide feedback on some information, such as the orientation of the drone. The human feedback consists entirely of high-level natural language, yet ChatGPT is able to understand it and correct the code where appropriate.
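The feedback-and-iterate interaction described above can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the method used in the experiments: `fake_model` is a stand-in for the language model, the allowed function names are hypothetical, and an automated check for disallowed calls replaces only a small part of what a human reviewer on the loop would do.

```python
import ast

# Hypothetical names of the functions the model is allowed to call.
ALLOWED = {"get_position", "fly_to", "take_photo"}


def disallowed_calls(code, allowed):
    """Return the directly called function names that are not in `allowed`.

    Only plain calls such as `foo()` are inspected; method calls such as
    `obj.foo()` are ignored in this simplified check.
    """
    tree = ast.parse(code)
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    return sorted(called - set(allowed))


def refine(generate, prompt, allowed, max_rounds=3):
    """Request code, review it, and return objections as plain-text feedback."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        code = generate(prompt, feedback)
        try:
            bad = disallowed_calls(code, allowed)
        except SyntaxError as exc:
            feedback = f"The code does not parse: {exc.msg}."
            continue
        if not bad:
            return code, round_no
        feedback = "Do not use these functions: " + ", ".join(bad) + "."
    raise RuntimeError("no acceptable code within the round limit")


def fake_model(prompt, feedback):
    # Stand-in for a language model: it only improves after feedback.
    if not feedback:
        return "land_now()\n"
    return ("x, y, z = get_position('turbine')\n"
            "fly_to(x, y, z)\n"
            "take_photo()\n")


code, rounds = refine(fake_model, "Inspect the turbine.", ALLOWED)
print(f"accepted after {rounds} round(s):\n{code}")
```

The generated code is parsed but never executed here; in practice, accepted code would still be checked in simulation or by a human before deployment on a robot.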