3D vision helps the intelligent transformation of the robot industry
As an exciting new technology, 3D vision has already appeared in consumer products such as the Microsoft Kinect and Intel RealSense. In recent years, steady advances in hardware together with continued optimization of algorithms and software have greatly improved the accuracy and practicality of 3D depth vision, giving the combination of “3D depth camera + gesture/face recognition” a real basis for large-scale adoption in mobile intelligent terminals. Apple, maker of the world’s leading smartphone, is the first to adopt 3D vision technology at scale, a move that will fully activate the 3D vision market and open a new era.
3D vision technology not only greatly improves recognition accuracy; more importantly, it opens a broader space for artificial intelligence applications. With advances in machine vision, artificial intelligence, and human-computer interaction, a variety of highly intelligent robots have begun to move into everyday reality, and 3D vision has become a powerful aid in the manufacturing industry’s “intelligent” transformation.
Familiar depth-camera technologies and applications include Intel’s RealSense, Microsoft’s Kinect, Apple’s PrimeSense, and Google’s Project Tango. It is evident, however, that this technology is being developed mostly by foreign companies; the domestic computer vision companies and startup teams working on it can be counted on one hand, and the technical barriers remain high.
There are three main technical solutions for depth cameras on the market: passive binocular vision, structured light, and ToF. Passive binocular vision uses two optical cameras and obtains depth information by triangulation after matching left and right stereo image pairs. The matching algorithm is complex and computationally demanding, placing high requirements on the processing chip, and the approach inherits the shortcomings of ordinary RGB cameras: it fails in dim environments and on surfaces with few distinctive features.
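The triangulation step at the heart of binocular vision can be sketched in a few lines. This is a minimal illustration, not a real camera pipeline: the focal length, baseline, and disparity values below are illustrative assumptions, and finding the disparity (the stereo matching itself) is the hard, computation-heavy part the paragraph describes.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two cameras in metres
    disparity_px -- horizontal shift of a matched feature between the images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return focal_px * baseline_m / disparity_px

# A feature shifted 70 px between the views of a 700 px focal-length,
# 10 cm baseline rig lies 1 m from the cameras.
print(depth_from_disparity(700, 0.10, 70))  # 1.0
```

Note how depth is inversely proportional to disparity: distant objects produce tiny shifts, which is one reason stereo accuracy degrades with range and with featureless surfaces where no reliable match exists.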
Structured light works by having an infrared laser project a pseudo-random but fixed speckle pattern onto the scene. Because different parts of the object lie at different distances from the camera, the spots appear at different positions in the captured image. The system then computes the displacement of each spot relative to a calibrated reference pattern and, together with parameters such as the camera position and sensor size, calculates the distance between the object and the camera.
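The spot-displacement calculation above is again triangulation, just measured against a calibration reference rather than a second camera. The sketch below assumes an idealized setup: a known reference depth, and a shift measured so that a positive value means the spot moved toward larger disparity (the object is closer than the reference plane). All numbers are illustrative.

```python
def depth_from_spot_shift(ref_depth_m, focal_px, baseline_m, shift_px):
    """Depth of a projected spot from its shift against a calibrated pattern.

    At the calibration distance the spot's disparity is f*B/Z_ref; at the
    unknown depth it is f*B/Z, so the observed shift in the image is the
    difference between the two.
    """
    ref_disparity = focal_px * baseline_m / ref_depth_m
    return focal_px * baseline_m / (ref_disparity + shift_px)

# Zero shift: the spot lands where calibration predicted, so the object
# sits at the reference distance (2 m here).
print(depth_from_spot_shift(2.0, 600, 0.075, 0.0))  # 2.0
```

A positive shift increases the disparity and yields a depth smaller than the reference, matching the intuition that nearer surfaces displace the projected spots more.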
Microsoft used ToF technology in the Kinect 2. ToF stands for time of flight: the camera continuously sends light pulses toward the target, a sensor receives the light returned from the object, and the distance is obtained by measuring the round-trip flight time of each pulse. By comparison, structured light has the advantage of being more mature than ToF, lower in cost, and better suited to mobile devices such as phones.
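The ToF principle reduces to one formula: the pulse travels to the object and back, so distance = c × t / 2. A minimal sketch, with the round-trip time chosen as an illustrative value (real ToF sensors measure phase shifts or gated pulses rather than timing single pulses directly):

```python
C = 299_792_458.0  # speed of light in m/s

def distance_from_round_trip(t_seconds):
    """Distance to the object from the measured round-trip pulse time."""
    return C * t_seconds / 2.0

# A round trip of about 6.67 nanoseconds corresponds to roughly 1 metre.
print(distance_from_round_trip(6.67e-9))  # ≈ 1.0
```

The tiny time scale is the engineering challenge: resolving centimetres requires timing precision on the order of tens of picoseconds, which is part of why ToF hardware has been costlier than structured light.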
The depth camera is an essential module for any three-dimensional vision device: with it, the device can obtain real-time three-dimensional size and depth information about objects in its surroundings and read the world more comprehensively. Depth cameras provide the basic technical support for applications such as indoor navigation and positioning, obstacle avoidance, motion capture, and 3D scanning and modeling, and have become a research hotspot in the industry. The iPhone X, equipped with a 3D depth camera, is bound to vigorously promote the development of machine vision and help the robot industry achieve its “intelligent transformation.”