36Kr has learned that Suzhou Baichuan Data Technology Co., Ltd. (hereinafter “Baichuan Data”) recently completed an angel round of 10 million yuan, with Tongchuangweiye as the sole investor. The funds will mainly go to technology research and development and team building, with the aim of increasing the company’s market share.
Founded in 2021, Baichuan Data is a technology company focused on AI data services for autonomous driving. It mainly provides OEMs and autonomous driving solution providers with one-stop services for data collection, cleaning, labeling, management, and storage, covering three scenarios: single-vehicle intelligence, vehicle-road collaboration, and the intelligent cockpit. Baichuan Data also said that after this round of financing it will apply the AI visual scene processing experience it has accumulated in autonomous driving to AIGC, intelligent robots, smart industry, smart cities, and other fields.
“Autonomous driving has entered the L2+ era. The growing demand for data annotation, along with its increasing dimensions and complexity, is pushing the annotation industry from labor-intensive to technology-driven, and automated annotation technology has attracted much attention. At this stage, accelerating the deployment of automated annotation and improving its accuracy are the core of raising technical barriers and achieving differentiated competition,” Ma Dongsheng told 36Kr. At the same time, he believes the industry as a whole is still in transition, and that building and efficiently managing manual annotation teams so that their output stays continuous and stable remains the key to large-scale, high-quality delivery.
Baichuan Data’s approach is therefore to combine manual and AI-assisted annotation and to pursue a step-by-step strategy.
Reportedly, the company has two priorities at this stage. The first is to keep refining its closed-loop operation management system to improve the efficiency and quality of its production services. The second is to build a data closed-loop system that connects data processing stages such as data acquisition and storage, pre-processing, labeling, simulation, and model training and deployment. Ma Dongsheng added that, along the way, the company will gradually increase its investment in automation technology and apply its accumulated scene, data, and engineering experience to the research and development of AI models for automated labeling, improving both the accuracy and the automation rate of automated labeling.
Specifically, on the one hand, Baichuan Data has built a closed-loop operation management system covering personnel screening, training, labeling, and quality inspection, and has assembled an in-house annotation team of thousands of people. The company splits each annotation project into “standardized” and “non-standard” parts: the in-house team handles highly personalized and customized non-standard work and difficult labeling, while highly repeatable standardized work is handled by outsourced teams under the company’s tight control. During operation, Baichuan Data conducts multiple rounds of quality inspection, including full inspection and random sampling, to keep quality consistent between the in-house and outsourced teams.
Ma Dongsheng believes that breaking down data and tasks in this way helps reduce the cost and raise the efficiency of training AI models for automated labeling. He told 36Kr: “As we continue to accumulate scenes, our ability to break tasks down will keep improving, and the breakdown will become finer and finer, which can further improve the efficiency of the service.”
On the other hand, Baichuan Data has built a scenario-driven data management SaaS platform that spans the whole process of data collection, cleaning, labeling, management, and storage. Reportedly, the platform is equipped with a variety of data management tools; combines data visualization, personnel management, performance management, and other functions; can carry out data preprocessing such as desensitization, frame extraction, fusion, and indexing; and supports various types of data processing tasks, including BEV and 4D.
As for automated annotation capability, Ma Dongsheng said that the platform currently supports AI-assisted annotation under most working conditions, weather conditions, and lane conditions, and that the accuracy of automated annotation is sufficient for large-scale delivery. In some scenes, automated assistance can raise annotation efficiency by tens of times.
“Automated labeling has to solve the problem of automatic recognition across the industry’s different customers, vehicle models, road sections, and scenarios, which means that the AI models behind automated labeling also need to be fed with massive amounts of high-quality data,” Ma Dongsheng told 36Kr. “We have accumulated a large number of high-complexity, high-value autonomous driving scenarios to support scenario-driven development of automated annotation.”
Reportedly, Baichuan Data has so far served more than 30 customers, including a number of leading OEMs, full-stack solution providers, key component suppliers, and commercial scenario providers.
The advantage of building an in-house team is control over data quality and delivery cycles, but it also costs more than other operating models. On this point, Ma Dongsheng said that the payoff for the higher costs is higher service quality and customer stickiness. To reduce costs, Baichuan Data is further optimizing its platform tool chain and improving the efficiency of its operational management. For the latter, Baichuan Data has brought in a number of members who worked for many years at Huawei, Baidu, Foxconn, BMW, and other large technology enterprises, forming a core management team with a rich industrial background and management experience to support subsequent business expansion.
On competition, Ma Dongsheng said that because autonomous driving algorithms are still being iteratively upgraded and large AI models are developing rapidly, demand for data annotation is rising and supply falls short of it. For now, therefore, the company hopes to build more differentiated businesses on top of its end-to-end data services, such as a simulation scenario library, to further connect the data closed loop while delivering stable services at scale. It will also keep tracking the latest technical progress of algorithm companies and research institutes, use training data to iteratively optimize its algorithm models, improve the tool chain, and further improve its automated annotation technology.
However, diversification may bring a new challenge: balancing the development of new and existing businesses. The company needs to go deeper in autonomous driving while also solving new problems in new businesses and scenarios, such as annotation tasks for large language models, which place higher demands on annotators’ skills.
Ma Dongsheng admitted: “It is important to be clear about where to concentrate our strengths, but what is certain is that autonomous driving remains the company’s core focus. In other areas, such as robotics and AIGC, we already have some customer cooperation cases, but in the future we will expand into new business selectively, taking into account customer size, supplier friendliness, industry influence, and other factors.”