After 2 months, the humanoid robot Walker S can fold clothes
Jiqizhixin Report
Editor: Wu Xin
A Chinese humanoid robot, paired with a large model, has for the first time completed a manipulation task on complex flexible materials: folding clothes.
Since the unveiling of Figure 01, which incorporates OpenAI's multi-modal large model, the progress of its Chinese peers has been attracting attention.
Just yesterday, UBTECH, the "first humanoid robot stock in China", released the first demo of the humanoid robot Walker S after it was deeply integrated with Baidu Wenxin's large model, showing some interesting new features.
Now, powered by the capabilities of Baidu Wenxin's large model, Walker S looks like this.
Like Figure 01, Walker S does not move around, but stands behind a desk to complete a series of tasks. It can follow human commands and fold clothes.
After it completes the task, you can keep chatting with it. For example, you can ask: what should I wear with this black top? The robot still remembers that you are going on a business trip and recommends dark pants, which are better suited to formal occasions.
It can also place the various switches on the table onto plates.
Even when it is disturbed, for example when a switch it has already placed is thrown back onto the table, or the socket it is about to reach for is taken away, Walker S can adjust its working state in real time and complete the placement task according to the new situation.
In February, Walker S already demonstrated multi-modal perception and motion control capabilities during practical training at a new energy vehicle factory.
This time, through in-depth integration with the Wenxin large model, Walker S's cognitive and control capabilities have reached a new level. It has not only gained advanced intention understanding and fine-grained task planning capabilities, but has also, for the first time, completed a complex flexible-material manipulation task such as folding clothes.
The Wenxin large model is Baidu's industrial-level knowledge-enhanced large model, with cross-modal and cross-language deep semantic understanding and generation capabilities, as well as knowledge reasoning, task planning and other capabilities. With these capabilities transferred to a humanoid robot, the robot can analyze and understand the material, shape, wrinkles and other attributes of clothing as a human would, and deduce the best method and sequence for folding clothes based on past experience. While actually folding the clothes, the robot analyzes changes in the state of the clothes in real time and adjusts its action strategy accordingly.
In the object sorting task with interference, Walker S also made full use of the collaborative advantages of "AI large model + robot". First, the on-device multi-modal perception model obtains the spatial position and semantic information of the objects, and then hands this information to the large model for processing. The large model, with its task decomposition and logical reasoning capabilities, quickly finds the optimal task plan and execution path for Walker S. Walker S then maps this plan onto the actual control of its robotic arms and dexterous hands, and finally completes the entire set of complex tasks.
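The perceive, plan, execute loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not UBTECH's or Baidu's implementation: the names `DetectedObject`, `perceive`, `plan_with_large_model` and `run_sorting_task` are hypothetical, the scene is a plain dictionary, and the "large model" is a stub that simply lists every object not yet on a plate. The point is the structure: the robot re-perceives and replans before every action, so an object thrown back on the table is picked up again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedObject:
    name: str      # semantic label (hypothetical perception output)
    location: str  # coarse spatial position: "table" or "plate"


def perceive(world: dict[str, str]) -> list[DetectedObject]:
    """Stand-in for the on-device multi-modal perception model."""
    return [DetectedObject(name, loc) for name, loc in sorted(world.items())]


def plan_with_large_model(objects: list[DetectedObject]) -> list[str]:
    """Stand-in for the large model: break the goal into pick-and-place steps."""
    return [obj.name for obj in objects if obj.location != "plate"]


def run_sorting_task(world, disturbances=None, max_steps=20):
    """Perceive, plan, act once, then repeat so that disturbances trigger replanning.

    `disturbances` maps a step index to (object name, new location).
    Disturbances scheduled after the task has finished are ignored.
    """
    disturbances = dict(disturbances or {})
    log = []
    for step in range(max_steps):
        # A person may interfere before this step, e.g. throw an item back.
        if step in disturbances:
            name, loc = disturbances[step]
            world[name] = loc
        plan = plan_with_large_model(perceive(world))
        if not plan:
            break  # nothing left on the table
        target = plan[0]
        world[target] = "plate"  # stand-in for arm and dexterous-hand control
        log.append(target)
    return log
```

Calling `run_sorting_task({"switch_a": "table", "switch_b": "table"}, disturbances={2: ("switch_a", "table")})` simulates a switch being thrown back after both have been placed; the returned log shows which objects were picked, in order, including the repeated pick.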
This is also the first demonstration of such capabilities among Chinese peers, and in terms of innovative application and implementation difficulty it ranks in the top tier of the industry globally. "In many demonstrations, including Figure's cooperation with OpenAI and our cooperation with Baidu, end-to-end can now be achieved," UBTECH management told China Business News at last night's performance review and outlook meeting.
"We use Baidu's large model for task decomposition, natural language understanding, and logical sequencing, in addition to the on-device multi-modal large model that the company built last year by training on open-source models. We believe that in the future, as competition in the humanoid robot market becomes increasingly fierce, only a strong alliance can achieve 1 + 1 > 2," UBTECH management said when explaining the cooperation. "Abroad, Tesla has its own large model capabilities, and there is also the combination of OpenAI, NVIDIA and Figure. We can see that cooperation can provide strong technical support for the deployment of humanoid robots."
However, comparing it with OpenAI's video, we found that even with this boost, Walker S still lags behind Figure 01.
The most obvious gap is the speed of movement. In addition, in terms of instruction content, the instructions Walker S receives are usually relatively clear and specific, while Figure 01 can turn more abstract instructions into reasonable and feasible concrete operations through common-sense reasoning.
Figure 01 can also chat while working (in particular, explaining its own operations), has short-term memory, and can reasonably plan its current actions based on earlier conversation.
As competition in generative AI becomes increasingly fierce and the research focus extends from long text and multi-modality to embodied intelligence, we have reason to believe that future humanoid robots will no longer be limited to perceiving static data, but will be able to move freely and interact with the environment in a virtual or even real three-dimensional world. This also marks a major leap for AI, from simple machine learning to the execution of complex human-like tasks.
In fact, the humanoid robot sector has been extremely hot over the past six months, with prototypes unveiled frequently in China and abroad and startups actively raising funds. In February, UBTECH released a video of Walker S being trialled at NIO's new energy vehicle factory, where the robot smoothly completed seat belt inspection, vehicle logo affixing and other tasks. UBTECH's share price also surged 200% in two days in early March.
However, humanoid robots worldwide are still in the pilot stage, and scaling up to volume will take time. After all, there is a big difference between a demo and a real application, and the latter must take into account a range of factors such as reliability, stability, and cost. UBTECH stated that combining large AI models with humanoid robots will greatly improve the robots' intelligence and their adaptability to multi-scenario tasks, and accelerate their industrialization. Founder Zhou Jian has also publicly stated that he hopes the first batch of humanoid robots will be in factories and have passed testing by the end of this year, in preparation for a boom in humanoid robots in 2025. In addition, by the end of this year, UBTECH plans to launch its first-generation home emotional-companion humanoid robot, which will be equipped with a large model, interact with users, and form short-term and long-term memories.
Reference link
https://www.stcn.com/article/detail/1164967.html