ByteDance Yang Zhenyuan: How Douyin makes good use of machine learning

Apr 20, 2023, 03:07 PM

"In the digital age, problems can be evaluated quantitatively, and machine learning can optimize toward goals more intelligently and efficiently."

On April 18, Volcano Engine released a series of cloud products, including a self-developed DPU, and launched a new version of its machine learning platform to help enterprise customers train large AI models. Yang Zhenyuan, Vice President of ByteDance, shared his understanding of machine learning in a talk titled "Douyin's Machine Learning Practice".

Yang Zhenyuan believes that the core competitiveness of a machine learning system lies in making each experiment fast and cheap. Algorithm engineers can then focus on their own work and keep experimenting through trial and error at very low cost. Only in this way can the business iterate and innovate in an agile way. He said: "The Volcano Engine machine learning platform is unified internally and externally. Volcano Engine customers and Douyin use the same platform. I hope that these technologies, polished within the company, can serve more customers and support everyone in making intelligent innovations."

The following is the full text of Yang Zhenyuan’s speech:

Good morning! As you all know, Douyin and our other businesses are internal customers of Volcano Engine, and they all run on the Volcano Engine cloud. Today I will share some practical experience from the company's internal business: how Volcano Engine supports Douyin's use of machine learning.

First, why talk about machine learning at all? In what scenarios and under what circumstances should we use a machine learning system? What are the challenges of using machine learning, and how did we solve them?

I think the most important thing about machine learning is to digitize the problem first. Once it is digitized, the problem can be evaluated quantitatively. Once it can be evaluated quantitatively, it can be made intelligent and further optimized with machine learning methods.

Some friends have asked me, "Zhenyuan, can you help me build a model?" I would ask what they wanted to use the model for. In fact, they had not thought it through themselves.

I would like to explain the use of machine learning through a few examples.

Take performance advertising. For merchants, the question is whether they can acquire customers at a reasonable cost. For the platform, the question is whether the most suitable ad can be placed in a given ad slot. How do we evaluate this? It's very simple: we look at the conversion rate, so the goal can be clearly defined.

If you can clearly define the goal, you can run A/B experiments to determine which method is better, and then use machine learning to optimize further. In the end, it often turns out that manual methods, such as hand-picking users for performance advertising, can hardly do better than machine learning.
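As a minimal sketch of what "run an A/B experiment on conversion rate" can look like in practice, the snippet below compares two groups with a standard two-proportion z-test. The traffic numbers and the labels "manual targeting" and "model-based targeting" are hypothetical illustrations, not figures or methods from the talk.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (lift, z, two-sided p-value) for the conversion rate of B versus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis that A and B are equal.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value

# Hypothetical experiment: A = manual targeting, B = model-based targeting.
lift, z, p_value = two_proportion_z_test(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be noise. Real experiments also need care with sample size planning and repeated looks at the data, which this sketch leaves out.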

Another example is issuing coupons. For the same budget, which users should receive coupons so that the platform gains longer-term retention? This is also a question that can be precisely quantified and evaluated. For such a problem, we can then think about which algorithm and which kind of machine learning to use for optimization.

Ride dispatching is a field everyone is familiar with, and it can also be evaluated quantitatively through the order completion rate: if the matching is poor, drivers and passengers cannot be paired effectively. I won't go into detail about autonomous driving. Evaluating results in that field involves more dimensions, such as safety, time, and comfort.

Having said all this, the core point is to define the problem clearly: digitize it first, and then make it intelligent.

What problems arise when we use machine learning to make things intelligent? There are two main ones: it is complicated, and it is expensive.

Why is it complicated? Because the machine learning software stack is very deep. It requires platforms, including PyTorch, TensorFlow, and many others, and it also involves frameworks, operating systems, and the underlying hardware. Lately, when people meet, they ask each other how many GPU cards they have, and if you have none, you are almost embarrassed to say hello. But in fact, many people do not know how efficiently those cards are being used. So the machine learning software stack is deep and complex, and every layer must be done correctly and well.

Now for the cost. People are expensive: a very good algorithm engineer is costly and hard to find. Data is expensive too, and high-quality data costs a lot. As for hardware, everyone knows the price of high-performance GPUs.

So machine learning is complex and expensive. How does Douyin handle this and make better use of machine learning to help the business grow?

Let me briefly introduce our platforms. We have two main ones: a recommendation and advertising platform, and a general-purpose platform that covers CV (computer vision), NLP (natural language processing), and so on.

On the recommendation platform, tens of thousands of models are trained every week, because we have many products and frequently train models for different scenarios. On the CV/NLP platform the number is even larger, at roughly 200,000 models trained per week. In addition, a large number of online services run on these two platforms every day.

Here is an example. Douyin's recommendation system has many models, and one of them needs 15 months of samples for training, which means the training data has to be built up continuously over 15 months. That is a very large amount of data. But on our machine learning platform, training this model takes only 5 hours, at a computed cost of only 5,000 yuan. An algorithm engineer can train the model in the morning and run an A/B experiment online in the afternoon, which greatly improves product iteration efficiency.

Whether machine learning is being done well can, I think, be represented by this triangle. The most important corner is the algorithm: if your algorithm leads in effectiveness, it can bring great value to the business. Two things support algorithm effectiveness: hardware ROI and human ROI.

Hardware ROI refers to the cost per model. In market competition, if others spend 10,000 yuan to build one model and you can build ten similar models for the same 10,000 yuan, you are in a strong position to win. Human ROI is about whether a strong algorithm engineer, once hired, can reach full potential, which depends mainly on whether the system lets them try new ideas easily and quickly enough.

How do we improve hardware ROI? Tidal scheduling and co-location (mixed deployment) are some of the methods we commonly use. In essence, the question is how to raise device utilization, which is also a basic idea of cloud native. We mix different tasks together so that their peaks are staggered, and run them at a high utilization rate through intelligent scheduling. This greatly improves resource utilization and reduces the cost of each experiment.
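To make the effect of staggered peaks concrete, here is a rough sketch that compares sizing two clusters separately with sizing one shared cluster. The demand curves are hypothetical numbers chosen for illustration, and the model ignores real scheduling constraints such as preemption, memory limits, and GPU types.

```python
# Hypothetical demand (in machines) for eight 3-hour slots of one day.
# Online serving peaks in the evening; offline training is pushed into the night.
online = [20, 10, 30, 60, 70, 80, 100, 50]
offline = [80, 90, 60, 30, 20, 10, 0, 40]

def utilization(demand, capacity):
    """Average fraction of provisioned machines that are busy."""
    return sum(demand) / (len(demand) * capacity)

combined = [a + b for a, b in zip(online, offline)]

# Separate clusters: each one must be sized for its own peak.
separate_capacity = max(online) + max(offline)
separate_util = utilization(combined, separate_capacity)

# Shared cluster: sized only for the peak of the combined load.
shared_capacity = max(combined)
shared_util = utilization(combined, shared_capacity)

print(f"separate: {separate_capacity} machines, utilization {separate_util:.0%}")
print(f"shared:   {shared_capacity} machines, utilization {shared_util:.0%}")
```

Because the two peaks do not coincide, the shared cluster needs far fewer machines for the same work, which is what lowers the cost of each experiment.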

Besides hardware cost, another very important point is whether the machine learning infrastructure is easy enough to use. Here is a joke: many mathematicians look down on computer science people, especially those doing deep learning. They say you are practicing "alchemy": you often cannot explain why your approach works, and why do you have to keep running experiments? But from a practical perspective, we must keep experimenting and trying. Many new discoveries in this field come from continuous attempts.

How to make every attempt faster and cheaper is the core competitiveness. It is hard to get a perfect model in one shot.

What Volcano Engine has to do is build the platform well. As you can see, the whole process of data processing, model training, evaluation, launch, and A/B testing is unified and integrated on one platform. Algorithm engineers do not need to coordinate repeatedly with each stage or integrate with each business on their own, so they can focus on their own work.

Let's look at another example. This is a very interesting special effect (TikTok AI painting), and I guess many of you have used it. Around the end of last year, it became extremely popular. Guess how many people Douyin put into building it? Many may not expect this: just one algorithm engineer, who wrote some research code on the platform. It took about a week to train the model, and after some tuning it went online.

At the time, the product team estimated peak traffic at 200 QPS, and we planned for 2,000 QPS at launch. Unexpectedly, that capacity was used up within a few hours of launch. We quickly scaled out, and capacity grew 10 times in a short period to support 20,000 QPS.

Looking at the whole process, very few people were involved, and scaling out was very efficient. Many people say model training is expensive. In fact, in the long run, the cost of inference will be significantly greater than that of training. On the Volcano Engine platform, inference for the AI painting model is approximately five times faster than with the native PyTorch model. After launch, some targeted optimizations made it faster still, about 10 times faster, which is an order-of-magnitude improvement.
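A back-of-the-envelope sketch shows why inference speedups matter so much for long-run cost. It assumes that serving cost scales linearly with GPU time per request, so a 5x or 10x speedup divides the cost by the same factor. The traffic level, GPU time per request, and price per GPU hour are hypothetical placeholders, not figures from the talk.

```python
SECONDS_PER_DAY = 86_400

def daily_inference_cost(qps, gpu_seconds_per_request, yuan_per_gpu_hour, speedup=1.0):
    """Daily serving cost, assuming cost is proportional to GPU time per request."""
    gpu_seconds = qps * SECONDS_PER_DAY * (gpu_seconds_per_request / speedup)
    return gpu_seconds / 3600 * yuan_per_gpu_hour

# Hypothetical workload: sustained 2,000 QPS, 0.5 GPU-seconds per request,
# 10 yuan per GPU hour.
baseline = daily_inference_cost(2_000, 0.5, 10)
five_x = daily_inference_cost(2_000, 0.5, 10, speedup=5)
ten_x = daily_inference_cost(2_000, 0.5, 10, speedup=10)

for name, cost in [("baseline", baseline), ("5x faster", five_x), ("10x faster", ten_x)]:
    print(f"{name}: {cost:,.0f} yuan per day")
```

Unlike a one-time training bill, this cost recurs every day the service runs, so even under these rough assumptions the serving side dominates over time.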

With this kind of platform support, engineers can quickly try all kinds of ideas, whether they are following up on existing progress or pioneering something new.

Finally, you can see that some apps, such as Douyin, Toutiao, and Dongchedi, display "Volcano Engine provides computing services" on screen. The machine learning platform we are talking about is unified internally and externally: Volcano Engine customers and Douyin use the same platform. I hope that these technologies, polished within the company, can serve more customers and support everyone in intelligent innovation. Thank you all.
