Let's talk about AI noise reduction technology in real-time communication

Apr 12, 2023 01:07 PM
ai deep learning

Part 01 Overview

In real-time audio and video communication, the microphone picks up a large amount of environmental noise along with the user's voice. Traditional noise reduction algorithms work reasonably well on stationary noise (such as fan noise, white noise, and the circuit noise floor) but perform poorly on non-stationary, transient noise (such as a noisy restaurant, subway noise, or home kitchen noise), which seriously degrades the call experience. To address hundreds of non-stationary noise types in complex scenarios such as the home and office, the ecological empowerment team of the Department of Integrated Communications Systems independently developed an AI audio noise reduction technology based on the GRU model. Through algorithm and engineering optimization, the noise reduction model was compressed from 2.4 MB to 82 KB and running memory was reduced by about 65%; computational complexity was cut from about 186 Mflops to 42 Mflops, a 77% improvement in running efficiency. On the existing test data set (in an experimental environment), the model effectively separates voice from noise and raises the call voice quality MOS (mean opinion score) to 4.25.
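
As a quick sanity check on the figures quoted above, the relative reductions can be recomputed directly. This is only an arithmetic sketch; the helper name `percent_reduction` and the assumption that 1 MB = 1024 KB are ours, not the team's.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures quoted in the overview.
flops_cut = percent_reduction(186.0, 42.0)       # Mflops before and after
size_cut = percent_reduction(2.4 * 1024, 82.0)   # KB, assuming 1 MB = 1024 KB

print(f"compute reduced by about {flops_cut:.0f}%")
print(f"model size reduced by about {size_cut:.0f}%")
```

The compute figure matches the 77% stated in the text; the model size reduction works out to roughly 97% under the stated unit assumption.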

This article describes how our team performs real-time noise suppression based on deep learning and deploys it on mobile terminals and the Jiaqin app. It is organized as follows: first, the classification of noise and how to choose algorithms for each class; next, how to design the algorithm and train the AI model with deep learning; finally, the results and key application scenarios of the current AI noise reduction.

Part 02 Noise classification and noise reduction algorithm selection

In real-time audio and video application scenarios, the device is in a complex acoustic environment. When the microphone collects voice signals, it also collects a large amount of noise, which is a very big challenge to the quality of real-time audio and video. There are many types of noise. According to the mathematical statistical properties of noise, noise can be divided into two categories:

Stationary noise: the statistical characteristics of the noise do not change over a relatively long period of time, for example white noise, electric fans, air conditioners, and car interior noise;

Non-stationary noise: the statistical characteristics of the noise change over time, for example noisy restaurants, subway stations, offices, and home kitchens.
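
The distinction between the two categories can be illustrated with a small sketch that tracks short-term energy over time. The signals here are synthetic stand-ins (Gaussian noise for a fan, added bursts for clattering dishes), and `energy_spread` is a hypothetical measure chosen for illustration, not a metric used by the team.

```python
import random


def window_energies(signal, win):
    """Mean power of each consecutive, non-overlapping window."""
    return [
        sum(x * x for x in signal[i:i + win]) / win
        for i in range(0, len(signal) - win + 1, win)
    ]


def energy_spread(signal, win=400):
    """Ratio of the loudest to the quietest window; near 1 suggests stationary noise."""
    energies = window_energies(signal, win)
    return max(energies) / min(energies)


rng = random.Random(0)
n = 8000

# Stationary: Gaussian noise with a fixed variance (stand-in for fan hiss).
stationary = [rng.gauss(0.0, 0.1) for _ in range(n)]

# Non-stationary: the same background plus short, loud bursts
# (stand-in for dishes clattering at unpredictable moments).
bursty = list(stationary)
for start in (1000, 4200, 6500):
    for i in range(start, start + 200):
        bursty[i] += rng.gauss(0.0, 1.0)

print("stationary spread:", energy_spread(stationary))
print("bursty spread:", energy_spread(bursty))
```

The stationary signal keeps roughly the same energy in every window, while the bursty one shows windows that are far louder than the rest.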

In real-time audio and video applications, calls are easily disturbed by many kinds of noise, which hurts the user experience, so real-time audio noise reduction has become an important feature. Stationary noise, such as the hum of an air conditioner or the noise floor of recording equipment, does not change significantly over time. It can be estimated and predicted, then removed by simple subtraction; common methods include spectral subtraction, Wiener filtering, and the wavelet transform. Non-stationary noise, such as cars whizzing by on the road, plates banging in a restaurant, or pots and pans clattering in a home kitchen, appears randomly and unexpectedly, so it cannot be estimated and predicted in advance. Traditional algorithms struggle to estimate and eliminate non-stationary noise, which is why we use deep learning algorithms.
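
To make "estimate it, then subtract it" concrete, here is a minimal sketch of magnitude spectral subtraction on a single frame. It uses a naive DFT so that it needs only the standard library, assumes some frames are known to contain noise only, and uses illustrative values for the frame length and the `floor` parameter. It is a textbook form of the technique, not the team's implementation.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(frame, noise_mag, floor=0.05):
    """Subtract an estimated noise magnitude per bin, keeping the noisy phase."""
    out = []
    for c, nm in zip(dft(frame), noise_mag):
        mag = abs(c)
        clean = max(mag - nm, floor * mag)  # spectral floor avoids negative magnitudes
        out.append(cmath.rect(clean, cmath.phase(c)))
    return idft(out)


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(1)
N = 64


def noise_frame():
    return [rng.gauss(0.0, 0.3) for _ in range(N)]


# Estimate the average noise magnitude from frames assumed to hold noise only.
mags = [[abs(c) for c in dft(noise_frame())] for _ in range(20)]
noise_mag = [sum(col) / len(col) for col in zip(*mags)]

tone = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]  # stand-in for speech
noisy = [s + v for s, v in zip(tone, noise_frame())]
denoised = spectral_subtract(noisy, noise_mag)

print("error before:", mse(noisy, tone))
print("error after: ", mse(denoised, tone))
```

The approach depends on the noise estimate staying valid from frame to frame, which is exactly the property that non-stationary noise lacks.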

Part 03 Deep Learning Noise Reduction Algorithm Design

To improve the audio SDK's noise reduction across a wide range of noise scenes and to make up for the shortcomings of traditional noise reduction algorithms, we developed an RNN-based AI noise reduction module that combines traditional noise reduction technology with deep learning. Because the focus is on home and office usage scenarios, a large number of indoor noise types were added to the noise data set, such as keyboard typing in the office, the friction of desks and office supplies being dragged, chairs being dragged, kitchen noise at home, and floor impacts.

At the same time, to make real-time speech processing feasible on mobile terminals, the AI audio noise reduction algorithm keeps both computational overhead and library size very low. In terms of computational overhead, taking 48 kHz audio as an example, RNN network processing for each frame of speech requires only about 17.5 Mflops, the FFT and IFFT for each frame require about 7.5 Mflops, and feature extraction requires about 12 Mflops, for a stated total of about 42 Mflops. This computational complexity is roughly equivalent to that of the Opus codec at 48 kHz. On a mid-range mobile phone from one brand, measurements show that the RNN noise reduction module uses about 4% of the CPU. In terms of library size, compiling with RNN noise reduction enabled increases the audio engine library by only about 108 kB.

Part 04 Network model and processing process

The module uses an RNN model because, unlike other models such as CNNs, an RNN carries temporal information and can model time-series signals rather than treating each input and output audio frame in isolation. The model uses gated recurrent units (GRU, as shown in Figure 1). Experiments show that GRU performs slightly better than LSTM on speech noise reduction tasks, and because GRU has fewer weight parameters, it also saves computing resources. Compared to a simple recurrent unit, a GRU has two extra gates. The reset gate controls whether the previous state is used to compute the new state, while the update gate controls how much the state changes in response to the new input. The update gate allows the GRU to remember temporal information for a long time, which is why GRU performs better than simple recurrent units.
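
The two gates described above can be sketched as a single GRU step in plain Python. This follows the commonly published GRU formulation; the parameter names (`Wz`, `Uz`, `bz`, and so on), the layer sizes, and the random initialization are illustrative assumptions and do not reflect the team's trained model.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def gru_step(x, h, p):
    """One GRU time step: returns the new hidden state."""
    # Update gate: how much the state is allowed to change.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    # Reset gate: how much of the old state feeds the candidate state.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c)
            for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    # Blend old state and candidate according to the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def init_params(n_in, n_hid, rng):
    """Random illustrative weights; a real model would load trained values."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrh":
        p["W" + gate] = mat(n_hid, n_in)
        p["U" + gate] = mat(n_hid, n_hid)
        p["b" + gate] = [0.0] * n_hid
    return p


rng = random.Random(0)
params = init_params(n_in=3, n_hid=4, rng=rng)
state = [0.0] * 4
for frame_features in ([0.5, -0.2, 0.1], [0.4, 0.0, -0.3], [0.1, 0.2, 0.2]):
    state = gru_step(frame_features, state, params)
print(state)
```

When the update gate is close to zero the state passes through almost unchanged, which is how the unit can hold information across many frames.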

Figure 1. Left: a simple recurrent unit; right: a GRU

The structure of the model is shown in Figure 2. The trained model is embedded in the audio and video communication SDK. The audio stream read from the hardware device is split into frames and sent to the AI noise reduction preprocessing module, which computes the corresponding features and passes them to the trained model. The model computes the corresponding gain values, which are applied to the signal to achieve noise reduction (as shown in Figure 3).
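
The frame, feature, gain, and apply steps can be sketched end to end as follows. Everything here is a simplified stand-in: the features are plain log band energies, `model` is any callable that maps features to one gain per band (a trained GRU network would take its place), windowing and overlap-add are omitted, and the frame length and band count are illustrative rather than the values used in the SDK.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def band_of(k, n, n_bands):
    """Map DFT bin k of an n-point frame to a band; mirrored bins share a band."""
    f = min(k, n - k)  # fold negative frequencies onto positive ones
    return min(f * n_bands // (n // 2 + 1), n_bands - 1)


def band_features(spec, n_bands):
    """Log energy per band: the stand-in feature vector fed to the model."""
    n = len(spec)
    energy = [0.0] * n_bands
    for k, c in enumerate(spec):
        energy[band_of(k, n, n_bands)] += abs(c) ** 2
    return [math.log10(e + 1e-9) for e in energy]


def denoise_stream(samples, model, frame_len=32, n_bands=4):
    """Frame the stream, ask the model for per-band gains, and apply them."""
    out = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        spec = dft(samples[start:start + frame_len])
        gains = model(band_features(spec, n_bands))  # one gain in [0, 1] per band
        shaped = [c * gains[band_of(k, frame_len, n_bands)]
                  for k, c in enumerate(spec)]
        out.extend(idft(shaped))
    return out


def pass_through(features):
    """Hypothetical placeholder model: leaves every band untouched."""
    return [1.0] * len(features)


signal = [math.sin(0.3 * t) for t in range(96)]
processed = denoise_stream(signal, pass_through)
print(len(processed))
```

Predicting a handful of band gains instead of a full spectrum keeps the model output small, which fits the low compute budget described earlier.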

Figure 2. GRU-based RNN network model

Figure 3. Top: the model training process; bottom: the real-time noise reduction process

Part 05 AI noise reduction processing effect and implementation

Figure 4 compares spectrograms of speech with keyboard tapping noise before and after noise reduction. The upper part is the noisy speech signal before noise reduction, with the keyboard tapping noise marked by red rectangles. The lower part is the speech signal after noise reduction. Most of the keyboard tapping is suppressed, while damage to the speech is kept low.

Figure 4. Noisy speech (accompanied by keyboard tapping) before and after noise reduction

The current AI noise reduction model has been launched on mobile phones and in the Jiaqin app, improving their call noise reduction. It suppresses noise well in more than 100 noise scenarios in homes, offices, and similar settings while keeping the voice undistorted. In the next stage, we will continue to reduce the computational complexity of the AI noise reduction model so that it can be deployed on low-power IoT devices.
